Interim report on updated microarray probes for the LLNL Burkholderia pseudomallei SNP array
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, S; Jaing, C
2012-03-27
The overall goal of this project is to forensically characterize 100 unknown Burkholderia isolates in the US-Australia collaboration. We will identify genome-wide single nucleotide polymorphisms (SNPs) from B. pseudomallei and near neighbor species including B. mallei, B. thailandensis and B. oklahomensis. We will design microarray probes to detect these SNP markers and analyze 100 Burkholderia genomic DNAs extracted from environmental, clinical and near neighbor isolates from Australian collaborators on the Burkholderia SNP microarray. We will analyze the microarray genotyping results to characterize the genetic diversity of these new isolates and triage the samples for whole genome sequencing. In this interim report, we describe the SNP analysis and the microarray probe design for the Burkholderia SNP microarray.
Watson, Christopher M.; Crinnion, Laura A.; Gurgel‐Gianetti, Juliana; Harrison, Sally M.; Daly, Catherine; Antanavicuite, Agne; Lascelles, Carolina; Markham, Alexander F.; Pena, Sergio D. J.; Bonthron, David T.
2015-01-01
ABSTRACT Autozygosity mapping is a powerful technique for the identification of rare, autosomal recessive, disease‐causing genes. The ease with which this category of disease gene can be identified has greatly increased through the availability of genome‐wide SNP genotyping microarrays and subsequently of exome sequencing. Although these methods have simplified the generation of experimental data, its analysis, particularly when disparate data types must be integrated, remains time consuming. Moreover, the huge volume of sequence variant data generated from next generation sequencing experiments opens up the possibility of using these data instead of microarray genotype data to identify disease loci. To allow these two types of data to be used in an integrated fashion, we have developed AgileVCFMapper, a program that performs both the mapping of disease loci by SNP genotyping and the analysis of potentially deleterious variants using exome sequence variant data, in a single step. This method does not require microarray SNP genotype data, although analysis with a combination of microarray and exome genotype data enables more precise delineation of disease loci, due to superior marker density and distribution. PMID:26037133
Maslow, Bat-Sheva L; Budinetz, Tara; Sueldo, Carolina; Anspach, Erica; Engmann, Lawrence; Benadiva, Claudio; Nulsen, John C
2015-07-01
To compare the analysis of chromosome number from paraffin-embedded products of conception using single-nucleotide polymorphism (SNP) microarray with the recommended screening for the evaluation of couples presenting with recurrent pregnancy loss who do not have previous fetal cytogenetic data. We performed a retrospective cohort study including all women who presented for a new evaluation of recurrent pregnancy loss over a 2-year period (January 1, 2012, to December 31, 2013). All participants had at least two documented first-trimester losses and both the recommended screening tests and SNP microarray performed on at least one paraffin-embedded products of conception sample. Single-nucleotide polymorphism microarray identifies all 24 chromosomes (22 autosomes, X, and Y). Forty-two women with a total of 178 losses were included in the study. Paraffin-embedded products of conception from 62 losses were sent for SNP microarray. Single-nucleotide polymorphism microarray successfully diagnosed fetal chromosome number in 71% (44/62) of samples, of which 43% (19/44) were euploid and 57% (25/44) were noneuploid. Seven of 42 (17%) participants had abnormalities on recurrent pregnancy loss screening. The per-person detection rate for a cause of pregnancy loss was significantly higher in the SNP microarray (0.50; 95% confidence interval [CI] 0.36-0.64) compared with recurrent pregnancy loss evaluation (0.17; 95% CI 0.08-0.31) (P=.002). Participants with one or more euploid loss identified on paraffin-embedded products of conception were significantly more likely to have an abnormality on recurrent pregnancy loss screening than those with only noneuploid results (P=.028). The significance remained when controlling for age, number of losses, number of samples, and total pregnancies. 
These results suggest that SNP microarray testing of paraffin-embedded products of conception is a valuable tool for the evaluation of recurrent pregnancy loss in patients without prior fetal cytogenetic results. Recommended recurrent pregnancy loss screening was unnecessary in almost half the patients in our study. II.
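The per-person detection rates and confidence intervals quoted in this abstract can be checked with a short calculation. The sketch below is an illustration only, not the authors' analysis. It assumes that the rates 0.50 and 0.17 correspond to 21 of 42 and 7 of 42 participants (7 of 42 is stated in the abstract; 21 of 42 is inferred from the rate 0.50), and it uses the Wilson score interval, which the abstract does not name as its method. The function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 gives an approximate 95% interval. The choice of the Wilson
    method is an assumption; the abstract does not state its method.
    """
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumed counts: 21/42 for SNP microarray (inferred), 7/42 for screening (stated).
for label, k in (("SNP microarray", 21), ("RPL screening", 7)):
    low, high = wilson_interval(k, 42)
    print(f"{label}: {k / 42:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Under these assumptions the intervals round to 0.36-0.64 and 0.08-0.31, which is consistent with the values reported in the abstract.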
Analysis of population structure and genetic history of cattle breeds based on high-density SNP data
USDA-ARS's Scientific Manuscript database
Advances in single nucleotide polymorphism (SNP) genotyping microarrays have facilitated a new understanding of population structure and evolutionary history for several species. Most existing studies in livestock were based on low density SNP arrays. The first wave of low density SNP studies on cat...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jaing, C; Gardner, S
The goal of this project is to develop forensic genotyping assays for select agent viruses, enhancing the current capabilities for the viral bioforensics and law enforcement community. We used a multipronged approach combining bioinformatics analysis, PCR-enriched samples, microarrays and TaqMan assays to develop high resolution and cost effective genotyping methods for strain level forensic discrimination of viruses. We have leveraged substantial experience and efficiency gained through year 1 on software development, SNP discovery, TaqMan signature design and phylogenetic signature mapping to scale up the development of forensics signatures in year 2. In this report, we have summarized the whole genome wide SNP analysis and microarray probe design for forensics characterization of South American hemorrhagic fever viruses, tick-borne encephalitis viruses and henipaviruses, Old World Arenaviruses, filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus and Japanese encephalitis virus.
Alexiev, Borislav A; Zou, Ying S
2014-12-01
Chromosomal microarray analysis using novel Molecular Inversion Probe (MIP) technology demonstrated 2,570 kb copy neutral LOH of 10q11.22 in two clear cell papillary renal cell carcinomas. In addition, one of the tumors had a large 29,784 kb deletion of 13q11-q14.2. There were two variants of unknown significance, a 2,509 kb gain of Xp22.33 and a 257 kb homozygous deletion of 8p11.22. The somatic mutation panel containing 74 mutations in nine genes did not reveal any mutations. Besides identification of submicroscopic duplications or deletions, SNP microarrays can reveal abnormal allelic imbalances including LOH and copy neutral LOH, which cannot be recognized by chromosome analysis, FISH, or non-SNP microarrays. To the best of our knowledge, this is the first study demonstrating copy neutral LOH of 10q11.22 in clear cell papillary renal cell carcinomas using the new MIP SNP OncoScan FFPE Assay Kit on formalin-fixed paraffin-embedded tumor samples. Copyright © 2014 Elsevier GmbH. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saccone, Scott F; Chesler, Elissa J; Bierut, Laura J
Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.
Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Abbey, Darren; Hickman, Meleah; Gresham, David; Berman, Judith
2011-01-01
Phenotypic diversity can arise rapidly through loss of heterozygosity (LOH) or by the acquisition of copy number variations (CNV) spanning whole chromosomes or shorter contiguous chromosome segments. In Candida albicans, a heterozygous diploid yeast pathogen with no known meiotic cycle, homozygosis and aneuploidy alter clinical characteristics, including drug resistance. Here, we developed a high-resolution microarray that simultaneously detects ∼39,000 single nucleotide polymorphism (SNP) alleles and ∼20,000 copy number variation loci across the C. albicans genome. An important feature of the array analysis is a computational pipeline that determines SNP allele ratios based upon chromosome copy number. Using the array and analysis tools, we constructed a haplotype map (hapmap) of strain SC5314 to assign SNP alleles to specific homologs, and we used it to follow the acquisition of loss of heterozygosity (LOH) and copy number changes in a series of derived laboratory strains. This high-resolution SNP/CGH microarray and the associated hapmap facilitated the phasing of alleles in lab strains and revealed detrimental genome changes that arose frequently during molecular manipulations of laboratory strains. Furthermore, it provided a useful tool for rapid, high-resolution, and cost-effective characterization of changes in allele diversity as well as changes in chromosome copy number in new C. albicans isolates. PMID:22384363
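The idea of interpreting SNP allele ratios in light of chromosome copy number can be illustrated with a small sketch. This is not the authors' computational pipeline, which the abstract does not describe in detail. It assumes only that, for a segment present in n copies, a biallelic SNP can show allele fractions of k/n for k = 0..n, and it snaps an observed ratio to the nearest such value. Both function names are hypothetical.

```python
from fractions import Fraction


def expected_allele_ratios(copy_number):
    """Possible fractions of one allele at a biallelic SNP for a given copy number."""
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    return [Fraction(k, copy_number) for k in range(copy_number + 1)]


def nearest_allele_state(observed_ratio, copy_number):
    """Snap an observed allele ratio to the closest ratio allowed by the copy number."""
    return min(
        expected_allele_ratios(copy_number),
        key=lambda r: abs(float(r) - observed_ratio),
    )


# A disomic heterozygous SNP is expected near 1/2; in a trisomic segment the
# heterozygous states are 1/3 and 2/3, so the same SNP shifts away from 1/2.
print(nearest_allele_state(0.48, 2))  # 1/2
print(nearest_allele_state(0.70, 3))  # 2/3
```

A ratio of 0 or 1 at a site known to be heterozygous in the parent strain would then indicate loss of heterozygosity, whereas 1/3 or 2/3 would point to a copy number gain with heterozygosity retained.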
Characterization of genetic variability of Venezuelan equine encephalitis viruses
Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.; ...
2016-04-07
Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Hartmann, Luise; Stephenson, Christine F; Verkamp, Stephanie R; Johnson, Krystal R; Burnworth, Bettina; Hammock, Kelle; Brodersen, Lisa Eidenschink; de Baca, Monica E; Wells, Denise A; Loken, Michael R; Zehentner, Barbara K
2014-12-01
Array comparative genomic hybridization (aCGH) has become a powerful tool for analyzing hematopoietic neoplasms and identifying genome-wide copy number changes in a single assay. aCGH also has superior resolution compared with fluorescence in situ hybridization (FISH) or conventional cytogenetics. Integration of single nucleotide polymorphism (SNP) probes with microarray analysis allows additional identification of acquired uniparental disomy, a copy neutral aberration with known potential to contribute to tumor pathogenesis. However, a limitation of microarray analysis has been the inability to detect clonal heterogeneity in a sample. This study comprised 16 samples (acute myeloid leukemia, myelodysplastic syndrome, chronic lymphocytic leukemia, plasma cell neoplasm) with complex cytogenetic features and evidence of clonal evolution. We used an integrated manual peak reassignment approach combining analysis of aCGH and SNP microarray data for characterization of subclonal abnormalities. We compared array findings with results obtained from conventional cytogenetic and FISH studies. Clonal heterogeneity was detected in 13 of 16 samples by microarray on the basis of log2 values. Use of the manual peak reassignment analysis approach improved resolution of the sample's clonal composition and genetic heterogeneity in 10 of 13 (77%) patients. Moreover, in 3 patients, clonal disease progression was revealed by array analysis that was not evident by cytogenetic or FISH studies. Genetic abnormalities originating from separate clonal subpopulations can be identified and further characterized by combining aCGH and SNP hybridization results from 1 integrated microarray chip by use of the manual peak reassignment technique. Its clinical utility in comparison to conventional cytogenetic or FISH studies is demonstrated. © 2014 American Association for Clinical Chemistry.
Zhang, Linsheng; Znoyko, Iya; Costa, Luciano J; Conlin, Laura K; Daber, Robert D; Self, Sally E; Wolff, Daynna J
2011-12-01
Chronic lymphocytic leukemia (CLL) is a clinically heterogeneous disease. The methods currently used for monitoring CLL and determining conditions for treatment are limited in their ability to predict disease progression, patient survival, and response to therapy. Although clonal diversity and the acquisition of new chromosomal abnormalities during the disease course (clonal evolution) have been associated with disease progression, their prognostic potential has been underappreciated because cytogenetic and fluorescence in situ hybridization (FISH) studies have a restricted ability to detect genomic abnormalities and clonal evolution. We hypothesized that whole genome analysis using high resolution single nucleotide polymorphism (SNP) microarrays would be useful to detect diversity and infer clonal evolution to offer prognostic information. In this study, we used the Infinium Omni1 BeadChip (Illumina, San Diego, CA) array for the analysis of genetic variation and percent mosaicism in 25 non-selected CLL patients to explore the prognostic value of the assessment of clonal diversity in patients with CLL. We calculated the percentage of mosaicism for each abnormality by applying a mathematical algorithm to the genotype frequency data and by manual determination using the Simulated DNA Copy Number (SiDCoN) tool, which was developed from a computer model of mosaicism. At least one genetic abnormality was identified in each case, and the SNP data was 98% concordant with FISH results. Clonal diversity, defined as the presence of two or more genetic abnormalities with differing percentages of mosaicism, was observed in 12 patients (48%), and the diversity correlated with the disease stage. Clonal diversity was present in most cases of advanced disease (Rai stages III and IV) or those with previous treatment, whereas 9 of 13 patients without detected clonal diversity were asymptomatic or clinically stable. 
In conclusion, SNP microarray studies with simultaneous evaluation of genomic alterations and mosaic distribution of clones can be used to assess apparent clonal evolution via analysis of clonal diversity. Since clonal evolution in CLL is strongly correlated with disease progression, whole genome SNP microarray analysis provides a new comprehensive and reliable prognostic tool for CLL patients. Copyright © 2011 Elsevier Inc. All rights reserved.
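The relationship between array signals and percent mosaicism can be made concrete with a simple mixture model. The sketch below is not the authors' algorithm and not the SiDCoN tool; it assumes the simplest case, a one-copy (hemizygous) deletion present in a fraction m of otherwise diploid cells. Under that assumption the mean copy number is 2 - m, so the log2 ratio is log2((2 - m)/2), and the B-allele frequency of a heterozygous SNP on the retained-allele side is 1/(2 - m). The function names are hypothetical.

```python
import math


def mosaic_fraction_from_log2_ratio(log2_ratio):
    """Fraction of cells carrying a one-copy loss, given the segment's log2 ratio.

    Assumes a diploid background: 2**log2_ratio = (2 - m) / 2.
    """
    return 2 * (1 - 2 ** log2_ratio)


def mosaic_fraction_from_baf(baf):
    """Fraction of cells carrying a one-copy loss, given the B-allele frequency.

    baf is the retained-allele band (> 0.5) of heterozygous SNPs: baf = 1 / (2 - m).
    """
    return 2 - 1 / baf


# Both signals for a deletion present in 40% of cells recover the same fraction.
print(mosaic_fraction_from_log2_ratio(math.log2(0.8)))
print(mosaic_fraction_from_baf(0.625))
```

Two abnormalities whose estimated fractions differ clearly, for example 40% and 80%, would meet the abstract's definition of clonal diversity. Gains and copy neutral LOH follow different formulas and are not covered by this sketch.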
A Discovery Resource of Rare Copy Number Variations in Individuals with Autism Spectrum Disorder
Prasad, Aparna; Merico, Daniele; Thiruvahindrapuram, Bhooma; Wei, John; Lionel, Anath C.; Sato, Daisuke; Rickaby, Jessica; Lu, Chao; Szatmari, Peter; Roberts, Wendy; Fernandez, Bridget A.; Marshall, Christian R.; Hatchwell, Eli; Eis, Peggy S.; Scherer, Stephen W.
2012-01-01
The identification of rare inherited and de novo copy number variations (CNVs) in human subjects has proven a productive approach to highlight risk genes for autism spectrum disorder (ASD). A variety of microarrays are available to detect CNVs, including single-nucleotide polymorphism (SNP) arrays and comparative genomic hybridization (CGH) arrays. Here, we examine a cohort of 696 unrelated ASD cases using a high-resolution one-million feature CGH microarray, the majority of which were previously genotyped with SNP arrays. Our objective was to discover new CNVs in ASD cases that were not detected by SNP microarray analysis and to delineate novel ASD risk loci via combined analysis of CGH and SNP array data sets on the ASD cohort and CGH data on an additional 1000 control samples. Of the 615 ASD cases analyzed on both SNP and CGH arrays, we found that 13,572 of 21,346 (64%) of the CNVs were exclusively detected by the CGH array. Several of the CGH-specific CNVs are rare in population frequency and impact previously reported ASD genes (e.g., NRXN1, GRM8, DPYD), as well as novel ASD candidate genes (e.g., CIB2, DAPP1, SAE1), and all were inherited except for a de novo CNV in the GPHN gene. A functional enrichment test of gene-sets in ASD cases over controls revealed nucleotide metabolism as a potential novel pathway involved in ASD, which includes several candidate genes for follow-up (e.g., DPYD, UPB1, UPP1, TYMP). Finally, this extensively phenotyped and genotyped ASD clinical cohort serves as an invaluable resource for the next step of genome sequencing for complete genetic variation detection. PMID:23275889
Dynamic variable selection in SNP genotype autocalling from APEX microarray data.
Podder, Mohua; Welch, William J; Zamar, Ruben H; Tebbutt, Scott J
2006-11-30
Single nucleotide polymorphisms (SNPs) are DNA sequence variations, occurring when a single nucleotide--adenine (A), thymine (T), cytosine (C) or guanine (G)--is altered. Arguably, SNPs account for more than 90% of human genetic variation. Our laboratory has developed a highly redundant SNP genotyping assay consisting of multiple probes with signals from multiple channels for a single SNP, based on arrayed primer extension (APEX). This mini-sequencing method is a powerful combination of a highly parallel microarray with distinctive Sanger-based dideoxy terminator sequencing chemistry. Using this microarray platform, our current genotype calling system (known as SNP Chart) is capable of calling single SNP genotypes by manual inspection of the APEX data, which is time-consuming and exposed to user subjectivity bias. Using a set of 32 Coriell DNA samples plus three negative PCR controls as a training data set, we have developed a fully-automated genotyping algorithm based on simple linear discriminant analysis (LDA) using dynamic variable selection. The algorithm combines separate analyses based on the multiple probe sets to give a final posterior probability for each candidate genotype. We have tested our algorithm on a completely independent data set of 270 DNA samples, with validated genotypes, from patients admitted to the intensive care unit (ICU) of St. Paul's Hospital (plus one negative PCR control sample). Our method achieves a concordance rate of 98.9% with a 99.6% call rate for a set of 96 SNPs. By adjusting the threshold value for the final posterior probability of the called genotype, the call rate reduces to 94.9% with a higher concordance rate of 99.6%. We also reversed the two independent data sets in their training and testing roles, achieving a concordance rate up to 99.8%. The strength of this APEX chemistry-based platform is its unique redundancy having multiple probes for a single SNP. 
Our model-based genotype calling algorithm captures the redundancy in the system considering all the underlying probe features of a particular SNP, automatically down-weighting any 'bad data' corresponding to image artifacts on the microarray slide or failure of a specific chemistry. In this regard, our method is able to automatically select the probes which work well and reduce the effect of other so-called bad performing probes in a sample-specific manner, for any number of SNPs.
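The trade-off between call rate and concordance described in this abstract follows from thresholding the posterior probability of the called genotype. The sketch below is a deliberately reduced illustration, not the authors' algorithm: it assumes a single numeric feature per sample (for example a log ratio of two allele signals), fits a one-dimensional linear discriminant model with a pooled variance, and returns no call when the top posterior falls below a threshold. It omits the multiple probe sets and the dynamic variable selection of the published method. All names are hypothetical.

```python
import math


def fit_lda_1d(samples):
    """Fit class means, pooled variance and priors from (feature, genotype) pairs."""
    groups = {}
    for x, genotype in samples:
        groups.setdefault(genotype, []).append(x)
    means = {g: sum(xs) / len(xs) for g, xs in groups.items()}
    n = len(samples)
    # Pooled within-class variance, the defining assumption of LDA.
    ss = sum((x - means[g]) ** 2 for x, g in samples)
    pooled_var = ss / (n - len(groups))
    priors = {g: len(xs) / n for g, xs in groups.items()}
    return means, pooled_var, priors


def posteriors(x, model):
    """Posterior probability of each genotype for feature value x."""
    means, var, priors = model
    scores = {g: math.log(priors[g]) - (x - means[g]) ** 2 / (2 * var) for g in means}
    top = max(scores.values())
    weights = {g: math.exp(s - top) for g, s in scores.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def call_genotype(x, model, threshold):
    """Return the most probable genotype, or None (no call) below the threshold."""
    post = posteriors(x, model)
    best = max(post, key=post.get)
    return best if post[best] >= threshold else None
```

Raising the threshold turns ambiguous samples into no-calls, which lowers the call rate but tends to raise concordance among the calls that remain; this is the direction of the change the abstract reports (99.6% to 94.9% call rate, 98.9% to 99.6% concordance).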
Xu, Lingyang; Hou, Yali; Bickhart, Derek M; Song, Jiuzhou; Liu, George E
2013-06-25
Copy number variations (CNVs) are gains and losses of genomic sequence between two individuals of a species when compared to a reference genome. The data from single nucleotide polymorphism (SNP) microarrays are now routinely used for genotyping, but they also can be utilized for copy number detection. Substantial progress has been made in array design and CNV calling algorithms and at least 10 comparison studies in humans have been published to assess them. In this review, we first survey the literature on existing microarray platforms and CNV calling algorithms. We then examine a number of CNV calling tools to evaluate their impacts using bovine high-density SNP data. Large incongruities in the results from different CNV calling tools highlight the need for standardizing array data collection, quality assessment and experimental validation. Only after careful experimental design and rigorous data filtering can the impacts of CNVs on both normal phenotypic variability and disease susceptibility be fully revealed.
Nie, Bei; Yang, Min; Fu, Weiling; Liang, Zhiqing
2015-07-07
The surface invasive cleavage assay, because of its innate accuracy and ability for self-signal amplification, provides a potential route for the mapping of hundreds of thousands of human SNP sites. However, its performance on a high density DNA array has not yet been established, due to the unusual "hairpin" probe design on the microarray and the lack of chemical stability of commercially available substrates. Here we present an applicable method to implement a nanocrystalline diamond thin film as an alternative substrate for fabricating an addressable DNA array using maskless light-directed photochemistry, producing the most chemically stable and biocompatible system for genetic analysis and enzymatic reactions. The surface invasive cleavage reaction, followed by degenerated primer ligation and post-rolling circle amplification is consecutively performed on the addressable diamond DNA array, accurately mapping SNP sites from PCR-amplified human genomic target DNA. Furthermore, a specially-designed DNA array containing dual probes in the same pixel is fabricated by following a reverse light-directed DNA synthesis protocol. This essentially enables us to decipher thousands of SNP alleles in a single-pot reaction by the simple addition of enzyme, target and reaction buffers.
Single-feature polymorphism discovery in the barley transcriptome
Rostoks, Nils; Borevitz, Justin O; Hedley, Peter E; Russell, Joanne; Mudie, Sharon; Morris, Jenny; Cardle, Linda; Marshall, David F; Waugh, Robbie
2005-01-01
A probe-level model for analysis of GeneChip gene-expression data is presented which identified more than 10,000 single-feature polymorphisms (SFP) between two barley genotypes. The method has good sensitivity, as 67% of known single-nucleotide polymorphisms (SNP) were called as SFPs. This method is applicable to all oligonucleotide microarray data, accounts for SNP effects in gene-expression data and represents an efficient and versatile approach for highly parallel marker identification in large genomes. PMID:15960806
Huang, Chao-Wei; Lin, Yu-Tsung; Ding, Shih-Torng; Lo, Ling-Ling; Wang, Pei-Hwa; Lin, En-Chung; Liu, Fang-Wei; Lu, Yen-Wen
2015-01-01
The genetic markers associated with economic traits have been widely explored for animal breeding. Among these markers, single-nucleotide polymorphisms (SNPs) are gradually becoming a prevalent and effective evaluation tool. Because SNP analysis focuses only on the genetic sequences of interest, it reduces evaluation time and cost. Compared to traditional approaches, SNP genotyping techniques incorporate informative genetic background, improve the breeding prediction accuracy and help achieve breeding quality on the farm. This article therefore reviews the typical procedures of animal breeding using SNPs and the current status of related techniques. The associated SNP information and genotyping techniques, including microarray and Lab-on-a-Chip based platforms, along with their potential are highlighted. Examples in pig and poultry with different SNP loci linked to high economic trait values are given. The recommendations for utilizing SNP genotyping in animal breeding are summarized. PMID:27600241
Construction of a versatile SNP array for pyramiding useful genes of rice.
Kurokawa, Yusuke; Noda, Tomonori; Yamagata, Yoshiyuki; Angeles-Shim, Rosalyn; Sunohara, Hidehiko; Uehara, Kanako; Furuta, Tomoyuki; Nagai, Keisuke; Jena, Kshirod Kumar; Yasui, Hideshi; Yoshimura, Atsushi; Ashikari, Motoyuki; Doi, Kazuyuki
2016-01-01
DNA marker-assisted selection (MAS) has become an indispensable component of breeding. Single nucleotide polymorphisms (SNPs) are the most frequent polymorphisms in the rice genome. However, SNP markers are not readily employed in MAS because of limitations in genotyping platforms. Here the authors report a Golden Gate SNP array that targets specific genes controlling yield-related traits and biotic stress resistance in rice. As a first step, the SNP genotypes were surveyed in 31 parental varieties using the Affymetrix Rice 44K SNP microarray. The haplotype information for 16 target genes was then converted to the Golden Gate platform with 143-plex markers. Haplotypes for the 14 useful alleles are unique and can discriminate among all other varieties. The genotyping consistency between the Affymetrix microarray and the Golden Gate array was 92.8%, and the accuracy of the Golden Gate array was confirmed in 3 F2 segregating populations. The concept of haplotype-based selection using the constructed SNP array was proven. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
[Genetic analysis of two cases with Dandy-Walker deformed fetus].
Yao, Juan; Fang, Rong; Shen, Xueping; Shen, Guosong; Zhang, Su
2017-10-10
To explore the genetic etiology of two fetuses with Dandy-Walker malformation using single nucleotide polymorphism microarray (SNP-array). The fetuses and their parents were subjected to G banding karyotype analysis. The fetuses were also subjected to SNP-array analysis. The parents of both fetuses showed a normal karyotype. One fetus had a 46,X,?i(X)(q10) karyotype, while conventional cell culture failed for the other. SNP-array showed that one fetus carried a 6p25.3p25.2 microdeletion, and the other carried an Xp22.33p22.2 deletion and a Yq11.221q11 duplication. The abnormal fragments involved the FOXC1, SHOX and STS genes, which are associated with Dandy-Walker malformation. Alteration of 6p25.3p25.2 and Xp22.33p22.2 copy numbers probably underlies the Dandy-Walker syndrome in the fetuses. The disorder may be attributed to abnormal expression of the FOXC1, SHOX, and STS genes. SNP-array can provide an important supplement for prenatal diagnosis.
2014-01-01
Background Kidney stone disease (KSD) is a complex disorder with unknown etiology in majority of the patients. Genetic and environmental factors may cause the disease. In the present study, we used DNA microarray to genotype single nucleotide polymorphisms (SNP) and performed candidate gene association analysis to determine genetic variations associated with the disease. Methods A whole genome SNP genotyping by DNA microarray was initially conducted in 101 patients and 105 control subjects. A set of 104 candidate genes reported to be involved in KSD, gathered from public databases and candidate gene association study databases, were evaluated for their variations associated with KSD. Results Altogether 82 SNPs distributed within 22 candidate gene regions showed significant differences in SNP allele frequencies between the patient and control groups (P < 0.05). Of these, 4 genes including BGLAP, AHSG, CD44, and HAO1, encoding osteocalcin, fetuin-A, CD44-molecule and glycolate oxidase 1, respectively, were further assessed for their associations with the disease because they carried high proportion of SNPs with statistical differences of allele frequencies between the patient and control groups within the gene. The total of 26 SNPs showed significant differences of allele frequencies between the patient and control groups and haplotypes associated with disease risk were identified. The SNP rs759330 located 144 bp downstream of BGLAP where it is a predicted microRNA binding site at 3′UTR of PAQR6 – a gene encoding progestin and adipoQ receptor family member VI, was genotyped in 216 patients and 216 control subjects and found to have significant differences in its genotype and allele frequencies (P = 0.0007, OR 2.02 and P = 0.0001, OR 2.02, respectively). 
Conclusions Our results suggest that these candidate genes are associated with KSD and PAQR6 comes into our view as the most potent candidate since associated SNP rs759330 is located in the miRNA binding site and may affect mRNA expression level. PMID:24886237
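The allele-frequency comparison behind results such as "P = 0.0001, OR 2.02" can be sketched as a 2x2 allelic association test. The abstract does not give the underlying allele counts or name its test, so the counts used below are hypothetical and chosen only so that each group has 432 alleles (216 subjects); they are not the study's data. The function computes the allelic odds ratio and a Pearson chi-square test with one degree of freedom, using the identity that the upper tail of a 1-df chi-square at x equals erfc(sqrt(x/2)).

```python
import math


def allelic_association(case_minor, case_major, control_minor, control_major):
    """Odds ratio, Pearson chi-square (1 df, no continuity correction) and P value."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical allele counts (NOT from the study): 216 cases and 216 controls,
# two alleles per subject.
odds_ratio, chi2, p_value = allelic_association(120, 312, 70, 362)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.2f}, P = {p_value:.1e}")
```

A genome-wide or multi-SNP screen would additionally need correction for multiple testing, which this sketch does not attempt.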
Miyakawa, Hiroe; Miyamoto, Toshinobu; Koh, Eitetsu; Tsujimura, Akira; Miyagawa, Yasushi; Saijo, Yasuaki; Namiki, Mikio; Sengoku, Kazuo
2012-01-01
Genetic mechanisms have been implicated as a cause of some cases of male infertility. Recently, 10 novel genes involved in human spermatogenesis, including human SEPTIN12, were identified by expression microarray analysis of human testicular tissue. Septin12 is a member of the septin family of conserved cytoskeletal GTPases that form heteropolymeric filamentous structures in interphase cells. It is expressed specifically in the testis. Therefore, we hypothesized that mutation or polymorphisms of SEPTIN12 participate in male infertility, especially Sertoli cell-only syndrome (SCOS). To investigate whether SEPTIN12 gene defects are associated with azoospermia caused by SCOS, mutational analysis was performed in 100 Japanese patients by direct sequencing of coding regions. Statistical analysis was performed in patients with SCOS and in 140 healthy control men. No mutations were found in SEPTIN12; however, 8 coding single-nucleotide polymorphisms (SNP1-SNP8) could be detected in the patients with SCOS. The genotype and allele frequencies in SNP3, SNP4, and SNP6 were notably higher in the SCOS group than in the control group (P < .001). These results suggest that SEPTIN12 might play a critical role in human spermatogenesis.
Lopez, G H; Morrison, J; Condon, J A; Wilson, B; Martin, J R; Liew, Y-W; Flower, R L; Hyland, C A
2015-10-01
Duffy blood group phenotypes can be predicted by genotyping for single nucleotide polymorphisms (SNPs) responsible for the Fy(a)/Fy(b) polymorphism, for weak Fy(b) antigen, and for the red cell null Fy(a-b-) phenotype. This study correlates Duffy phenotype predictions with serotyping to assess the most reliable procedure for typing. Samples, n = 155 (135 donors and 20 patients), were genotyped by high-resolution melt PCR and by microarray. Samples were in three serology groups: 1) Duffy patterns expected n = 79, 2) weak and equivocal Fy(b) patterns n = 29 and 3) Fy(a-b-) n = 47 (one with anti-Fy3 antibody). Discrepancies were observed for five samples. For two, SNP genotyping predicted weak Fy(b) expression discrepant with Fy(b-) (Group 1 and 3). For three, SNP genotyping predicted Fy(a), discrepant with Fy(a-b-) (Group 3). DNA sequencing identified silencing mutations in these FY*A alleles. One was a novel FY*A 719delG. One, the sample with the anti-Fy3, was homozygous for a 14-bp deletion (FY*01N.02); a true null. Both the high-resolution melting analysis and SNP microarray assays were concordant and showed genotyping, as well as phenotyping, is essential to ensure 100% accuracy for Duffy blood group assignments. Sequencing is important to resolve phenotype/genotype conflicts which here identified alleles, one novel, that carry silencing mutations. The risk of alloimmunisation may be dependent on this zygosity status. © 2015 International Society of Blood Transfusion.
Fully automated analysis of multi-resolution four-channel micro-array genotyping data
NASA Astrophysics Data System (ADS)
Abbaspour, Mohsen; Abugharbieh, Rafeef; Podder, Mohua; Tebbutt, Scott J.
2006-03-01
We present a fully-automated and robust microarray image analysis system for handling multi-resolution images (down to 3-micron with sizes up to 80 MBs per channel). The system is developed to provide rapid and accurate data extraction for our recently developed microarray analysis and quality control tool (SNP Chart). Currently available commercial microarray image analysis applications are inefficient, due to the considerable user interaction typically required. Four-channel DNA microarray technology is a robust and accurate tool for determining genotypes of multiple genetic markers in individuals. It plays an important role in the state of the art trend where traditional medical treatments are to be replaced by personalized genetic medicine, i.e. individualized therapy based on the patient's genetic heritage. However, fast, robust, and precise image processing tools are required for the prospective practical use of microarray-based genetic testing for predicting disease susceptibilities and drug effects in clinical practice, which require a turn-around timeline compatible with clinical decision-making. In this paper we have developed a fully-automated image analysis platform for the rapid investigation of hundreds of genetic variations across multiple genes. Validation tests indicate very high accuracy levels for genotyping results. Our method achieves a significant reduction in analysis time, from several hours to just a few minutes, and is completely automated requiring no manual interaction or guidance.
Shalia, Kavita; Saranath, Dhananjaya; Rayar, Jaipreet; Shah, Vinod K.; Mashru, Manoj R.; Soneji, Surendra L.
2017-01-01
Background & objectives: Acute myocardial infarction (AMI) is a major health concern in India. The aim of the study was to identify single nucleotide polymorphisms (SNPs) associated with AMI in patients using dedicated chip and validating the identified SNPs on custom-designed chips using high-throughput microarray analysis. Methods: In pilot phase, 48 AMI patients and 48 healthy controls were screened for SNPs using human CVD55K BeadChip with 48,472 SNP probes on Illumina high-throughput microarray platform. The identified SNPs were validated by genotyping additional 160 patients and 179 controls using custom-made Illumina VeraCode GoldenGate Genotyping Assay. Analysis was carried out using PLINK software. Results: From the pilot phase, 98 SNPs present on 94 genes were identified with increased risk of AMI (odds ratio of 1.84-8.85, P=0.04861-0.003337). Five of these SNPs demonstrated association with AMI in the validation phase (P<0.05). Among these, one SNP rs9978223 on interferon gamma receptor 2 [IFNGR2, interferon (IFN)-gamma transducer 1] gene showed a significant association (P=0.00021) with AMI below Bonferroni corrected P value (P=0.00061). IFNGR2 is the second subunit of the receptor for IFN-gamma, an important cytokine in inflammatory reactions. Interpretation & conclusions: The study identified an SNP rs9978223 on IFNGR2 gene, associated with increased risk in AMI patient from India. PMID:29434065
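The validation step above compares a SNP's P value with a Bonferroni-corrected threshold. A minimal sketch of that comparison follows. The abstract does not state how many tests were corrected for; the value of 82 used here is an assumption, picked only because 0.05/82 rounds to the reported threshold of 0.00061. The function names are likewise hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha, n_tests):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# ASSUMPTION: 82 tests is a guess for illustration, not stated in the abstract.
N_TESTS = 82
threshold = bonferroni_threshold(0.05, N_TESTS)
print(f"threshold = {threshold:.5f}")
print(passes_bonferroni(0.00021, 0.05, N_TESTS))  # P value reported for rs9978223
```

Under that assumed number of tests, the reported P = 0.00021 for rs9978223 falls below the corrected threshold, which matches the abstract's statement.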
Venegas-Vega, Carlos A.; Zepeda, Luis M.; Garduño-Zarazúa, Luz M.; Berumen, Jaime; Kofman, Susana; Cervantes, Alicia
2013-01-01
The use of conventional cytogenetic techniques in combination with fluorescent in situ hybridization (FISH) and single-nucleotide polymorphism (SNP) microarrays is necessary for the identification of cryptic rearrangements in the diagnosis of chromosomal syndromes. We report two siblings, a boy of 9 years and 9 months of age and his 7-year- and 5-month-old sister, with the classic Wolf-Hirschhorn syndrome (WHS) phenotype. Using high-resolution GTG- and NOR-banding karyotypes, as well as FISH analysis, we characterized a pure 4p deletion in both sibs and a balanced rearrangement in their father, consisting of an insertion of 4p material within a nucleolar organizing region of chromosome 15. Copy number variant (CNV) analysis using SNP arrays showed that both siblings have a similar size of 4p deletion (~6.5 Mb). Our results strongly support the need for conventional cytogenetic and FISH analysis, as well as high-density microarray mapping for the optimal characterization of the genetic imbalance in patients with WHS; parents must always be studied to recognize cryptic balanced chromosomal rearrangements for adequate genetic counseling. PMID:23484094
He, Xianmin; Wei, Qing; Sun, Meiqian; Fu, Xuping; Fan, Sichang; Li, Yao
2006-05-01
Biological techniques such as array comparative genomic hybridization (CGH), fluorescent in situ hybridization (FISH) and the Affymetrix single nucleotide polymorphism (SNP) array have been used to detect cytogenetic aberrations. However, on a genomic scale, these techniques are labor intensive and time consuming. Comparative genomic microarray analysis (CGMA) has been used to identify cytogenetic changes in hepatocellular carcinoma (HCC) using gene expression microarray data. However, the CGMA algorithm cannot give precise localization of aberrations, fails to identify small cytogenetic changes, and exhibits false negatives and positives. Locally un-weighted smoothing cytogenetic aberrations prediction (LS-CAP), based on local smoothing and the binomial distribution, can be expected to address these problems. The LS-CAP algorithm was built and applied to HCC microarray profiles. Eighteen cytogenetic abnormalities were identified; of these, 5 had been reported previously and 12 were confirmed by CGH studies. LS-CAP effectively reduced the false negatives and positives, and precisely located small fragments with cytogenetic aberrations.
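The abstract above describes LS-CAP only as combining local smoothing with the binomial distribution. The following is a generic sketch of that general idea, not the authors' algorithm: genes are ordered along a chromosome, a fixed-size window slides over them, and a one-sided binomial tail probability flags windows with an excess of up-regulated genes. The window size, the null probability of 0.5, the significance level and all names are assumptions made for illustration.

```python
from math import comb


def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def flag_windows(up_flags, window=10, alpha=0.01):
    """Flag windows enriched for up-regulated genes.

    up_flags: list of bools for genes in chromosomal order; True means the
    gene is expressed more highly in the test sample than in the reference.
    Returns (start_index, end_index_exclusive, p_value) for each flagged window.
    """
    hits = []
    for start in range(len(up_flags) - window + 1):
        k = sum(up_flags[start:start + window])
        p_value = binomial_tail(k, window)
        if p_value < alpha:
            hits.append((start, start + window, p_value))
    return hits


# Toy chromosome: 10 unchanged genes, 10 up-regulated genes, 10 unchanged genes.
flags = [False] * 10 + [True] * 10 + [False] * 10
hits = flag_windows(flags)
print(hits)
```

This sketch tests only for gains; a candidate loss would need the same test applied to down-regulated flags, and overlapping windows would need a multiple-testing correction in any real analysis.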
Bisignano, A; Wells, D; Harton, G; Munné, S
2011-12-01
Diagnosis of embryos for chromosome abnormalities, i.e. aneuploidy screening, has been invigorated by the introduction of microarray-based testing methods allowing analysis of 24 chromosomes in one test. Recent data have been suggestive of increased implantation and pregnancy rates following microarray testing. Preimplantation genetic diagnosis for infertility aims to test for gross chromosome changes with the hope that identification and transfer of normal embryos will improve IVF outcomes. Testing by some methods, specifically single-nucleotide polymorphism (SNP) microarrays, allow for more information and potential insight into parental origin of aneuploidy and uniparental disomy. The usefulness and validity of reporting this information is flawed. Numerous papers have shown that the majority of meiotic errors occur in the egg, while mitotic errors in the embryo affect parental chromosomes at random. Potential mistakes made in assigning an error as meiotic or mitotic may lead to erroneous reporting of results with medical consequences. This study's data suggest that the bioinformatic cleaning used to 'fix' the miscalls that plague single-cell whole-genome amplification provides little improvement in the quality of useful data. Based on the information available, SNP-based aneuploidy screening suffers from a number of serious issues that must be resolved. Copyright © 2011 Reproductive Healthcare Ltd. Published by Elsevier Ltd. All rights reserved.
Hu, Guohong; Wang, Hui-Yun; Greenawalt, Danielle M.; Azaro, Marco A.; Luo, Minjie; Tereshchenko, Irina V.; Cui, Xiangfeng; Yang, Qifeng; Gao, Richeng; Shen, Li; Li, Honghua
2006-01-01
Microarray-based analysis of single nucleotide polymorphisms (SNPs) has many applications in large-scale genetic studies. To minimize the influence of experimental variation, microarray data usually need to be processed in different aspects including background subtraction, normalization and low-signal filtering before genotype determination. Although many algorithms are sophisticated for these purposes, biases are still present. In the present paper, new algorithms for SNP microarray data analysis and the software, AccuTyping, developed based on these algorithms are described. The algorithms take advantage of a large number of SNPs included in each assay, and the fact that the top and bottom 20% of SNPs can be safely treated as homozygous after sorting based on their ratios between the signal intensities. These SNPs are then used as controls for color channel normalization and background subtraction. Genotype calls are made based on the logarithms of signal intensity ratios using two cutoff values, which were determined after training the program with a dataset of ∼160 000 genotypes and validated by non-microarray methods. AccuTyping was used to determine >300 000 genotypes of DNA and sperm samples. The accuracy was shown to be >99%. AccuTyping can be downloaded from . PMID:16982644
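The calling scheme described above (sort SNPs by signal ratio, treat the extreme 20% at each end as homozygous controls, normalize the color channels against them, then call genotypes from log ratios with two cutoffs) can be sketched as follows. This is a simplified illustration and not the AccuTyping implementation: it omits background subtraction and low-signal filtering, and the cutoff values, function name and toy intensities are assumptions.

```python
import math


def call_genotypes(signals, low_cutoff=-1.0, high_cutoff=1.0, control_fraction=0.2):
    """Call 'AA', 'AB' or 'BB' from two-channel intensities.

    signals: list of (allele_a_intensity, allele_b_intensity), all positive.
    The cutoffs are hypothetical; a real tool would derive them from training data.
    """
    if not signals:
        return []
    log_ratios = [math.log2(a / b) for a, b in signals]
    ranked = sorted(log_ratios)
    n_ctrl = max(1, int(len(ranked) * control_fraction))
    # The lowest and highest ratios are assumed to come from homozygous SNPs.
    bottom_mean = sum(ranked[:n_ctrl]) / n_ctrl
    top_mean = sum(ranked[-n_ctrl:]) / n_ctrl
    # Channel normalization: centre the two homozygous groups around zero.
    offset = (top_mean + bottom_mean) / 2
    calls = []
    for log_ratio in log_ratios:
        centred = log_ratio - offset
        if centred >= high_cutoff:
            calls.append("AA")
        elif centred <= low_cutoff:
            calls.append("BB")
        else:
            calls.append("AB")
    return calls


# Toy data in which the A channel reads twice as bright as the B channel.
toy_signals = [(1600, 100)] * 3 + [(200, 100)] * 4 + [(100, 400)] * 3
calls = call_genotypes(toy_signals)
print(calls)
```

In the toy data the heterozygous SNPs have a raw log ratio of 1 because of the channel bias, so they would be miscalled without the centring step; after centring they sit at 0 and are called heterozygous.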
Guo, Xi; Geng, Peng; Wang, Quan; Cao, Boyang; Liu, Bin
2014-10-01
Severe acute respiratory syndrome (SARS), a disease that spread widely in the world during late 2002 to 2004, severely threatened public health. Although there have been no reported infections since 2004, the extremely pathogenic SARS coronavirus (SARS-CoV), as the causative agent of SARS, has recently been identified in animals, showing the potential for the re-emergence of this disease. Previous studies showed that 27 single nucleotide polymorphism (SNP) mutations among the spike (S) gene of this virus are correlated closely with the SARS pathogenicity and epidemicity. We have developed a SNP DNA microarray in order to detect and genotype these SNPs, and to obtain related information on the pathogenicity and epidemicity of a given strain. The microarray was hybridized with PCR products amplified from cDNAs obtained from different SARS-CoV strains. We were able to detect 24 SNPs and determine the type of a given strain. The hybridization profile showed that 19 samples were detected and genotyped correctly by using our microarray, with 100% accuracy. Our microarray provides a novel method for the detection and epidemiological surveillance of SARS-CoV.
Xie, Yanxin; Xu, Yanwen; Wang, Jing; Miao, Benyu; Zeng, Yanhong; Ding, Chenhui; Gao, Jun; Zhou, Canquan
2018-01-01
The aim of this study was to determine whether an interchromosomal effect (ICE) occurred in embryos obtained from reciprocal translocation (rcp) and Robertsonian translocation (RT) carriers who were following a preimplantation genetic diagnosis (PGD) with whole chromosome screening with an aCGH and SNP microarray. We also analyzed the chromosomal numerical abnormalities in embryos with aneuploidy in parental chromosomes that were not involved with a translocation and balanced in involved parental translocation chromosomes. This retrospective study included 832 embryos obtained from rcp carriers and 382 embryos from RT carriers that were biopsied in 139 PGD cycles. The control group involved embryos obtained from age-matched patient karyotypes who were undergoing preimplantation genetic screening (PGS) with non-translocation, and 579 embryos were analyzed in the control group. A single blastomere at the cleavage stage or trophectoderm from a blastocyst was biopsied, and 24-chromosomal analysis with an aCGH/SNP microarray was conducted using the PGD/PGS protocols. Statistical analyses were implemented on the incidences of cumulative aneuploidy rates between the translocation carriers and the control group. Reliable results were obtained from 138 couples, among whom only one patient was a balanced rcp or RT translocation carrier, undergoing PGD testing in our center from January 2012 to June 2014. For day 3 embryos, the aneuploidy rates were 50.7% for rcp carriers and 49.1% for RT carriers, compared with the control group, with 44.8% at a maternal age < 36 years. When the maternal age was ≥ 36 years, the aneuploidy rates were increased to 61.1% for rcp carriers, 56.7% for RT carriers, and 60.3% for the control group. There were no significant differences. In day 5 embryos, the aneuploidy rates were 24.5% for rcp carriers and 34.9% for RT carriers, compared with the control group with 53.6% at a maternal age < 36 years. 
When the maternal age was ≥ 36 years, the aneuploidy rates were 10.7% for rcp carriers, 26.3% for RT carriers, and 57.1% for the control group. The cumulative aneuploidy rates of chromosome translocation carriers were significantly lower than the control group. No ICE was observed in cleavage and blastocyst stage embryos obtained from these carriers. Additionally, the risk of chromosomal numerical abnormalities was observed in each of the 23 pairs of autosomes or sex chromosomes from day 3 and day 5 embryos. There was not enough evidence to prove that ICE was present in embryos derived from both rcp and RT translocation carriers, regardless of the maternal age. However, chromosomal numerical abnormalities were noticed in 23 pairs of autosomes and sex chromosomes in parental structurally normal chromosomes. Thus, 24-chromosomal analysis with an aCGH/SNP microarray PGD protocol is required to decrease the risks of failure to diagnose aneuploidy in structurally normal chromosomes.
Karampetsou, Evangelia; Morrogh, Deborah; Chitty, Lyn
2014-01-01
The advantage of microarray (array) over conventional karyotype for the diagnosis of fetal pathogenic chromosomal anomalies has prompted the use of microarrays in prenatal diagnostics. In this review we compare the performance of different array platforms (BAC, oligonucleotide CGH, SNP) and designs (targeted; whole genome; whole genome and targeted; custom) and discuss their advantages and disadvantages in relation to prenatal testing. We also discuss the factors to consider when implementing a microarray testing service for the diagnosis of fetal chromosomal aberrations. PMID:26237396
Bruno, D L; Ganesamoorthy, D; Schoumans, J; Bankier, A; Coman, D; Delatycki, M; Gardner, R J M; Hunter, M; James, P A; Kannu, P; McGillivray, G; Pachter, N; Peters, H; Rieubland, C; Savarirayan, R; Scheffer, I E; Sheffield, L; Tan, T; White, S M; Yeung, A; Bowman, Z; Ngo, C; Choy, K W; Cacheux, V; Wong, L; Amor, D J; Slater, H R
2009-02-01
Microarray genome analysis is realising its promise for improving detection of genetic abnormalities in individuals with mental retardation and congenital abnormality. Copy number variations (CNVs) are now readily detectable using a variety of platforms and a major challenge is the distinction of pathogenic from ubiquitous, benign polymorphic CNVs. The aim of this study was to investigate replacement of time consuming, locus specific testing for specific microdeletion and microduplication syndromes with microarray analysis, which theoretically should detect all known syndromes with CNV aetiologies as well as new ones. Genome wide copy number analysis was performed on 117 patients using Affymetrix 250K microarrays. 434 CNVs (195 losses and 239 gains) were found, including 18 pathogenic CNVs and 9 identified as "potentially pathogenic". Almost all pathogenic CNVs were larger than 500 kb, significantly larger than the median size of all CNVs detected. Segmental regions of loss of heterozygosity larger than 5 Mb were found in 5 patients. Genome microarray analysis has improved diagnostic success in this group of patients. Several examples of recently discovered "new syndromes" were found suggesting they are more common than previously suspected and collectively are likely to be a major cause of mental retardation. The findings have several implications for clinical practice. The study revealed the potential to make genetic diagnoses that were not evident in the clinical presentation, with implications for pretest counselling and the consent process. The importance of contributing novel CNVs to high quality databases for genotype-phenotype analysis and review of guidelines for selection of individuals for microarray analysis is emphasised.
Fine-scaled human genetic structure revealed by SNP microarrays.
Xing, Jinchuan; Watkins, W Scott; Witherspoon, David J; Zhang, Yuhua; Guthery, Stephen L; Thara, Rangaswamy; Mowry, Bryan J; Bulayeva, Kazima; Weiss, Robert B; Jorde, Lynn B
2009-05-01
We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors that have contributed to population structure.
Jain, K K
2001-02-01
Cambridge Healthtech Institute's Third Annual Conference on Lab-on-a-Chip and Microarray technology covered the latest advances in this technology and applications in life sciences. Highlights of the meetings are reported briefly with emphasis on applications in genomics, drug discovery and molecular diagnostics. There was an emphasis on microfluidics because of the wide applications in laboratory and drug discovery. The lab-on-a-chip provides the facilities of a complete laboratory in a hand-held miniature device. Several microarray systems have been used for hybridisation and detection techniques. Oligonucleotide scanning arrays provide a versatile tool for the analysis of nucleic acid interactions and provide a platform for improving the array-based methods for investigation of antisense therapeutics. A method for analysing combinatorial DNA arrays using oligonucleotide-modified gold nanoparticle probes and a conventional scanner has considerable potential in molecular diagnostics. Various applications of microarray technology for high-throughput screening in drug discovery and single nucleotide polymorphisms (SNP) analysis were discussed. Protein chips have important applications in proteomics. With the considerable amount of data generated by the different technologies using microarrays, it is obvious that the reading of the information and its interpretation and management through the use of bioinformatics is essential. Various techniques for data analysis were presented. Biochip and microarray technology has an essential role to play in the evolving trends in healthcare, which integrate diagnosis with prevention/treatment and emphasise personalised medicines.
GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies
Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio
2013-01-01
We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243
Report for the NGFA-5 project.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jaing, C; Jackson, P; Thissen, J
The objective of this project is to provide DHS a comprehensive evaluation of the current genomic technologies including genotyping, TaqMan PCR, multiple locus variable tandem repeat analysis (MLVA), microarray and high-throughput DNA sequencing in the analysis of biothreat agents from complex environmental samples. To effectively compare the sensitivity and specificity of the different genomic technologies, we used SNP TaqMan PCR, MLVA, microarray and high-throughput Illumina and 454 sequencing to test various strains from B. anthracis, B. thuringiensis, BioWatch aerosol filter extracts or soil samples that were spiked with B. anthracis, and samples that were previously collected during DHS and EPA environmental release exercises that were known to contain B. thuringiensis spores. The results of all the samples against the various assays are discussed in this report.
Gao, Z J; Jiang, Q; Cheng, D Z; Yan, X X; Chen, Q; Xu, K M
2016-10-02
Objective: To evaluate the application of single nucleotide polymorphism (SNP)-microarray and target gene sequencing technology in the clinical molecular genetic diagnosis of unexplained intellectual disability (ID) or developmental delay (DD). Method: Patients with ID or DD were recruited in the Department of Neurology, Affiliated Children's Hospital of Capital Institute of Pediatrics between September 2015 and February 2016. The intellectual assessment of the patients was performed using 0-6-year-old pediatric examination table of neuropsychological development or Wechsler intelligence scale (>6 years). Patients with a DQ less than 49 or IQ less than 51 were included in this study. The patients were scanned by SNP-array for detection of genomic copy number variations (CNV), and the revealed genomic imbalance was confirmed by quantitative real time-PCR. Candidate gene mutation screening was carried out by target gene sequencing technology. Causal mutations or likely pathogenic variants were verified by polymerase chain reaction and direct sequencing. Result: There were 15 children with ID or DD enrolled, 9 males and 6 females. The age of these patients ranged from 7 months to 16 years and 9 months. SNP-array revealed that two of the 15 patients had genomic CNV. Both CNV were de novo microdeletions; one involved 11q24.1q25 and the other was located on 21q22.2q22.3. Both microdeletions were shown to be clinically significant by querying the Decipher database, owing to their association with ID, brain DD, unusual faces etc. Thirteen patients with negative findings in SNP-array were consequently examined with target gene sequencing technology, genotype-phenotype correlation analysis and genetic analysis. Five patients were diagnosed with monogenic disorder, two were diagnosed with suspected genetic disorder and six were still negative.
Conclusion: Sequential use of SNP-array and target gene sequencing technology can significantly increase the molecular genetic etiologic diagnosis rate of the patients with unexplained ID or DD. Combined use of these technologies can serve as a useful examinational method in assisting differential diagnosis of children with unexplained ID or DD.
Grote, Lauren; Myers, Melanie; Lovell, Anne; Saal, Howard; Sund, Kristen Lipscomb
2014-01-01
SNP microarrays are capable of detecting regions of homozygosity (ROH) which can suggest parental relatedness. This study was designed to describe pre- and post-test counseling practices of genetics professionals regarding ROH, explore perceived comfort and ethical concerns in the follow-up of such results, demonstrate awareness of laws surrounding duty to report consanguinity and incest, and allow respondents to share their personal experiences with results suggesting a parental relationship. A 35 question survey was administered to 240 genetic counselors and geneticists who had ordered or counseled for SNP microarray. The results are presented using descriptive statistics. There was variation in both pre- and post-test counseling practices of genetics professionals. Twenty-five percent of respondents reported pre-test counseling that ROH can indicate parental relatedness. The most commonly reported ethical concern was disclosure of findings suggesting parental relatedness to parents of the patient; only 48.4% reported disclosing parental relatedness when indicated. Fifty-seven percent felt comfortable receiving results suggesting parental consanguinity while 17% felt comfortable receiving results suggesting parental incest. Twenty percent of respondents were extremely/moderately familiar with the laws about duty to report incest. Personal experiences in post-test counseling included both parental acknowledgement and denial of relatedness. This study highlights the differences in genetics professionals' pre- and post-test counseling practices, comfort, and experiences surrounding parental relatedness suggested by SNP microarray results. It identifies a need for professional organizations to offer guidance to genetics professionals about how to respond to and counsel for molecular results suggesting parental consanguinity or incest. © 2013 Wiley Periodicals, Inc.
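Regions of homozygosity of the kind mentioned above appear in SNP array data as long stretches of consecutive homozygous genotype calls. The sketch below scans one chromosome for such stretches. It is a deliberately simple illustration: the length and SNP-count thresholds are hypothetical, the function name is an assumption, and unlike production tools it does not tolerate occasional heterozygous calls caused by genotyping error.

```python
def find_roh(snps, min_length_bp=5_000_000, min_snps=3):
    """Find runs of homozygosity on a single chromosome.

    snps: list of (position_bp, genotype) sorted by position, with genotype
    one of 'AA', 'AB', 'BB'. Returns a list of (start_bp, end_bp, n_snps).
    """
    runs = []
    current = []  # positions in the homozygous run being extended

    def close_run():
        # Keep the run only if it is long enough and has enough markers.
        if len(current) >= min_snps and current[-1] - current[0] >= min_length_bp:
            runs.append((current[0], current[-1], len(current)))

    for position, genotype in snps:
        if genotype in ("AA", "BB"):
            current.append(position)
        else:
            close_run()
            current = []
    close_run()  # handle a run that reaches the end of the chromosome
    return runs


# Toy genotype calls, positions in base pairs (hypothetical data).
toy_snps = [
    (1_000_000, "AA"), (2_000_000, "BB"), (3_000_000, "AB"),
    (4_000_000, "AA"), (6_000_000, "AA"), (8_000_000, "BB"),
    (10_000_000, "BB"), (11_000_000, "AB"),
]
roh = find_roh(toy_snps)
print(roh)
```

Whether a given total of homozygous sequence suggests parental relatedness is an interpretive question that this sketch does not address.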
Aravind Kumar, M; Singh, Vineeta; Naushad, Shaik Mohammad; Shanker, Uday; Lakshmi Narasu, M
2018-05-01
In view of the aggressive nature of triple-negative breast cancer (TNBC), which lacks receptors (ER, PR, HER2) and is associated with a high incidence of drug resistance, a case-control association study was conducted to identify the contributing genetic risk factors for TNBC. A total of 30 TNBC patients and 50 age and gender-matched controls of Indian origin were screened for 900,000 SNP markers using a microarray-based SNP genotyping approach. The initial PLINK association analysis (p < 0.01, MAF 0.14-0.44, OR 10-24) identified 28 non-synonymous SNPs and one stop gain mutation in the exonic region as possible determinants of TNBC risk. All the 29 SNPs were annotated using ANNOVAR. The interactions between these markers were evaluated using Multifactor dimensionality reduction (MDR) analysis. The interactions were in the following order: exm408776 > exm1278309 > rs316389 > rs1651654 > rs635538 > exm1292477. Recursive partitioning analysis (RPA) was performed to construct a decision tree useful in predicting TNBC risk. As shown in this analysis, the rs1651654 and exm585172 SNPs were found to be determinants of TNBC risk. An artificial neural network model was used to generate the receiver operating characteristic (ROC) curves, which showed high sensitivity and specificity (AUC 0.94) of these markers. To conclude, among the 900,000 SNPs tested, CCDC42 exm1292477, ANXA3 exm408776 and SASH1 exm585172 were found to be the most significant genetic predictors of TNBC. The interactions among the exm408776, exm1278309, rs316389, rs1651654, rs635538 and exm1292477 SNPs inflate the risk for TNBC further. Targeted analysis of these SNPs and genes alone will also have similar clinical utility in predicting TNBC.
Combined array CGH plus SNP genome analyses in a single assay for optimized clinical testing
Wiszniewska, Joanna; Bi, Weimin; Shaw, Chad; Stankiewicz, Pawel; Kang, Sung-Hae L; Pursley, Amber N; Lalani, Seema; Hixson, Patricia; Gambin, Tomasz; Tsai, Chun-hui; Bock, Hans-Georg; Descartes, Maria; Probst, Frank J; Scaglia, Fernando; Beaudet, Arthur L; Lupski, James R; Eng, Christine; Wai Cheung, Sau; Bacino, Carlos; Patel, Ankita
2014-01-01
In clinical diagnostics, both array comparative genomic hybridization (array CGH) and single nucleotide polymorphism (SNP) genotyping have proven to be powerful genomic technologies utilized for the evaluation of developmental delay, multiple congenital anomalies, and neuropsychiatric disorders. Differences in the ability to resolve genomic changes between these arrays may constitute an implementation challenge for clinicians: which platform (SNP vs array CGH) might best detect the underlying genetic cause for the disease in the patient? While only SNP arrays enable the detection of copy number neutral regions of absence of heterozygosity (AOH), they have limited ability to detect single-exon copy number variants (CNVs) due to the distribution of SNPs across the genome. To provide comprehensive clinical testing for both CNVs and copy-neutral AOH, we enhanced our custom-designed high-resolution oligonucleotide array that has exon-targeted coverage of 1860 genes with 60 000 SNP probes, referred to as Chromosomal Microarray Analysis – Comprehensive (CMA-COMP). Of the 3240 cases evaluated by this array, clinically significant CNVs were detected in 445 cases including 21 cases with exonic events. In addition, 162 cases (5.0%) showed at least one AOH region >10 Mb. We demonstrate that even though this array has a lower density of SNP probes than other commercially available SNP arrays, it reliably detected AOH events >10 Mb as well as exonic CNVs beyond the detection limitations of SNP genotyping. Thus, combining SNP probes and exon-targeted array CGH into one platform provides clinically useful genetic screening in an efficient manner. PMID:23695279
Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.
Guzzi, Pietro Hiram; Cannataro, Mario
2013-08-01
A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate), regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays, which were not supported in μ-CS. Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power Tools), (ii) the manual loading of preprocessing libraries, and (iii) the management of intermediate files, such as results and metadata. Micro-Analyzer users can directly manage Affymetrix binary data without worrying about locating and invoking the proper preprocessing tools and chip-specific libraries. Moreover, users of the Micro-Analyzer tool can load the preprocessed data directly into the well-known TM4 platform, thereby also extending the TM4 capabilities. Consequently, Micro-Analyzer offers the following advantages: (i) it reduces possible errors in the preprocessing and further analysis phases, e.g. due to the incorrect choice of parameters or due to the use of old libraries, (ii) it enables the combined and centralized preprocessing of different arrays, (iii) it may enhance the quality of further analysis by storing the workflow, i.e. information about the preprocessing steps, and (iv) finally, Micro-Analyzer is freely available as a standalone application at the project web site http://sourceforge.net/projects/microanalyzer/. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Quantitative phenotyping via deep barcode sequencing.
Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey
2009-10-01
Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.
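The barcode-counting idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the fixed barcode position, exact-match lookup, and the pseudocount normalization are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def count_barcodes(reads, barcode_to_strain, start=0, length=20):
    """Count reads per strain by exact barcode match at a fixed read position.

    `start` and `length` are hypothetical parameters; real reads would
    need the position and length used in the actual library design.
    """
    counts = Counter()
    for read in reads:
        strain = barcode_to_strain.get(read[start:start + length])
        if strain is not None:  # reads with unknown barcodes are ignored
            counts[strain] += 1
    return counts


def log2_depletion(control, treated, pseudocount=1.0):
    """log2(control / treated) of depth-normalized counts per strain.

    Positive values mean the strain is depleted under treatment.
    The pseudocount (an assumption) avoids division by zero.
    """
    strains = set(control) | set(treated)
    c_total = sum(control.values()) + pseudocount * len(strains)
    t_total = sum(treated.values()) + pseudocount * len(strains)
    return {
        s: math.log2(
            ((control.get(s, 0) + pseudocount) / c_total)
            / ((treated.get(s, 0) + pseudocount) / t_total)
        )
        for s in strains
    }
```

A strain whose barcode is under-represented in the treated pool relative to the control pool receives a positive score, which is the kind of signal used to nominate candidate drug targets.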
Miyamoto, T; Koh, E; Tsujimura, A; Miyagawa, Y; Saijo, Y; Namiki, M; Sengoku, K
2014-04-01
Genetic mechanisms have been implicated as a cause of some cases of male infertility. Recently, ten novel genes involved in human spermatogenesis, including human LRWD1, have been identified by expression microarray analysis of human testicular tissue. The human LRWD1 protein mediates the origin recognition complex in chromatin, which is critical for the initiation of pre-replication complex assembly in G1 and chromatin organization in post-G1 cells. Lrwd1 gene expression is specific to the testis in mice. Therefore, we hypothesized that mutations or polymorphisms of LRWD1 participate in male infertility, especially azoospermia. To investigate whether LRWD1 gene defects are associated with azoospermia caused by Sertoli cell-only syndrome (SCOS) and meiotic arrest (MA), mutational analysis was performed by direct sequencing of the coding regions in 100 and 30 Japanese patients, respectively. Statistical analysis was performed for patients with SCOS and MA and for 100 healthy control men. No mutations were found in LRWD1; however, three coding single-nucleotide polymorphisms (SNP1-SNP3) were detected in the patients. The genotype and allele frequencies of SNP1 and SNP2 were notably higher in the SCOS group than in the control group (P < 0.05). These results suggest a critical role of LRWD1 in human spermatogenesis. © 2013 Blackwell Verlag GmbH.
Analysis and visualization of chromosomal abnormalities in SNP data with SNPscan
Ting, Jason C; Ye, Ying; Thomas, George H; Ruczinski, Ingo; Pevsner, Jonathan
2006-01-01
Background A variety of diseases are caused by chromosomal abnormalities such as aneuploidies (having an abnormal number of chromosomes), microdeletions, microduplications, and uniparental disomy. High density single nucleotide polymorphism (SNP) microarrays provide information on chromosomal copy number changes, as well as genotype (heterozygosity and homozygosity). SNP array studies generate multiple types of data for each SNP site, some with more than 100,000 SNPs represented on each array. The identification of different classes of anomalies within SNP data has been challenging. Results We have developed SNPscan, a web-accessible tool to analyze and visualize high density SNP data. It enables researchers (1) to visually and quantitatively assess the quality of user-generated SNP data relative to a benchmark data set derived from a control population, (2) to display SNP intensity and allelic call data in order to detect chromosomal copy number anomalies (duplications and deletions), (3) to display uniparental isodisomy based on loss of heterozygosity (LOH) across genomic regions, (4) to compare paired samples (e.g. tumor and normal), and (5) to generate a file type for viewing SNP data in the University of California, Santa Cruz (UCSC) Human Genome Browser. SNPscan accepts data exported from Affymetrix Copy Number Analysis Tool as its input. We validated SNPscan using data generated from patients with known deletions, duplications, and uniparental disomy. We also inspected previously generated SNP data from 90 apparently normal individuals from the Centre d'Étude du Polymorphisme Humain (CEPH) collection, and identified three cases of uniparental isodisomy, four females having an apparently mosaic X chromosome, two mislabelled SNP data sets, and one microdeletion on chromosome 2 with mosaicism from an apparently normal female. These previously unrecognized abnormalities were all detected using SNPscan. 
The microdeletion was independently confirmed by fluorescence in situ hybridization, and a region of homozygosity in a UPD case was confirmed by sequencing of genomic DNA. Conclusion SNPscan is useful to identify chromosomal abnormalities based on SNP intensity (such as chromosomal copy number changes) and heterozygosity data (including regions of LOH and some cases of UPD). The program and source code are available at the SNPscan website . PMID:16420694
High Frequency of Copy-Neutral Loss of Heterozygosity in Patients with Myelofibrosis.
Rego de Paula Junior, Milton; Nonino, Alexandre; Minuncio Nascimento, Juliana; Bonadio, Raphael S; Pic-Taylor, Aline; de Oliveira, Silviene F; Wellerson Pereira, Rinaldo; do Couto Mascarenhas, Cintia; Forte Mazzeu, Juliana
2018-01-01
Myelofibrosis is the rarest and most severe type of Philadelphia-negative classical myeloproliferative neoplasms. Although mutually exclusive driver mutations in JAK2, MPL, or CALR that activate the JAK-STAT pathway have been related to the pathogenesis of the disease, chromosome abnormalities have also been associated with the phenotype and prognosis of the disease. Here, we report the use of a chromosomal microarray platform consisting of both oligo and SNP probes to improve the detection of chromosome abnormalities in patients with myelofibrosis. Sixteen patients with myelofibrosis were tested, and the results were compared to karyotype analysis. Driver mutations in JAK2, MPL, or CALR were investigated by PCR and MLPA. Conventional cytogenetics revealed chromosome abnormalities in 3 out of 16 cases (18.7%), while chromosomal microarray analysis detected copy-number variations (CNV) or copy-neutral loss of heterozygosity (CN-LOH) alterations in 11 out of 16 (68.7%) patients. These included 43 CN-LOH, 14 deletions, 1 trisomy, and 1 duplication. Ten patients showed multiple chromosomal abnormalities, varying from 2 to 13 CNVs or CN-LOHs. Mutational status for JAK2, CALR, and MPL by MLPA revealed a total of 3/16 (18.7%) patients positive for the JAK2 V617F mutation, 9 with a CALR deletion or insertion, and 1 positive for an MPL mutation. Considering that most of the CNVs identified were smaller than the karyotype resolution and the high frequency of CN-LOHs in our study, we propose that chromosomal microarray platforms that combine oligo and SNP probes should be used as a first-tier genetic test in patients with myelofibrosis. © 2018 S. Karger AG, Basel.
Pounds, Stan; Cao, Xueyuan; Cheng, Cheng; Yang, Jun; Campana, Dario; Evans, William E.; Pui, Ching-Hon; Relling, Mary V.
2010-01-01
Powerful methods for integrated analysis of multiple biological data sets are needed to maximize interpretation capacity and acquire meaningful knowledge. We recently developed Projection Onto the Most Interesting Statistical Evidence (PROMISE). PROMISE is a statistical procedure that incorporates prior knowledge about the biological relationships among endpoint variables into an integrated analysis of microarray gene expression data with multiple biological and clinical endpoints. Here, PROMISE is adapted to the integrated analysis of pharmacologic, clinical, and genome-wide genotype data, incorporating knowledge about the biological relationships among pharmacologic and clinical response data. An efficient permutation-testing algorithm is introduced so that statistical calculations are computationally feasible in this higher-dimension setting. The new method is applied to a pediatric leukemia data set. The results clearly indicate that PROMISE is a powerful statistical tool for identifying genomic features that exhibit a biologically meaningful pattern of association with multiple endpoint variables. PMID:21516175
Dutra, Roberta L; Piazzon, Flavia B; Zanardo, Évelin A; Costa, Thais Virginia Moura Machado; Montenegro, Marília M; Novo-Filho, Gil M; Dias, Alexandre T; Nascimento, Amom M; Kim, Chong Ae; Kulikowski, Leslie D
2015-12-01
Williams-Beuren syndrome (WBS) is caused by a hemizygous contiguous gene microdeletion of 1.55-1.84 Mb at the 7q11.23 region. Approximately 28 genes have been shown to contribute to the classical phenotype of WBS, with the presence of dysmorphic facial features, supravalvular aortic stenosis (SVAS), intellectual disability, and overfriendliness. With the use of microarray-based comparative genomic hybridization and other molecular cytogenetic techniques, it is possible to define partial or atypical deletions with more accuracy and to refine the genotype-phenotype correlation. Here, we report on a rare genomic structural rearrangement in a boy with an atypical deletion in 7q11.23 and XYY syndrome, with characteristic clinical signs that were not sufficient for the diagnosis of WBS. Cytogenetic analysis by G-banding showed a 47,XYY karyotype. Analysis of DNA by MLPA (Multiplex Ligation-dependent Probe Amplification) using a combination of kits (P064, P036, P070, and P029) identified an atypical deletion on 7q11.23. In addition, high-resolution SNP Oligonucleotide Microarray Analysis (SNP-array) confirmed the alterations found by MLPA and revealed other pathogenic CNVs on chromosomes 7 and X. The present report demonstrates an association not yet described in the literature between Williams-Beuren syndrome and 47,XYY. The identification of an atypical deletion in 7q11.23 concomitant with additional pathogenic CNVs in other genomic regions allows a better comprehension of the clinical consequences of atypical genomic rearrangements. © 2015 Wiley Periodicals, Inc.
Dumitriu, Alexandra; Latourelle, Jeanne C; Hadzi, Tiffany C; Pankratz, Nathan; Garza, Dan; Miller, John P; Vance, Jeffery M; Foroud, Tatiana; Beach, Thomas G; Myers, Richard H
2012-06-01
Parkinson disease (PD) is a complex neurodegenerative disorder with largely unknown genetic mechanisms. While the degeneration of dopaminergic neurons in PD mainly takes place in the substantia nigra pars compacta (SN) region, other brain areas, including the prefrontal cortex, develop Lewy bodies, the neuropathological hallmark of PD. We generated and analyzed expression data from the prefrontal cortex Brodmann Area 9 (BA9) of 27 PD and 26 control samples using the 44K One-Color Agilent 60-mer Whole Human Genome Microarray. All samples were male, without significant Alzheimer disease pathology and with extensive pathological annotation available. 507 of the 39,122 analyzed expression probes were different between PD and control samples at false discovery rate (FDR) of 5%. One of the genes with significantly increased expression in PD was the forkhead box O1 (FOXO1) transcription factor. Notably, genes carrying the FoxO1 binding site were significantly enriched in the FDR-significant group of genes (177 genes covered by 189 probes), suggesting a role for FoxO1 upstream of the observed expression changes. Single-nucleotide polymorphisms (SNPs) selected from a recent meta-analysis of PD genome-wide association studies (GWAS) were successfully genotyped in 50 out of the 53 microarray brains, allowing a targeted expression-SNP (eSNP) analysis for 52 SNPs associated with PD affection at genome-wide significance and the 189 probes from FoxO1 regulated genes. A significant association was observed between a SNP in the cyclin G associated kinase (GAK) gene and a probe in the spermine oxidase (SMOX) gene. Further examination of the FOXO1 region in a meta-analysis of six available GWAS showed two SNPs significantly associated with age at onset of PD. These results implicate FOXO1 as a PD-relevant gene and warrant further functional analyses of its transcriptional regulatory mechanisms.
Dumitriu, Alexandra; Latourelle, Jeanne C.; Hadzi, Tiffany C.; Pankratz, Nathan; Garza, Dan; Miller, John P.; Vance, Jeffery M.; Foroud, Tatiana; Beach, Thomas G.; Myers, Richard H.
2012-01-01
Parkinson disease (PD) is a complex neurodegenerative disorder with largely unknown genetic mechanisms. While the degeneration of dopaminergic neurons in PD mainly takes place in the substantia nigra pars compacta (SN) region, other brain areas, including the prefrontal cortex, develop Lewy bodies, the neuropathological hallmark of PD. We generated and analyzed expression data from the prefrontal cortex Brodmann Area 9 (BA9) of 27 PD and 26 control samples using the 44K One-Color Agilent 60-mer Whole Human Genome Microarray. All samples were male, without significant Alzheimer disease pathology and with extensive pathological annotation available. 507 of the 39,122 analyzed expression probes were different between PD and control samples at false discovery rate (FDR) of 5%. One of the genes with significantly increased expression in PD was the forkhead box O1 (FOXO1) transcription factor. Notably, genes carrying the FoxO1 binding site were significantly enriched in the FDR–significant group of genes (177 genes covered by 189 probes), suggesting a role for FoxO1 upstream of the observed expression changes. Single-nucleotide polymorphisms (SNPs) selected from a recent meta-analysis of PD genome-wide association studies (GWAS) were successfully genotyped in 50 out of the 53 microarray brains, allowing a targeted expression–SNP (eSNP) analysis for 52 SNPs associated with PD affection at genome-wide significance and the 189 probes from FoxO1 regulated genes. A significant association was observed between a SNP in the cyclin G associated kinase (GAK) gene and a probe in the spermine oxidase (SMOX) gene. Further examination of the FOXO1 region in a meta-analysis of six available GWAS showed two SNPs significantly associated with age at onset of PD. These results implicate FOXO1 as a PD–relevant gene and warrant further functional analyses of its transcriptional regulatory mechanisms. PMID:22761592
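The abstract reports probes that differ at a false discovery rate of 5% but does not name the procedure used. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name and the choice of procedure are assumptions, not details taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if significant at the given FDR.

    Benjamini-Hochberg step-up: sort the m p-values, find the largest
    rank k with p_(k) <= (k / m) * fdr, and call the k smallest
    p-values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank  # keep the largest rank that passes
    significant = [False] * m
    for idx in order[:max_rank]:
        significant[idx] = True
    return significant
```

Applied to one p-value per expression probe, the number of `True` entries would correspond to the count of probes reported as different between groups, under the stated assumption about the procedure.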
Novel approach for deriving genome wide SNP analysis data from archived blood spots
2012-01-01
Background The ability to transport and store DNA at room temperature in low volumes has the advantage of optimising cost, time and storage space. Blood spots on adapted filter papers are popular for this, with FTA (Flinders Technology Associates) Whatman™ technology being one of the most recent. Plant material, plasmids, viral particles, bacteria and animal blood have been stored and transported successfully using this technology; however, the extraction of porcine DNA from FTA Whatman™ cards, which makes nucleic acids ready for downstream applications such as PCR, whole genome amplification, sequencing and subsequent application to single nucleotide polymorphism microarrays, is a relatively new approach that has hitherto been under-explored. Findings DNA was extracted from FTA Whatman™ cards (following adaptations of the manufacturer’s instructions), whole genome amplified and subsequently analysed to validate the integrity of the DNA for downstream SNP analysis. DNA was successfully extracted from 288/288 samples and amplified by WGA. Allele dropout post-WGA was observed in less than 2% of samples, and there was no clear evidence of amplification bias or contamination. Acceptable call rates on porcine SNP chips were also achieved using DNA extracted and amplified in this way. Conclusions DNA extracted from FTA Whatman™ cards is of a high enough quality and quantity following whole genome amplification to perform meaningful SNP chip studies. PMID:22974252
[Phenotype-genotype correlation analysis of 12 cases with Angelman/Prader-Willi syndrome].
Chen, Chen; Peng, Ying; Xia, Yan; Li, Haoxian; Zhu, Huimin; Pan, Qian; Yin, Fei; Wu, Lingqian
2014-12-01
To investigate the genotype-phenotype correlation in patients with Angelman syndrome/Prader-Willi syndrome (AS/PWS) and assess the application value of high-resolution single nucleotide polymorphism microarrays (SNP array) for such diseases. Twelve AS/PWS patients were diagnosed through SNP array, fluorescence in situ hybridization (FISH) and karyotype analysis. Clinical characteristics were analyzed. Deletions ranging from 4.8 Mb to 7.0 Mb on chromosome 15q11.2-13 were detected in 11 patients. Uniparental disomy (UPD) was detected in only 1 patient. Patients with deletions could be divided into 2 groups, including 7 cases with class I and 4 with class II deletions. The two groups, however, had no significant phenotypic difference. The UPD patient had relatively better development and language ability. The deletions in 6 patients were confirmed by FISH to be de novo in origin. The risk to their sibs was determined to be less than 1%. The phenotypic differences between AS/PWS patients with class I and class II deletions need to be further studied. SNP array is useful in detecting and distinguishing patients with deletion or UPD. This method may be applied for studying the genotype-phenotype association and the mechanism underlying AS/PWS.
SNP discovery and genotyping using Genotyping-by-Sequencing in Pekin ducks.
Zhu, Feng; Cui, Qian-Qian; Hou, Zhuo-Cheng
2016-11-15
Genomic selection and genome-wide association studies need thousands to millions of SNPs. However, many non-model species do not have reference chips for detecting variation. Our goal was to develop and validate an inexpensive but effective method for detecting SNP variation. Genotyping by sequencing (GBS) can be a highly efficient strategy for genome-wide SNP detection, as an alternative to microarray chips. Here, we developed a GBS protocol for ducks and tested it by genotyping 49 Pekin ducks. A total of 169,209 SNPs were identified from all animals, with a mean of 55,920 SNPs per individual. The average SNP density reached 1156 SNPs/Mb. In this study, the first application of GBS to ducks, we demonstrate the power and simplicity of this method. GBS can therefore provide an effective method for genome-wide SNP discovery in genetic studies.
Quantitative phenotyping via deep barcode sequencing
Smith, Andrew M.; Heisler, Lawrence E.; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J.; Chee, Mark; Roth, Frederick P.; Giaever, Guri; Nislow, Corey
2009-01-01
Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or “Bar-seq,” outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that ∼20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene–environment interactions on a genome-wide scale. PMID:19622793
Patel, Isha R.; Gangiredla, Jayanthi; Lacher, David W.; Mammel, Mark K.; Jackson, Scott A.; Lampel, Keith A.
2016-01-01
ABSTRACT Most Escherichia coli strains are nonpathogenic. However, for clinical diagnosis and food safety analysis, current identification methods for pathogenic E. coli either are time-consuming and/or provide limited information. Here, we utilized a custom DNA microarray with informative genetic features extracted from 368 sequence sets for rapid and high-throughput pathogen identification. The FDA Escherichia coli Identification (FDA-ECID) platform contains three sets of molecularly informative features that together stratify strain identification and relatedness. First, 53 known flagellin alleles, 103 alleles of wzx and wzy, and 5 alleles of wzm provide molecular serotyping utility. Second, 41,932 probe sets representing the pan-genome of E. coli provide strain-level gene content information. Third, approximately 125,000 single nucleotide polymorphisms (SNPs) of available whole-genome sequences (WGS) were distilled to 9,984 SNPs capable of recapitulating the E. coli phylogeny. We analyzed 103 diverse E. coli strains with available WGS data, including those associated with past foodborne illnesses, to determine robustness and accuracy. The array was able to accurately identify the molecular O and H serotypes, potentially correcting serological failures and providing better resolution for H-nontypeable/nonmotile phenotypes. In addition, molecular risk assessment was possible with key virulence marker identifications. Epidemiologically, each strain had a unique comparative genomic fingerprint that was extended to an additional 507 food and clinical isolates. Finally, a 99.7% phylogenetic concordance was established between microarray analysis and WGS using SNP-level data for advanced genome typing. Our study demonstrates FDA-ECID as a powerful tool for epidemiology and molecular risk assessment with the capacity to profile the global landscape and diversity of E. coli. 
IMPORTANCE This study describes a robust, state-of-the-art platform developed from available whole-genome sequences of E. coli and Shigella spp. by distilling useful signatures for epidemiology and molecular risk assessment into one assay. The FDA-ECID microarray contains features that enable comprehensive molecular serotyping and virulence profiling along with genome-scale genotyping and SNP analysis. Hence, it is a molecular toolbox that stratifies strain identification and pathogenic potential in the contexts of epidemiology and phylogeny. We applied this tool to strains from food, environmental, and clinical sources, resulting in significantly greater phylogenetic and strain-specific resolution than previously reported for available typing methods. PMID:27037122
UPD detection using homozygosity profiling with a SNP genotyping microarray.
Papenhausen, Peter; Schwartz, Stuart; Risheg, Hiba; Keitges, Elisabeth; Gadi, Inder; Burnside, Rachel D; Jaswaney, Vikram; Pappas, John; Pasion, Romela; Friedman, Kenneth; Tepperberg, James
2011-04-01
Single nucleotide polymorphism (SNP) based chromosome microarrays provide high-density whole-genome analysis of both copy number and genotype. In the past 21 months we have analyzed over 13,000 samples, primarily referred for developmental delay, using the Affymetrix SNP/CN 6.0 version array platform. In addition to copy number, we have focused on the relative distribution of allele homozygosity (HZ) throughout the genome to confirm a strong association of uniparental disomy (UPD) with regions of isoallelism found in most confirmed cases of UPD. We sought to determine whether a long contiguous stretch of HZ (LCSH) greater than a threshold value, found only in a single chromosome, would correlate with UPD of that chromosome. Nine confirmed UPD cases were retrospectively analyzed with the array in the study, each showing the anticipated LCSH, with the smallest being 13.5 Mb in length. This length is well above the average longest run of HZ in a set of control patients and was then set as the prospective threshold for reporting possible UPD correlation. Ninety-two cases qualified at that threshold; 46 of those had molecular UPD testing, and 29 were positive. Including retrospective cases, 16 showed complete HZ across the chromosome, consistent with total isoUPD. The average size of the LCSH in the 19 cases that were not completely HZ was 46.3 Mb, with a range of 13.5-127.8 Mb. Three patients showed only segmental UPD. Both the size and location of the LCSH are relevant to correlation with UPD. Further studies will continue to delineate an optimal threshold for LCSH/UPD correlation. Copyright © 2011 Wiley-Liss, Inc.
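The screening rule described in this abstract (report possible UPD when a long contiguous stretch of homozygosity at or above 13.5 Mb occurs on a single chromosome) can be sketched as follows. This is a simplified illustration, not the authors' software: the genotype coding ("AA", "AB", "BB", "NC"), the decision to let no-calls neither extend nor break a run, and all function names are assumptions.

```python
LCSH_THRESHOLD_BP = 13_500_000  # 13.5 Mb, the prospective threshold in the abstract


def longest_homozygous_stretch(positions, genotypes):
    """Return (start_bp, end_bp) of the longest run of homozygous calls.

    `positions` are sorted base-pair coordinates on one chromosome.
    Heterozygous calls ("AB") end a run; no-calls ("NC") are skipped
    (an assumption). Returns None if there are no homozygous calls.
    """
    best = None
    run_start = None
    for pos, gt in zip(positions, genotypes):
        if gt in ("AA", "BB"):
            if run_start is None:
                run_start = pos
            if best is None or pos - run_start > best[1] - best[0]:
                best = (run_start, pos)
        elif gt == "AB":
            run_start = None
    return best


def flag_possible_upd(calls_by_chrom, threshold_bp=LCSH_THRESHOLD_BP):
    """Return the chromosome name if exactly one chromosome has an LCSH
    at or above the threshold; otherwise return None."""
    flagged = []
    for chrom, (positions, genotypes) in calls_by_chrom.items():
        stretch = longest_homozygous_stretch(positions, genotypes)
        if stretch is not None and stretch[1] - stretch[0] >= threshold_bp:
            flagged.append(chrom)
    return flagged[0] if len(flagged) == 1 else None
```

Requiring exactly one flagged chromosome reflects the abstract's "found only in a single chromosome" condition; long homozygous stretches on several chromosomes would more plausibly point to parental relatedness than to UPD.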
Mason, Jane A; Aung, Hnin T; Nandini, Adayapalam; Woods, Rickie G; Fairbairn, David J; Rowell, John A; Young, David; Susman, Rachel D; Brown, Simon A; Hyland, Valentine J; Robertson, Jeremy D
2018-05-01
We report a kindred referred for molecular investigation of severe hemophilia A in a young female in which extremely skewed X-inactivation was observed in both the proband and her clinically normal mother. Bidirectional Sanger sequencing of all F8 gene coding regions and exon/intron boundaries was undertaken. Methylation-sensitive restriction enzymes were utilized to investigate skewed X-inactivation using both a classical human androgen receptor (HUMARA) assay and a novel method targeting differential methylation patterns in multiple informative X-chromosome SNPs. Illumina Whole-Genome Infinium microarray analysis was performed in the case-parent trio (proband and both parents) and the proband's maternal grandmother. The proband was a cytogenetically normal female with severe hemophilia A resulting from a heterozygous F8 pathogenic variant inherited from her similarly affected father. No F8 mutation was identified in the proband's mother; however, both the proband and her mother demonstrated completely skewed X-chromosome inactivation (100%) in association with a previously unreported 2.3 Mb deletion at Xp22.2. At least three disease-associated genes (FANCB, AP1S2, and PIGA) were contained within the deleted region. We hypothesize that true "extreme" skewing of X-inactivation (≥95%) is a rare occurrence, but when defined correctly there is a high probability of finding an X-chromosome disease-causing variant or larger deletion resulting in X-inactivation through a survival disadvantage or cell lethal mechanism. We postulate that the 2.3 Mb Xp22.2 deletion identified in our kindred arose de novo in the proband's mother (on the grandfather's homolog), and produced extreme skewing of X-inactivation via a "cell lethal" mechanism. We introduce a novel multitarget approach for X-inactivation analysis using multiple informative differentially methylated SNPs, as an alternative to the classical single locus (HUMARA) method. 
We propose that for females with unexplained severe phenotypic expression of an X-linked recessive disorder trio-SNP microarray should be undertaken in combination with X-inactivation analysis. © 2018 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.
Xu, Huihui; Xiao, Bing; Ji, Xing; Hu, Qin; Chen, Yingwei; Qiu, Wenjuan
2014-07-01
Tetrasomy for the distal chromosome 15q is rare, and only 22 patients (including 6 cases without detailed information) have been described to date in the literature. Here we report on another patient with nonmosaic tetrasomy 15q25.2-qter resulting from an inverted duplication of distal chromosome 15. This patient presents with features of developmental delay, arachnodactyly, joint contractures and typical facial dysmorphism including frontal bossing, short palpebral fissures, long philtrum, low-set ears, high-arched palate and retrognathia. Unlike in most of the related patients, abdominal ultrasound and brain MRI were normal. Karyotyping analysis revealed a supernumerary marker chromosome present in all metaphase cells examined. Parental karyotyping analysis was normal, indicating a de novo chromosome aberration in the patient. SNP microarray analysis found a two-copy gain of 17.7 Mb from the distal long arm of chromosome 15 (15q25.2-qter). Further FISH analysis using the SureFISH 15q26.3 IGF1R probe confirmed an inverted duplication of the distal long arm of chromosome 15. The segmental duplications that lie in the hotspots of 15q24-26 might increase the susceptibility to chromosome rearrangement. Compared with the George-Abraham study [2012], ADAMTSL3 might be more related to the cardiac disorders in tetrasomy 15q patients. Considering all patients reported in the literature, different mosaic degrees and segmental sizes do not correlate with the severity of phenotypes. A clearer delineation of tetrasomy for distal chromosome 15q remains to be investigated. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Consolandi, Clarissa
2009-01-01
One major goal of genetic research is to understand the role of genetic variation in living systems. In humans, by far the most common type of such variation involves differences in single DNA nucleotides, and is thus termed single nucleotide polymorphism (SNP). The need for improvement in the throughput and reliability of traditional techniques makes it necessary to develop new technologies. Thus the past few years have witnessed an extraordinary surge of interest in DNA microarray technology. This new technology offers the first great hope for providing a systematic way to explore the genome. It permits a very rapid analysis of thousands of genes for the purpose of gene discovery, sequencing, mapping, expression, and polymorphism detection. We generated a series of analytical tools to address the manufacturing, detection and data analysis components of a microarray experiment. In particular, we set up a universal array approach in combination with a PCR-LDR (polymerase chain reaction-ligation detection reaction) strategy for allele identification in the HLA gene.
Helm, Benjamin M; Langley, Katherine; Spangler, Brooke; Vergano, Samantha
2014-08-01
Single nucleotide polymorphism microarrays have the ability to reveal parental consanguinity which may or may not be known to healthcare providers. Consanguinity can have significant implications for the health of patients and for individual and family psychosocial well-being. These results often present ethical and legal dilemmas that can have important ramifications. Unexpected consanguinity can be confounding to healthcare professionals who may be unprepared to handle these results or to communicate them to families or other appropriate representatives. There are few published accounts of experiences with consanguinity and SNP arrays. In this paper we discuss three cases where molecular evidence of parental incest was identified by SNP microarray. We hope to further highlight consanguinity as a potential incidental finding, how the cases were handled by the clinical team, and what resources were found to be most helpful. This paper aims to contribute further to professional discourse on incidental findings with genomic technology and how they were addressed clinically. These experiences may provide some guidance on how others can prepare for these findings and help improve practice. As genetic and genomic testing is utilized more by non-genetics providers, we also hope to inform about the importance of engaging with geneticists and genetic counselors when addressing these findings.
Constitutional Chromoanagenesis of Distal 13q in a Young Adult with Recurrent Strokes.
Burnside, Rachel D; Harris, April; Speyer, Darrow; Burgin, W Scott; Rose, David Z; Sanchez-Valle, Amarilis
2016-01-01
Constitutional chromoanagenesis events, which include chromoanasynthesis and chromothripsis and result in highly complex rearrangements, have been reported for only a few individuals. While rare, these phenomena have likely been underestimated in a constitutional setting as technologies that can accurately detect such complexity are relatively new to the mature field of clinical cytogenetics. G-banding is not likely to accurately identify chromoanasynthesis or chromothripsis, since the banding patterns of chromosomes are likely to be misidentified or oversimplified due to a much lower resolution. We describe a patient who was initially referred for cytogenetic testing as a child for speech delay. As a young adult, he was referred again for recurrent strokes. Chromosome analysis was performed, and the rearrangement resembled a simple duplication of 13q32q34. However, SNP microarray analysis showed a complex pattern of copy number gains and a loss consistent with chromoanasynthesis involving distal 13q (13q32.1q34). This report emphasizes the value of performing microarray analysis for individuals with abnormal or complex chromosome rearrangements. © 2016 S. Karger AG, Basel.
Fagerholm, Rainer; Schmidt, Marjanka K; Khan, Sofia; Rafiq, Sajjad; Tapper, William; Aittomäki, Kristiina; Greco, Dario; Heikkinen, Tuomas; Muranen, Taru A; Fasching, Peter A; Janni, Wolfgang; Weinshilboum, Richard; Loehberg, Christian R; Hopper, John L; Southey, Melissa C; Keeman, Renske; Lindblom, Annika; Margolin, Sara; Mannermaa, Arto; Kataja, Vesa; Chenevix-Trench, Georgia; Lambrechts, Diether; Wildiers, Hans; Chang-Claude, Jenny; Seibold, Petra; Couch, Fergus J; Olson, Janet E; Andrulis, Irene L; Knight, Julia A; García-Closas, Montserrat; Figueroa, Jonine; Hooning, Maartje J; Jager, Agnes; Shah, Mitul; Perkins, Barbara J; Luben, Robert; Hamann, Ute; Kabisch, Maria; Czene, Kamila; Hall, Per; Easton, Douglas F; Pharoah, Paul D P; Liu, Jianjun; Eccles, Diana; Blomqvist, Carl; Nevanlinna, Heli
2015-04-10
We have utilized a two-stage study design to search for SNPs associated with the survival of breast cancer patients treated with adjuvant chemotherapy. Our initial GWS data set consisted of 805 Finnish breast cancer cases (360 treated with adjuvant chemotherapy). The top 39 SNPs from this stage were analyzed in three independent data sets: iCOGS (n=6720 chemotherapy-treated cases), SUCCESS-A (n=3596), and POSH (n=518). Two SNPs were successfully validated: rs6500843 (any chemotherapy; per-allele HR 1.16, 95% C.I. 1.08-1.26, p=0.0001, p(adjusted)=0.0091), and rs11155012 (anthracycline therapy; per-allele HR 1.21, 95% C.I. 1.08-1.35, p=0.0010, p(adjusted)=0.0270). The SNP rs6500843 was found to specifically interact with adjuvant chemotherapy, independently of standard prognostic markers (p(interaction)=0.0009), with the rs6500843-GG genotype corresponding to the highest hazard among chemotherapy-treated cases (HR 1.47, 95% C.I. 1.20-1.80). Upon trans-eQTL analysis of public microarray data, the rs6500843 locus was found to associate with the expression of a group of genes involved in cell cycle control, notably AURKA, the expression of which also exhibited differential prognostic value between chemotherapy-treated and untreated cases in our analysis of microarray data. Based on previously published information, we propose that the eQTL genes may be connected to the rs6500843 locus via a RBFOX1-FOXM1 -mediated regulatory pathway.
KinSNP software for homozygosity mapping of disease genes using SNP microarrays
2010-01-01
Consanguineous families affected with a recessive genetic disease caused by homozygotisation of a mutation offer a unique advantage for positional cloning of rare diseases. Homozygosity mapping of patient genotypes is a powerful technique for the identification of the genomic locus harbouring the causing mutation. This strategy relies on the observation that in these patients a large region spanning the disease locus is also homozygous with high probability. The high marker density in single nucleotide polymorphism (SNP) arrays is extremely advantageous for homozygosity mapping. We present KinSNP, a user-friendly software tool for homozygosity mapping using SNP arrays. The software searches for stretches of SNPs which are homozygous to the same allele in all ascertained sick individuals. User-specified parameters control the number of allowed genotyping 'errors' within homozygous blocks. Candidate disease regions are then reported in a detailed, coloured Excel file, along with genotypes of family members and healthy controls. An interactive genome browser has been included which shows homozygous blocks, individual genotypes, genes and further annotations along the chromosomes, with zooming and scrolling capabilities. The software has been used to identify the location of a mutated gene causing insensitivity to pain in a large Bedouin family. KinSNP is freely available from http://bioinfo.bgu.ac.il/bsu/software/kinSNP. PMID:20846928
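The search described in the abstract above (stretches of SNPs homozygous for the same allele in all affected individuals, with a user-set tolerance for genotyping errors) can be illustrated with a minimal Python sketch. This is not KinSNP's code: the function names, the "AA"/"AB"/"BB" genotype encoding, the greedy left-to-right scan and the parameters `min_snps` and `max_errors` are all assumptions made for illustration.

```python
def consistent(column):
    """True if every affected individual has the same homozygous genotype."""
    first = column[0]
    return first in ("AA", "BB") and all(g == first for g in column)


def shared_homozygous_runs(affected, min_snps=3, max_errors=0):
    """Return (start, end) SNP index pairs of shared homozygous stretches.

    affected: one genotype list per affected individual, all in the same
    SNP order (hypothetical encoding: "AA", "AB", "BB").
    A SNP that is not consistently homozygous counts as one 'error';
    a run is closed once more than max_errors errors have been seen.
    """
    n_snps = len(affected[0])
    flags = [consistent([ind[i] for ind in affected]) for i in range(n_snps)]
    runs = []
    i = 0
    while i < n_snps:
        if not flags[i]:
            i += 1
            continue
        end, errors, j = i, 0, i + 1
        while j < n_snps:
            if flags[j]:
                end = j  # runs always end on a consistent SNP
            else:
                errors += 1
                if errors > max_errors:
                    break
            j += 1
        # Only consistent SNPs count towards the minimum run length.
        if sum(flags[i:end + 1]) >= min_snps:
            runs.append((i, end))
        i = end + 1
    return runs


if __name__ == "__main__":
    patients = [
        ["AA", "BB", "AA", "AB", "BB", "AA", "AA"],
        ["AA", "BB", "AA", "AA", "BB", "AA", "AB"],
    ]
    print(shared_homozygous_runs(patients, max_errors=0))
    print(shared_homozygous_runs(patients, max_errors=1))
```

Allowing one error merges the two stretches on either side of the discordant fourth SNP, which is the effect the abstract attributes to the user-specified error tolerance. A real tool would also work in physical or genetic distance rather than SNP counts.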
Shen, Wei; Paxton, Christian N; Szankasi, Philippe; Longhurst, Maria; Schumacher, Jonathan A; Frizzell, Kimberly A; Sorrells, Shelly M; Clayton, Adam L; Jattani, Rakhi P; Patel, Jay L; Toydemir, Reha; Kelley, Todd W; Xu, Xinjie
2018-04-01
Genetic abnormalities, including copy number variants (CNV), copy number neutral loss of heterozygosity (CN-LOH) and gene mutations, underlie the pathogenesis of myeloid malignancies and serve as important diagnostic, prognostic and/or therapeutic markers. Currently, multiple testing strategies are required for comprehensive genetic testing in myeloid malignancies. The aim of this proof-of-principle study was to investigate the feasibility of combining detection of genome-wide large CNVs, CN-LOH and targeted gene mutations into a single assay using next-generation sequencing (NGS). For genome-wide CNV detection, we designed a single nucleotide polymorphism (SNP) sequencing backbone with 22 762 SNP regions evenly distributed across the entire genome. For targeted mutation detection, 62 frequently mutated genes in myeloid malignancies were targeted. We combined this SNP sequencing backbone with a targeted mutation panel, and sequenced 9 healthy individuals and 16 patients with myeloid malignancies using NGS. We detected 52 somatic CNVs, 11 instances of CN-LOH and 39 oncogenic mutations in the 16 patients with myeloid malignancies, and none in the 9 healthy individuals. All CNVs and CN-LOH were confirmed by SNP microarray analysis. We describe a genome-wide SNP sequencing backbone which allows for sensitive detection of genome-wide CNVs and CN-LOH using NGS. This proof-of-principle study has demonstrated that this strategy can provide more comprehensive genetic profiling for patients with myeloid malignancies using a single assay. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Hoffmann, Thomas J; Zhan, Yiping; Kvale, Mark N; Hesselson, Stephanie E; Gollub, Jeremy; Iribarren, Carlos; Lu, Yontao; Mei, Gangwu; Purdy, Matthew M; Quesenberry, Charles; Rowell, Sarah; Shapero, Michael H; Smethurst, David; Somkin, Carol P; Van den Eeden, Stephen K; Walter, Larry; Webster, Teresa; Whitmer, Rachel A; Finn, Andrea; Schaefer, Catherine; Kwok, Pui-Yan; Risch, Neil
2011-12-01
Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies. Copyright © 2011 Elsevier Inc. All rights reserved.
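As a rough illustration of the greedy pairwise selection mentioned above, the following Python sketch uses the generic set-cover form of tag-SNP selection: repeatedly pick the SNP that covers the most still-uncovered targets at a pairwise r² threshold. The abstract does not give the authors' exact procedure, so the 0.8 threshold, the dictionary layout, the `imputed` argument (standing in for targets already covered by imputation) and the SNP names are assumptions.

```python
def greedy_tag_snps(r2, threshold=0.8, imputed=frozenset()):
    """Greedy pairwise tag-SNP selection (illustrative sketch only).

    r2: dict mapping each SNP to a dict of pairwise r^2 values with others.
    imputed: targets assumed to be covered by imputation already; they are
    removed from the target set before the greedy rounds start.
    """
    snps = set(r2)
    # Each SNP covers itself plus every SNP it tags at the threshold.
    covers = {
        s: {t for t in snps if t == s or r2[s].get(t, 0.0) >= threshold}
        for s in snps
    }
    uncovered = snps - set(imputed)
    tags = []
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(snps), key=lambda s: len(covers[s] & uncovered))
        tags.append(best)
        uncovered -= covers[best]
    return tags


if __name__ == "__main__":
    r2 = {
        "snpA": {"snpB": 0.90, "snpC": 0.85},
        "snpB": {"snpA": 0.90, "snpC": 0.50},
        "snpC": {"snpA": 0.85, "snpB": 0.50},
        "snpD": {},
    }
    print(greedy_tag_snps(r2))
    print(greedy_tag_snps(r2, imputed={"snpD"}))
```

Removing imputation-covered targets before tagging shrinks the set the array has to capture directly, which is the intuition behind the hybrid method the abstract describes.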
NASA Astrophysics Data System (ADS)
Liu, Hongna; Li, Song; Wang, Zhifei; Li, Zhiyang; Deng, Yan; Wang, Hua; Shi, Zhiyang; He, Nongyue
2008-11-01
Single nucleotide polymorphisms (SNPs) comprise the most abundant source of genetic variation in the human genome. Large-scale identification of codominant SNPs, especially those associated with complex diseases, has therefore created the need for a fully high-throughput and automated SNP genotyping method. Herein, we present an automated SNP detection system based on two kinds of functional magnetic nanoparticles (MNPs) and dual-color hybridization. The amido-modified MNPs (NH2-MNPs), modified with APTES, were used to extract DNA directly from whole blood by electrostatic interaction, and subsequent PCR was performed successfully. The biotinylated PCR products were then captured on streptavidin-coated MNPs (SA-MNPs) and interrogated by hybridization with a pair of dual-color probes to determine the SNP; the genotype of each sample can then be identified simultaneously by scanning the microarray printed with the denatured fluorescent probes. This system provides a rapid, sensitive and highly versatile automated procedure that will greatly facilitate the analysis of different known SNPs in the human genome.
2011-01-01
Background During gene conversion, genetic information is transferred unidirectionally between highly homologous but non-allelic regions of DNA. While germ-line gene conversion has been implicated in the pathogenesis of some diseases, somatic gene conversion has remained technically difficult to investigate on a large scale. Methods A novel analysis technique is proposed for detecting the signature of somatic gene conversion from SNP microarray data. The Wellcome Trust Case Control Consortium has gathered SNP microarray data for two control populations and cohorts for bipolar disorder (BD), cardiovascular disease (CAD), Crohn's disease (CD), hypertension (HT), rheumatoid arthritis (RA), type-1 diabetes (T1D) and type-2 diabetes (T2D). Using the new analysis technique, the seven disease cohorts are analyzed to identify cohort-specific SNPs at which conversion is predicted. The quality of the predictions is assessed by identifying known disease associations for genes in the homologous duplicons, and comparing the frequency of such associations with background rates. Results Of 28 disease/locus pairs meeting stringent conditions, 22 show various degrees of disease association, compared with only 8 of 70 in a mock study designed to measure the background association rate (P < 10^-9). Additional candidate genes are identified using less stringent filtering conditions. In some cases, somatic deletions appear likely. RA has a distinctive pattern of events relative to other diseases. Similarities in patterns are apparent between BD and HT. Conclusions The associations derived represent the first evidence that somatic gene conversion could be a significant causative factor in each of the seven diseases. The specific genes provide potential insights about disease mechanisms, and are strong candidates for further study. Please see Commentary: http://www.biomedcentral.com/1741-7015/9/13/abstract. PMID:21291537
Ross, Kenneth Andrew
2011-02-03
During gene conversion, genetic information is transferred unidirectionally between highly homologous but non-allelic regions of DNA. While germ-line gene conversion has been implicated in the pathogenesis of some diseases, somatic gene conversion has remained technically difficult to investigate on a large scale. A novel analysis technique is proposed for detecting the signature of somatic gene conversion from SNP microarray data. The Wellcome Trust Case Control Consortium has gathered SNP microarray data for two control populations and cohorts for bipolar disorder (BD), cardiovascular disease (CAD), Crohn's disease (CD), hypertension (HT), rheumatoid arthritis (RA), type-1 diabetes (T1D) and type-2 diabetes (T2D). Using the new analysis technique, the seven disease cohorts are analyzed to identify cohort-specific SNPs at which conversion is predicted. The quality of the predictions is assessed by identifying known disease associations for genes in the homologous duplicons, and comparing the frequency of such associations with background rates. Of 28 disease/locus pairs meeting stringent conditions, 22 show various degrees of disease association, compared with only 8 of 70 in a mock study designed to measure the background association rate (P < 10^-9). Additional candidate genes are identified using less stringent filtering conditions. In some cases, somatic deletions appear likely. RA has a distinctive pattern of events relative to other diseases. Similarities in patterns are apparent between BD and HT. The associations derived represent the first evidence that somatic gene conversion could be a significant causative factor in each of the seven diseases. The specific genes provide potential insights about disease mechanisms, and are strong candidates for further study.
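The comparison of 22 of 28 associated pairs against 8 of 70 in the mock study can be checked for plausibility with a 2x2 contingency test. The abstract does not say which test produced the reported P value, so the one-sided Fisher exact test below is an assumption chosen only because it needs nothing beyond the four counts; the function name is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """Upper-tail Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of seeing at least `a`
    successes in the first row, given the table margins.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of successes
    n = a + b + c + d     # grand total
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return tail / comb(n, row1)


if __name__ == "__main__":
    # Rows: stringent disease/locus pairs, mock-study pairs.
    # Columns: with a known disease association, without.
    p = fisher_one_sided(22, 6, 8, 62)
    print(f"one-sided Fisher exact P = {p:.3g}")
```

The tail probability is very small, in line with a strong enrichment over the mock background, but it should not be read as a reproduction of the published figure.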
Daca-Roszak, P; Pfeifer, A; Żebracka-Gala, J; Jarząb, B; Witt, M; Ziętkiewicz, E
2016-01-01
Assays that allow analysis of the biogeographic origin of biological samples in a standard forensic laboratory have to target a small number of highly differentiating markers. Such markers should be easy to multiplex and the assay must perform well in the degraded and scarce biological material. SNPs localized in the genome regions, which in the past were subjected to differential selective pressure in various populations, are the most widely used markers in the studies of biogeographic affiliation. SNPs reflecting biogeographic differences not related to any phenotypic traits are not sufficiently explored. The goal of our study was to identify a small set of SNPs not related to any known pigmentation/phenotype-specific genes, which would allow efficient discrimination between populations of Europe and East Asia. The selection of SNPs was based on the comparative analysis of representative European and Chinese/Japanese samples (B-lymphocyte cell lines), genotyped using the Infinium HumanOmniExpressExome microarray (Illumina). The classifier, consisting of 24 unlinked SNPs (24-SNP classifier), was selected. The performance of a 14-SNP subset of this classifier (14-SNP subclassifier) was tested using genotype data from several populations. The 14-SNP subclassifier differentiated East Asians, Europeans and Africans with ∼100% accuracy; Palestinians, representative of the Middle East, clustered with Europeans, while Amerindians and Pakistani were placed between East Asian and European populations. Based on these results, we have developed a SNaPshot assay (EurEAs_Gplex) for genotyping SNPs from the 14-SNP subclassifier, combined with an additional marker for gender identification. Forensic utility of the EurEAs_Gplex was verified using degraded and low quantity DNA samples. 
The performance of the EurEAs_Gplex was satisfactory when using degraded DNA; tests using low quantity DNA samples revealed a previously not described source of genotyping errors, potentially important for any SNaPshot-based assays. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Keitges, Elisabeth A; Pasion, Romela; Burnside, Rachel D; Mason, Carla; Gonzalez-Ruiz, Antonio; Dunn, Teresa; Masiello, Meredith; Gebbia, Joseph A; Fernandez, Carlos O; Risheg, Hiba
2013-07-01
Microdeletions of 8p23.1 are mediated by low copy repeats and can cause congenital diaphragmatic hernia (CDH) and cardiac defects. Within this region, point mutations of the GATA4 gene have been shown to cause cardiac defects. However, the cause of CDH in these deletions has been difficult to determine due to the paucity of mutations that result in CDH, the lack of smaller deletions to refine the region and the reduced penetrance of CDH in these large deletions. Mice deficient for one copy of the Gata4 gene have been described with CDH and heart defects, suggesting that mutations in Gata4 can cause the phenotype in mice. We report on the SNP microarray analysis of two fetuses with deletions of 8p23.1. The first had CDH and a ventricular septal defect (VSD) on ultrasonography and a family history of a maternal VSD. Microarray analysis detected a 127-kb deletion, inherited from the mother, which included the GATA4 and NEIL2 genes. The second fetus had an incomplete atrioventricular canal defect on ultrasonography. Microarray analysis showed a 315-kb deletion that included seven genes: GATA4, NEIL2, FDFT1, CTSB, DEFB136, DEFB135, and DEFB134. These results suggest that haploinsufficiency of the two genes in common within 8p23.1, GATA4 and NEIL2, can cause CDH and cardiac defects in humans. Copyright © 2013 Wiley Periodicals, Inc.
Genetic Mapping and Exome Sequencing Identify Variants Associated with Five Novel Diseases
Puffenberger, Erik G.; Jinks, Robert N.; Sougnez, Carrie; Cibulskis, Kristian; Willert, Rebecca A.; Achilly, Nathan P.; Cassidy, Ryan P.; Fiorentini, Christopher J.; Heiken, Kory F.; Lawrence, Johnny J.; Mahoney, Molly H.; Miller, Christopher J.; Nair, Devika T.; Politi, Kristin A.; Worcester, Kimberly N.; Setton, Roni A.; DiPiazza, Rosa; Sherman, Eric A.; Eastman, James T.; Francklyn, Christopher; Robey-Bond, Susan; Rider, Nicholas L.; Gabriel, Stacey; Morton, D. Holmes; Strauss, Kevin A.
2012-01-01
The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data. PMID:22279524
KinSNP software for homozygosity mapping of disease genes using SNP microarrays.
Amir, El-Ad David; Bartal, Ofer; Morad, Efrat; Nagar, Tal; Sheynin, Jony; Parvari, Ruti; Chalifa-Caspi, Vered
2010-08-01
Consanguineous families affected with a recessive genetic disease caused by homozygotisation of a mutation offer a unique advantage for positional cloning of rare diseases. Homozygosity mapping of patient genotypes is a powerful technique for the identification of the genomic locus harbouring the causing mutation. This strategy relies on the observation that in these patients a large region spanning the disease locus is also homozygous with high probability. The high marker density in single nucleotide polymorphism (SNP) arrays is extremely advantageous for homozygosity mapping. We present KinSNP, a user-friendly software tool for homozygosity mapping using SNP arrays. The software searches for stretches of SNPs which are homozygous to the same allele in all ascertained sick individuals. User-specified parameters control the number of allowed genotyping 'errors' within homozygous blocks. Candidate disease regions are then reported in a detailed, coloured Excel file, along with genotypes of family members and healthy controls. An interactive genome browser has been included which shows homozygous blocks, individual genotypes, genes and further annotations along the chromosomes, with zooming and scrolling capabilities. The software has been used to identify the location of a mutated gene causing insensitivity to pain in a large Bedouin family. KinSNP is freely available from.
Laios, Eleftheria; Drogari, Euridiki
2006-12-01
Three mutations in the low density lipoprotein receptor (LDLR) gene account for 49% of familial hypercholesterolemia (FH) cases in Greece. We used the microelectronic array technology of the NanoChip Molecular Biology Workstation to develop a multiplex method to analyze these single-nucleotide polymorphisms (SNPs). Primer pairs amplified the region encompassing each SNP. The biotinylated PCR amplicon was electronically addressed to streptavidin-coated microarray sites. Allele-specific fluorescently labeled oligonucleotide reporters were designed and used for detection of wild-type and SNP sequences. Genotypes were compared to PCR-restriction fragment length polymorphism (PCR-RFLP). We developed three monoplex assays (1 SNP/site) and an optimized multiplex assay (3 SNPs/site). We performed 92 Greece II, 100 Genoa, and 98 Afrikaner-2 NanoChip monoplex assays (addressed to duplicate sites and analyzed separately). Of the 580 monoplex genotypings (290 samples), 579 agreed with RFLP. Duplicate sites of one sample were not in agreement with each other. Of the 580 multiplex genotypings, 576 agreed with the monoplex results. Duplicate sites of three samples were not in agreement with each other, indicating a requirement for repetition, upon which the discrepancies were resolved. The multiplex assay detects common LDLR mutations in Greek FH patients and can be extended to accommodate additional mutations.
Haraksingh, Rajini R; Abyzov, Alexej; Urban, Alexander Eckehart
2017-04-24
High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools for genome analysis in the coming years. We systematically compared the genome-wide CNV detection power of all 17 available array designs from the Affymetrix, Agilent, and Illumina platforms by hybridizing the well-characterized genome of 1000 Genomes Project subject NA12878 to all arrays, and performing data analysis using both manufacturer-recommended and platform-independent software. We benchmarked the resulting CNV call sets from each array using a gold standard set of CNVs for this genome derived from 1000 Genomes Project whole genome sequencing data. The arrays tested comprise both SNP and aCGH platforms with varying designs and contain between ~0.5 and ~4.6 million probes. Across the arrays, CNV detection varied widely in number of CNV calls (4-489), CNV size range (~40 bp to ~8 Mbp), and percentage of non-validated CNVs (0-86%). We discovered strikingly strong effects of specific array design principles on performance. For example, some SNP array designs with the largest numbers of probes and extensive exonic coverage produced a considerable number of CNV calls that could not be validated, compared to designs with probe numbers that are sometimes an order of magnitude smaller. This effect was only partially ameliorated using different analysis software and optimizing data analysis parameters. High-resolution microarrays will continue to be used as reliable, cost- and time-efficient tools for CNV analysis. However, different applications tolerate different limitations in CNV detection. Our study quantified how these arrays differ in total number and size range of detected CNVs as well as sensitivity, and determined how each array balances these attributes. 
This analysis will inform appropriate array selection for future CNV studies, and allow better assessment of the CNV-analytical power of both published and ongoing array-based genomics studies. Furthermore, our findings emphasize the importance of concurrent use of multiple analysis algorithms and independent experimental validation in array-based CNV detection studies.
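The "percentage of non-validated CNVs" reported above implies a rule for deciding when an array call matches a gold-standard CNV. The abstract does not state that rule, so the Python sketch below assumes a 50% reciprocal-overlap criterion, a common convention in CNV comparisons; the function names, the tuple layout and the example coordinates are hypothetical.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions between intervals a and b.

    Each interval is (chrom, start, end) with end > start.
    """
    (chrom_a, start_a, end_a), (chrom_b, start_b, end_b) = a, b
    if chrom_a != chrom_b:
        return 0.0
    shared = min(end_a, end_b) - max(start_a, start_b)
    if shared <= 0:
        return 0.0
    return min(shared / (end_a - start_a), shared / (end_b - start_b))


def percent_non_validated(calls, gold, min_overlap=0.5):
    """Percentage of calls with no gold-standard CNV at >= min_overlap."""
    if not calls:
        return 0.0
    missed = sum(
        1
        for call in calls
        if not any(reciprocal_overlap(call, g) >= min_overlap for g in gold)
    )
    return 100.0 * missed / len(calls)


if __name__ == "__main__":
    calls = [("chr1", 100, 200), ("chr1", 1000, 2000), ("chr2", 50, 150)]
    gold = [("chr1", 120, 210), ("chr2", 500, 900)]
    print(f"{percent_non_validated(calls, gold):.1f}% of calls not validated")
```

The same helper run in the other direction (gold-standard CNVs with no matching call) would give a sensitivity estimate. For call sets of realistic size, an interval index would replace the quadratic scan.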
A remark on copy number variation detection methods.
Li, Shuo; Dou, Xialiang; Gao, Ruiqi; Ge, Xinzhou; Qian, Minping; Wan, Lin
2018-01-01
Copy number variations (CNVs) are gains and losses of DNA sequence in a genome. High-throughput platforms such as microarrays and next-generation sequencing (NGS) technologies have been applied to detect genome-wide copy number losses. Although progress has been made with both approaches, the accuracy and consistency of CNV calling from the two platforms remain in dispute. In this study, we perform an in-depth analysis of copy number losses in 254 human DNA samples for which both SNP microarray data and NGS data are publicly available, from the HapMap Project and the 1000 Genomes Project respectively. We show that the copy number losses reported by the HapMap Project and the 1000 Genomes Project have < 30% overlap, even though state-of-the-art calling methods were employed and both projects required these reports to have cross-platform (e.g. PCR, microarray and high-throughput sequencing) experimental support. On the other hand, of the copy number losses found directly from HapMap microarray data by an accurate algorithm, CNVhac, almost all have lower read mapping depth in the NGS data, and 88% are supported by sequences with breakpoints in the NGS data. Our results suggest that microarrays are capable of calling CNVs and that the unnecessary requirement for additional cross-platform support may introduce false negatives. The inconsistency between the CNV reports of the HapMap Project and the 1000 Genomes Project might result from the inadequate information contained in microarray data, inconsistent detection criteria, or the filtering effect of cross-platform support. Statistical tests on the CNVs called by CNVhac show that microarray data can offer reliable CNV reports, and that the majority of CNV candidates can be confirmed by raw sequences. Therefore, the CNV candidates given by a good caller can be highly reliable without cross-platform support, so additional experimental information should be applied when needed rather than as a requirement.
Equalizer reduces SNP bias in Affymetrix microarrays.
Quigley, David
2015-07-30
Gene expression microarrays measure the levels of messenger ribonucleic acid (mRNA) in a sample using probe sequences that hybridize with transcribed regions. These probe sequences are designed using a reference genome for the relevant species. However, most model organisms and all humans have genomes that deviate from their reference. These variations, which include single nucleotide polymorphisms, insertions of additional nucleotides, and nucleotide deletions, can affect the microarray's performance. Genetic experiments comparing individuals bearing different population-associated single nucleotide polymorphisms that intersect microarray probes are therefore subject to systemic bias, as the reduction in binding efficiency due to a technical artifact is confounded with genetic differences between parental strains. This problem has been recognized for some time, and earlier methods of compensation have attempted to identify probes affected by genome variants using statistical models. These methods may require replicate microarray measurement of gene expression in the relevant tissue in inbred parental samples, which are not always available in model organisms and are never available in humans. By using sequence information for the genomes of organisms under investigation, potentially problematic probes can now be identified a priori. However, there is no published software tool that makes it easy to eliminate these probes from an annotation. I present equalizer, a software package that uses genome variant data to modify annotation files for the commonly used Affymetrix IVT and Gene/Exon platforms. These files can be used by any microarray normalization method for subsequent analysis. I demonstrate how use of equalizer on experiments mapping germline influence on gene expression in a genetic cross between two divergent mouse species and in human samples significantly reduces probe hybridization-induced bias, reducing false positive and false negative findings. 
The equalizer package reduces probe hybridization bias from experiments performed on the Affymetrix microarray platform, allowing accurate assessment of germline influence on gene expression.
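The a priori filtering idea described above (dropping probes whose target sequence intersects a known variant before normalization) can be sketched in a few lines of Python. This is not the equalizer package's code or file format: the function name, the tuple layouts and the inclusive 1-based coordinates are assumptions made for illustration.

```python
from bisect import bisect_left


def probes_without_variants(probes, variants):
    """Return IDs of probes that overlap no known variant position.

    probes: list of (probe_id, chrom, start, end), inclusive coordinates.
    variants: list of (chrom, position) for SNPs or indel anchor positions.
    """
    by_chrom = {}
    for chrom, pos in variants:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    kept = []
    for probe_id, chrom, start, end in probes:
        positions = by_chrom.get(chrom, [])
        # First variant at or after the probe start; it overlaps the
        # probe only if it also lies at or before the probe end.
        i = bisect_left(positions, start)
        if i < len(positions) and positions[i] <= end:
            continue
        kept.append(probe_id)
    return kept


if __name__ == "__main__":
    probes = [
        ("p1", "chr1", 100, 124),
        ("p2", "chr1", 200, 224),
        ("p3", "chr2", 100, 124),
    ]
    variants = [("chr1", 110), ("chr1", 225), ("chr2", 99)]
    print(probes_without_variants(probes, variants))
```

In a workflow like the one the abstract describes, the surviving probe IDs would then be written into a modified annotation file so that any normalization method sees only unaffected probes.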
Delaneau, Olivier; Marchini, Jonathan
2014-06-13
A major use of the 1000 Genomes Project (1000 GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000 GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNPs and bi-allelic indels, we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.
Performance evaluation of DNA copy number segmentation methods.
Pierre-Jean, Morgane; Rigaill, Guillem; Neuvial, Pierre
2015-07-01
A number of bioinformatic or biostatistical methods are available for analyzing DNA copy number profiles measured from microarray or sequencing technologies. In the absence of rich enough gold standard data sets, the performance of these methods is generally assessed using unrealistic simulation studies, or based on small real data analyses. To make an objective and reproducible performance assessment, we have designed and implemented a framework to generate realistic DNA copy number profiles of cancer samples with known truth. These profiles are generated by resampling publicly available SNP microarray data from genomic regions with known copy-number state. The original data have been extracted from dilution series of tumor cell lines with matched blood samples at several concentrations. Therefore, the signal-to-noise ratio of the generated profiles can be controlled through the (known) percentage of tumor cells in the sample. This article describes this framework and its application to a comparison study between methods for segmenting DNA copy number profiles from SNP microarrays. This study indicates that no single method is uniformly better than all others. It also helps identify pros and cons of the compared methods as a function of biologically informative parameters, such as the fraction of tumor cells in the sample and the proportion of heterozygous markers. This comparison study may be reproduced using the open source and cross-platform R package jointseg, which implements the proposed data generation and evaluation framework: http://r-forge.r-project.org/R/?group_id=1562. © The Author 2014. Published by Oxford University Press.
2011-01-01
Background Copy number aberrations (CNAs) are an important molecular signature in cancer initiation, development, and progression. However, these aberrations span a wide range of chromosomes, making it hard to distinguish cancer related genes from other genes that are not closely related to cancer but are located in broadly aberrant regions. With the current availability of high-resolution data sets such as single nucleotide polymorphism (SNP) microarrays, it has become an important issue to develop a computational method to detect driving genes related to cancer development located in the focal regions of CNAs. Results In this study, we introduce a novel method referred to as the wavelet-based identification of focal genomic aberrations (WIFA). The use of the wavelet analysis, because it is a multi-resolution approach, makes it possible to effectively identify focal genomic aberrations in broadly aberrant regions. The proposed method integrates multiple cancer samples so that it enables the detection of the consistent aberrations across multiple samples. We then apply this method to glioblastoma multiforme and lung cancer data sets from the SNP microarray platform. Through this process, we confirm the ability to detect previously known cancer related genes from both cancer types with high accuracy. Also, the application of this approach to a lung cancer data set identifies focal amplification regions that contain known oncogenes, though these regions are not reported using a recent CNAs detecting algorithm GISTIC: SMAD7 (chr18q21.1) and FGF10 (chr5p12). Conclusions Our results suggest that WIFA can be used to reveal cancer related genes in various cancer data sets. PMID:21569311
Population-genetic nature of copy number variations in the human genome.
Kato, Mamoru; Kawaguchi, Takahisa; Ishikawa, Shumpei; Umeda, Takayoshi; Nakamichi, Reiichiro; Shapero, Michael H; Jones, Keith W; Nakamura, Yusuke; Aburatani, Hiroyuki; Tsunoda, Tatsuhiko
2010-03-01
Copy number variations (CNVs) are universal genetic variations, and their association with disease has been increasingly recognized. We designed high-density microarrays for CNVs, and detected 3000-4000 CNVs (4-6% of the genomic sequence) per population that included CNVs previously missed because of smaller sizes and residing in segmental duplications. The patterns of CNVs across individuals were surprisingly simple at the kilo-base scale, suggesting the applicability of a simple genetic analysis for these genetic loci. We utilized the probabilistic theory to determine integer copy numbers of CNVs and employed a recently developed phasing tool to estimate the population frequencies of integer copy number alleles and CNV-SNP haplotypes. The results showed a tendency toward a lower frequency of CNV alleles and that most of our CNVs were explained only by zero-, one- and two-copy alleles. Using the estimated population frequencies, we found several CNV regions with exceptionally high population differentiation. Investigation of CNV-SNP linkage disequilibrium (LD) for 500-900 bi- and multi-allelic CNVs per population revealed that previous conflicting reports on bi-allelic LD were unexpectedly consistent and explained by an LD increase correlated with deletion-allele frequencies. Typically, the bi-allelic LD was lower than SNP-SNP LD, whereas the multi-allelic LD was somewhat stronger than the bi-allelic LD. After further investigation of tag SNPs for CNVs, we conclude that the customary tagging strategy for disease association studies can be applicable for common deletion CNVs, but direct interrogation is needed for other types of CNVs.
Mehrian-Shai, Ruty; Yalon, Michal; Moshe, Itai; Barshack, Iris; Nass, Dvorah; Jacob, Jasmine; Dor, Chen; Reichardt, Juergen K V; Constantini, Shlomi; Toren, Amos
2016-01-14
The genetic mechanisms underlying hemangioblastoma development are still largely unknown. We used high-resolution single nucleotide polymorphism microarrays and droplet digital PCR analysis to detect copy number variations (CNVs) in a total of 45 hemangioblastoma tumors. We identified 94 CNVs with a median of 18 CNVs per sample. The most frequently gained regions were on chromosomes 1 (p36.32) and 7 (p11.2). These regions contain the PRDM16 and EGFR genes, respectively. Recurrent losses were located at chromosome 12 (q24.13), which includes the gene PTPN11. Our findings provide the first high-resolution genome-wide view of chromosomal changes in hemangioblastoma and identify 23 candidate genes for hemangioblastoma pathogenesis: EGFR, PRDM16, PTPN11, HOXD11, HOXD13, FLT3, PTCH, FGFR1, FOXP1, GPC3, HOXC13, HOXC11, MKL1, CHEK2, IRF4, GPHN, IKZF1, RB1, HOXA9, and microRNAs such as hsa-mir-196a-2. Furthermore, our data suggest that cell proliferation and angiogenesis promoting pathways may be involved in the molecular pathogenesis of hemangioblastoma.
Fine-Scale Variation and Genetic Determinants of Alternative Splicing across Individuals
Coulombe-Huntington, Jasmin; Lam, Kevin C. L.; Dias, Christel; Majewski, Jacek
2009-01-01
Recently, thanks to the increasing throughput of new technologies, we have begun to explore the full extent of alternative pre-mRNA splicing (AS) in the human transcriptome. This is unveiling a vast layer of complexity in isoform-level expression differences between individuals. We used previously published splicing sensitive microarray data from lymphoblastoid cell lines to conduct an in-depth analysis on splicing efficiency of known and predicted exons. By combining publicly available AS annotation with a novel algorithm designed to search for AS, we show that many real AS events can be detected within the usually unexploited, speculative majority of the array and at significance levels much below standard multiple-testing thresholds, demonstrating that the extent of cis-regulated differential splicing between individuals is potentially far greater than previously reported. Specifically, many genes show subtle but significant genetically controlled differences in splice-site usage. PCR validation shows that 42 out of 58 (72%) candidate gene regions undergo detectable AS, amounting to the largest scale validation of isoform eQTLs to date. Targeted sequencing revealed a likely causative SNP in most validated cases. In all 17 instances where a SNP affected a splice-site region, in silico splice-site strength modeling correctly predicted the direction of the microarray and PCR results. In 13 other cases, we identified likely causative SNPs disrupting predicted splicing enhancers. Using Fst and REHH analysis, we uncovered significant evidence that 2 putative causative SNPs have undergone recent positive selection. We verified the effect of five SNPs using in vivo minigene assays. This study shows that splicing differences between individuals, including quantitative differences in isoform ratios, are frequent in human populations and that causative SNPs can be identified using in silico predictions.
Several cases affected disease-relevant genes and it is likely some of these differences are involved in phenotypic diversity and susceptibility to complex diseases. PMID:20011102
A de novo Mutation in KMT2A (MLL) in monozygotic twins with Wiedemann-Steiner syndrome.
Dunkerton, Sophie; Field, Matthew; Cho, Vicki; Bertram, Edward; Whittle, Belinda; Groves, Alexandra; Goel, Himanshu
2015-09-01
Growth deficiency, psychomotor delay, and facial dysmorphism were originally described in a male patient in 1989 by Wiedemann et al. and later in 2000 by Steiner et al. Wiedemann-Steiner syndrome (WSS) has since been described only a few times in the literature, with the phenotypic spectrum both expanding and becoming more delineated with each patient reported. We report on the clinical and molecular features of monozygotic twins with a de novo mutation in KMT2A. Single nucleotide polymorphism (SNP) microarray was done on both twins and whole-exome sequencing was done using both parents and one of the affected twins. SNP microarray confirmed that they were monozygotic twins. A de novo heterozygous variant (p.Arg1083*) in the KMT2A gene was identified through whole-exome sequencing, confirming the diagnosis of WSS. In this study, we have identified a de novo mutation in KMT2A associated with psychomotor developmental delay, facial dysmorphism, short stature, hypertrichosis cubiti, and small kidneys. This finding in monozygotic twins gives specificity to the WSS. The description of more cases of WSS is needed for further delineation of this condition. Small kidneys with normal function have not been described in this condition in the medical literature before. © 2015 Wiley Periodicals, Inc.
Martin, Maureen V; Rollins, Brandi; Sequeira, P Adolfo; Mesén, Andrea; Byerley, William; Stein, Richard; Moon, Emily A; Akil, Huda; Jones, Edward G; Watson, Stanley J; Barchas, Jack; DeLisi, Lynn E; Myers, Richard M; Schatzberg, Alan; Bunney, William E; Vawter, Marquis P
2009-01-01
Background The purpose of this study was to examine the effects of glucose reduction stress on lymphoblastic cell line (LCL) gene expression in subjects with schizophrenia compared to non-psychotic relatives. Methods LCLs were grown under two glucose conditions to measure the effects of glucose reduction stress on exon expression in subjects with schizophrenia compared to unaffected family member controls. A second aim of this project was to identify cis-regulated transcripts associated with diagnosis. Results There were a total of 122 transcripts with significant diagnosis by probeset interaction effects and 328 transcripts with glucose deprivation by probeset interaction effects after corrections for multiple comparisons. There were 8 transcripts with expression significantly affected by the interaction between diagnosis, glucose deprivation and probeset after correction for multiple comparisons. The overall validation rate by qPCR of 13 diagnosis effect genes identified through microarray was 62%, and all genes tested by qPCR showed concordant up- or down-regulation by qPCR and microarray. We assessed brain gene expression of five genes found to be altered by diagnosis and glucose deprivation in LCLs and found a significant decrease in expression of one gene, glutaminase, in the dorsolateral prefrontal cortex (DLPFC). A 3' UTR SNP with previously identified regulatory effects was found to influence IRF5 expression in both brain and lymphocytes. The relationship between the 3' UTR rs10954213 genotype and IRF5 expression was significant in LCLs (p = 0.0001), DLPFC (p = 0.007), and anterior cingulate cortex (p = 0.002). Conclusion Experimental manipulation of cell lines from subjects with schizophrenia may be a useful approach to explore stress related gene expression alterations in schizophrenia and to identify SNP variants associated with gene expression. PMID:19772658
Automation of complex assays: pharmacogenetics of warfarin dosing.
Wu, Whei-Kuo; Hujsak, Paul G; Kureshy, Fareed
2007-10-01
AutoGenomics, Inc. (Carlsbad, CA, USA) have developed a multiplex microarray assay for genotyping both VKORC1 and CYP2C9 using the INFINITI™ Analyzer. Multiple alleles in each DNA sample are analyzed by polymerase chain reaction amplification, followed by detection primer extension using the INFINITI Analyzer. The INFINITI Analyzer performs single-nucleotide polymorphism (SNP) analysis using universal oligonucleotides immobilized on the biochip. To genotype broader ethnic groups, genomic DNA from whole blood was tested for nine SNPs for VKORC1 and six for CYP2C9 genotypes. Information on all 15 SNPs is needed to determine dosing for populations of diverse ethnic origin. The INFINITI system provides genotyping information for same-day dosing of warfarin.
Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays.
Seiser, Eric L; Innocenti, Federico
2014-01-01
Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illumina genotype microarray data for copy number variant (CNV) discovery, although commonly utilized algorithms freely available to the public employ approaches based upon the use of hidden Markov models (HMMs). QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated. Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods. Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.
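To make the HMM idea concrete, the following is a deliberately simplified Python sketch of Viterbi decoding over copy number states. It is not the model used by QuantiSNP, PennCNV or GenoCN: it assumes only three states instead of six, Gaussian emissions on a single intensity value with a shared standard deviation, and one fixed probability of staying in the same state. All names and numbers are hypothetical.

```python
import math


def viterbi(observations, states, means, sd, stay_prob):
    """Most likely state path for a toy HMM with Gaussian emissions.

    Assumptions (illustrative only): uniform start probabilities, a single
    'stay' probability, and the same emission standard deviation per state.
    """
    n = len(states)
    stay = math.log(stay_prob)
    switch = math.log((1.0 - stay_prob) / (n - 1))

    def log_emit(x, mean):
        # Normalising constants are identical across states and are dropped.
        return -0.5 * ((x - mean) / sd) ** 2

    def step(prev, cur):
        return stay if prev == cur else switch

    score = [math.log(1.0 / n) + log_emit(observations[0], means[s]) for s in range(n)]
    back = []
    for x in observations[1:]:
        new_score, pointers = [], []
        for s in range(n):
            best_prev = max(range(n), key=lambda p: score[p] + step(p, s))
            new_score.append(score[best_prev] + step(best_prev, s) + log_emit(x, means[s]))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    path = [max(range(n), key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return [states[i] for i in path]


# Hypothetical log R ratio values along a chromosome.
lrr = [0.0, 0.05, -0.6, -0.55, -0.65, 0.02, 0.0]
calls = viterbi(lrr, ["loss", "normal", "gain"], [-0.6, 0.0, 0.4], sd=0.15, stay_prob=0.99)
```

With these toy parameters the three consecutive low values are called as a loss, whereas a single low value surrounded by normal ones would not be, because two state switches cost more than one poorly fitting emission. That trade-off between transition and emission probabilities is where the published algorithms differ.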
Zhao, Linlu; Bracken, Michael B.; DeWan, Andrew T.
2013-01-01
Summary A genome-wide association study was undertaken to identify maternal single nucleotide polymorphisms (SNPs) and copy-number variants (CNVs) associated with preeclampsia. Case-control analysis was performed on 1070 Afro-Caribbean (n=21 cases and 1049 controls) and 723 Hispanic (n=62 cases and 661 controls) mothers and 1257 mothers of European ancestry (n=50 cases and 1207 controls) from the Hyperglycemia and Adverse Pregnancy Outcome (HAPO) study. European ancestry subjects were genotyped on Illumina Human610-Quad and Afro-Caribbean and Hispanic subjects were genotyped on Illumina Human1M-Duo BeadChip microarrays. Genome-wide SNP data were analyzed using PLINK. CNVs were called using three detection algorithms (GNOSIS, PennCNV, and QuantiSNP), merged using CNVision, and then screened using stringent criteria. SNP and CNV findings were compared to those of the Study of Pregnancy Hypertension in Iowa (SOPHIA), an independent preeclampsia case-control dataset of Caucasian mothers (n=177 cases and 116 controls). A list of top SNPs was identified for each of the HAPO ethnic groups, but none reached Bonferroni-corrected significance. Novel candidate CNVs showing enrichment among preeclampsia cases were also identified in each of the three ethnic groups. Several variants were suggestively replicated in SOPHIA. The discovered SNPs and copy-number variable regions present interesting candidate genetic variants for preeclampsia that warrant further replication and investigation. PMID:23551011
[C677T-SNP of methylenetetrahydrofolate reductase gene and breast cancer in Mexican women].
Calderón-Garcidueñas, Ana Laura; Cerda-Flores, Ricardo Martín; Castruita-Ávila, Ana Lilia; González-Guerrero, Juan Francisco; Barrera-Saldaña, Hugo Alberto
2017-01-01
Low-penetrance susceptibility genes such as the 5,10-methylenetetrahydrofolate reductase gene (MTHFR) have been considered in the progression of breast cancer (BC). Cancer is a result of genetic, environmental and epigenetic interactions; therefore, these genes should be studied in their environmental context, because the results can vary between populations and even within the same country. The objective was to analyze the allelic and genotypic frequencies of the MTHFR C677T SNP in Mexican Mestizo patients with BC and controls from Northeastern Mexico. 243 patients and 118 healthy women were studied. The analysis of the polymorphism was performed with a DNA microarray. Once the frequency of the polymorphism was obtained, a Hardy-Weinberg equilibrium test was carried out for the genotypes. The chi-squared test was used to compare the distribution of frequencies. The allele frequency in patients was C = 0.5406, T = 0.4594, and in controls C = 0.5678, T = 0.4322. The genotype distribution in BC patients was C/C = 29.9%, C/T = 48.3% and T/T = 21.8%. The distribution in controls was C/C = 31.4%, C/T = 50.8%, T/T = 17.8% (chi-squared = 0.77, p = 0.6801). Northeastern Mexican women in this study showed no association between the MTHFR C677T SNP and the risk of BC. It seems that the contribution of this polymorphism to BC in Mexico varies depending on various factors, both genetic and environmental.
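For readers unfamiliar with the Hardy-Weinberg test mentioned above, a minimal Python sketch follows. It assumes the standard one-degree-of-freedom chi-squared goodness-of-fit form for a biallelic SNP; the function name and the genotype counts are hypothetical and are not the study's data, since the abstract reports percentages rather than counts.

```python
import math


def hardy_weinberg_chi2(n_cc, n_ct, n_tt):
    """Chi-squared test of Hardy-Weinberg equilibrium for a biallelic SNP.

    Returns (frequency of allele C, chi-squared statistic, p-value with 1 df).
    """
    n = n_cc + n_ct + n_tt
    p = (2 * n_cc + n_ct) / (2 * n)  # frequency of allele C
    q = 1.0 - p                      # frequency of allele T
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_cc, n_ct, n_tt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Hypothetical genotype counts (C/C, C/T, T/T), for illustration only.
freq_c, chi2, p_value = hardy_weinberg_chi2(30, 50, 20)
```

A large p-value, as these made-up counts give, means the observed genotypes are compatible with the proportions expected from the allele frequencies. The case-control comparison reported in the abstract is a separate chi-squared test on the genotype distributions of the two groups.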
Evaluation of Parkinson Disease Risk Variants as Expression-QTLs
Latourelle, Jeanne C.; Dumitriu, Alexandra; Hadzi, Tiffany C.; Beach, Thomas G.; Myers, Richard H.
2012-01-01
The recent Parkinson Disease GWAS Consortium meta-analysis and replication study reports association at several previously confirmed risk loci SNCA, MAPT, GAK/DGKQ, and HLA and identified a novel risk locus at RIT2. To further explore functional consequences of these associations, we investigated modification of gene expression in prefrontal cortex brain samples of pathologically confirmed PD cases (N = 26) and controls (N = 24) by 67 associated SNPs in these 5 loci. Association between the eSNPs and expression was evaluated using a 2-degrees of freedom test of both association and difference in association between cases and controls, adjusted for relevant covariates. SNPs at each of the 5 loci were tested for cis-acting effects on all probes within 250 kb of each locus. Trans-effects of the SNPs on the 39,122 probes passing all QC on the microarray were also examined. From the analysis of cis-acting SNP effects, several SNPs in the MAPT region show significant association to multiple nearby probes, including two strongly correlated probes targeting the gene LOC644246 and the duplicated genes LRRC37A and LRRC37A2, and a third uncorrelated probe targeting the gene DCAKD. Significant cis-associations were also observed between SNPs and two probes targeting genes in the HLA region on chromosome 6. Expanding the association study to examine trans effects revealed an additional 23 SNP-probe associations reaching statistical significance (p < 2.8 × 10^-8) including SNPs from the SNCA, MAPT and RIT2 regions. These findings provide additional context for the interpretation of PD associated SNPs identified in recent GWAS as well as potential insight into the mechanisms underlying the observed SNP associations. PMID:23071545
Peter, Harald; Berggrav, Kathrine; Thomas, Peter; Pfeifer, Yvonne; Witte, Wolfgang; Templeton, Kate
2012-01-01
Klebsiella pneumoniae carbapenemases (KPCs) are considered a serious threat to antibiotic therapy, as they confer resistance to carbapenems, which are used to treat extended-spectrum beta-lactamase (ESBL)-producing bacteria. Here, we describe the development and evaluation of a DNA microarray for the detection and genotyping of KPC genes (blaKPC) within a 5-h period. To test the whole assay procedure (DNA extraction plus a DNA microarray assay) directly from clinical specimens, we compared two commercial DNA extraction kits (the QIAprep Spin miniprep kit [Qiagen] and the urine bacterial DNA isolation kit [Norgen]) for the direct DNA extraction from urine samples (dilution series spiked in human urine). Reliable single nucleotide polymorphism (SNP) typing was demonstrated using 1 × 10^5 CFU/ml urine for Escherichia coli (Qiagen and Norgen) and 80 CFU/ml urine, on average, for K. pneumoniae (Norgen). This study presents, for the first time, the combination of a new KPC microarray with commercial sample preparation for detecting and genotyping microbial pathogens directly from clinical specimens; this paves the way toward tests providing epidemiological and diagnostic data, enabling better antimicrobial stewardship. PMID:23035190
Su, Lining; Wang, Chunjie; Zheng, Chenqing; Wei, Huiping; Song, Xiaoqing
2018-04-13
Parkinson's disease (PD) is a long-term degenerative disease that is caused by environmental and genetic factors. The networks of genes and their regulators that control the progression and development of PD require further elucidation. We examined common differentially expressed genes (DEGs) from several PD blood and substantia nigra (SN) microarray datasets by meta-analysis. We then screened the PD-specific genes from the common DEGs using GCBI. Next, we used a series of bioinformatics tools to analyze the miRNAs, lncRNAs and SNPs associated with the common PD-specific genes, and then identified the mTF-miRNA-gene-gTF network. Our results identified 36 common DEGs in PD blood studies and 17 common DEGs in PD SN studies, and five of the genes were previously known to be associated with PD. Further study of the regulatory miRNAs associated with the common PD-specific genes revealed 14 PD-specific miRNAs. Analysis of the mTF-miRNA-gene-gTF network for the PD-specific genes revealed two feed-forward loops: one involving the SPRK2 gene, hsa-miR-19a-3p and SPI1, and the second involving the SPRK2 gene, hsa-miR-17-3p and SPI1. The long non-coding RNA (lncRNA)-mediated regulatory network identified lncRNAs associated with PD-specific genes and PD-specific miRNAs. Moreover, single nucleotide polymorphism (SNP) analysis of the PD-specific genes identified two significant SNPs, and SNP analysis of the neurodegenerative disease-specific genes identified seven significant SNPs. Most of these SNPs are present in the 3'-untranslated region of genes and are controlled by several miRNAs. Our study identified a total of 53 common DEGs in PD patients compared with healthy controls in blood and brain datasets, and five of these genes were previously linked with PD. Regulatory network analysis identified PD-specific miRNAs, associated long non-coding RNAs and feed-forward loops, which contribute to our understanding of the mechanisms underlying PD.
The SNPs identified in our study can determine whether a genetic variant is associated with PD. Overall, these findings will help guide our study of the complex molecular mechanism of PD.
SEURAT: visual analytics for the integrated analysis of microarray data.
Gribov, Alexander; Sill, Martin; Lück, Sonja; Rücker, Frank; Döhner, Konstanze; Bullinger, Lars; Benner, Axel; Unwin, Antony
2010-06-03
In translational cancer research, gene expression data is collected together with clinical data and genomic data arising from other chip based high throughput technologies. Software tools for the joint analysis of such high dimensional data sets together with clinical data are required. We have developed an open source software tool which provides interactive visualization capability for the integrated analysis of high-dimensional gene expression data together with associated clinical data, array CGH data and SNP array data. The different data types are organized by a comprehensive data manager. Interactive tools are provided for all graphics: heatmaps, dendrograms, barcharts, histograms, eventcharts and a chromosome browser, which displays genetic variations along the genome. All graphics are dynamic and fully linked so that any object selected in a graphic will be highlighted in all other graphics. For exploratory data analysis the software provides unsupervised data analytics like clustering, seriation algorithms and biclustering algorithms. The SEURAT software meets the growing needs of researchers to perform joint analysis of gene expression, genomical and clinical data.
ITALICS: an algorithm for normalization and DNA copy number calling for Affymetrix SNP arrays.
Rigaill, Guillem; Hupé, Philippe; Almeida, Anna; La Rosa, Philippe; Meyniel, Jean-Philippe; Decraene, Charles; Barillot, Emmanuel
2008-03-15
Affymetrix SNP arrays can be used to determine the DNA copy number measurement of 11 000-500 000 SNPs along the genome. Their high density facilitates the precise localization of genomic alterations and makes them a powerful tool for studies of cancers and copy number polymorphism. Like other microarray technologies, they are influenced by non-relevant sources of variation, requiring correction. Moreover, the amplitude of variation induced by non-relevant effects is similar to or greater than the biologically relevant effect (i.e. true copy number), making it difficult to estimate non-relevant effects accurately without including the biologically relevant effect. We addressed this problem by developing ITALICS, a normalization method that estimates both biological and non-relevant effects in an alternate, iterative manner, accurately eliminating irrelevant effects. We compared our normalization method with other existing and available methods, and found that ITALICS outperformed these methods for several in-house datasets and one public dataset. These results were validated biologically by quantitative PCR. The R package ITALICS (ITerative and Alternative normaLIzation and Copy number calling for affymetrix Snp arrays) has been submitted to Bioconductor.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, S; Jaing, C
The goal of this project is to develop forensic genotyping assays for select agent viruses, addressing a significant capability gap for the viral bioforensics and law enforcement community. We used a multipronged approach combining bioinformatics analysis, PCR-enriched samples, microarrays and TaqMan assays to develop high resolution and cost effective genotyping methods for strain level forensic discrimination of viruses. We have leveraged substantial experience and efficiency gained through year 1 on software development, SNP discovery, TaqMan signature design and phylogenetic signature mapping to scale up the development of forensics signatures in year 2. In this report, we have summarized the TaqMan signature development for South American hemorrhagic fever viruses, tick-borne encephalitis viruses and henipaviruses, Old World Arenaviruses, filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus and Japanese encephalitis virus.
Chang, M-T; Cheng, Y-S; Huang, M-C
2013-02-01
In our previous cDNA microarray study, we found that the carbonic anhydrase II (CA2) gene is one of the differentially expressed transcripts in the duck isthmus epithelium during the egg formation period. The aim of this study was to identify the single-nucleotide polymorphisms (SNPs) in the CA2 gene of Tsaiya ducks. The relationship of SNP genotype with egg production and reproduction traits was also investigated. A total of 317 ducks from two lines, a control line with no selection and a selected line, were employed for testing. Three SNPs (C37T, A62G and A65G) in the 3'-untranslated region of the CA2 gene were found. SNP-trait association analysis showed that SNPs C37T and A62G were associated with duck egg weight as well as fertility. The ducks with the CT and AG genotypes had a 1.46 and 1.62 g/egg lower egg weight as compared with ducks with the CC and AA genotypes, respectively (p < 0.05). However, the ducks with the CT and AG genotypes had 5.20% and 4.22% higher fertility than those with the CC and AA genotypes, respectively (p < 0.05). The diplotype constructed from these three SNPs was associated with duck fertility, and the diplotype H1H4 was dominant for duck fertility. These findings might provide the basis for balanced selection and may be used in marker-assisted selection to improve egg weight and fertility simultaneously in the Tsaiya ducks. © 2012 Blackwell Verlag GmbH.
Metaplastic breast carcinomas display genomic and transcriptomic heterogeneity
Weigelt, Britta; Ng, Charlotte KY; Shen, Ronglai; Popova, Tatiana; Schizas, Michail; Natrajan, Rachael; Mariani, Odette; Stern, Marc-Henri; Norton, Larry; Vincent-Salomon, Anne; Reis-Filho, Jorge S
2015-01-01
Metaplastic breast carcinoma is a rare and aggressive histologic type of breast cancer, preferentially displaying a triple-negative phenotype. We sought to define the transcriptomic heterogeneity of metaplastic breast cancers on the basis of current gene expression microarray-based classifiers, and to determine whether these tumors display gene copy number profiles consistent with those of BRCA1-associated breast cancers. Twenty-eight consecutive triple-negative metaplastic breast carcinomas were reviewed, and the metaplastic component present in each frozen specimen was defined (ie, spindle cell, squamous, chondroid metaplasia). RNA and DNA extracted from frozen sections with tumor cell content >60% were subjected to gene expression (Illumina HumanHT-12 v4) and copy number profiling (Affymetrix SNP 6.0), respectively. Using the best practice PAM50/claudin-low microarray-based classifier, all metaplastic breast carcinomas with spindle cell metaplasia were of claudin-low subtype, whereas those with squamous or chondroid metaplasia were preferentially of basal-like subtype. Triple-negative breast cancer subtyping using a dedicated website (http://cbc.mc.vanderbilt.edu/tnbc/) revealed that all metaplastic breast carcinomas with chondroid metaplasia were of mesenchymal-like subtype, spindle cell carcinomas preferentially of unstable or mesenchymal stem-like subtype, and those with squamous metaplasia were of multiple subtypes. None of the cases was classified as immunomodulatory or luminal androgen receptor subtype. Integrative clustering, combining gene expression and gene copy number data, revealed that metaplastic breast carcinomas with spindle cell and chondroid metaplasia were preferentially classified as of integrative clusters 4 and 9, respectively, whereas those with squamous metaplasia were classified into six different clusters. Eight of the 26 metaplastic breast cancers subjected to SNP6 analysis were classified as BRCA1-like. 
The diversity of histologic features of metaplastic breast carcinomas is reflected at the transcriptomic level, and an association between molecular subtypes and histology was observed. BRCA1-like genomic profiles were found only in a subset (31%) of metaplastic breast cancers, and were not associated with a specific molecular or histologic subtype. PMID:25412848
SEURAT: Visual analytics for the integrated analysis of microarray data
2010-01-01
Background In translational cancer research, gene expression data is collected together with clinical data and genomic data arising from other chip based high throughput technologies. Software tools for the joint analysis of such high dimensional data sets together with clinical data are required. Results We have developed an open source software tool which provides interactive visualization capability for the integrated analysis of high-dimensional gene expression data together with associated clinical data, array CGH data and SNP array data. The different data types are organized by a comprehensive data manager. Interactive tools are provided for all graphics: heatmaps, dendrograms, barcharts, histograms, eventcharts and a chromosome browser, which displays genetic variations along the genome. All graphics are dynamic and fully linked so that any object selected in a graphic will be highlighted in all other graphics. For exploratory data analysis the software provides unsupervised data analytics like clustering, seriation algorithms and biclustering algorithms. Conclusions The SEURAT software meets the growing needs of researchers to perform joint analysis of gene expression, genomical and clinical data. PMID:20525257
Serin, Elise A. R.; Snoek, L. B.; Nijveen, Harm; Willems, Leo A. J.; Jiménez-Gómez, Jose M.; Hilhorst, Henk W. M.; Ligterink, Wilco
2017-01-01
High-density genetic maps are essential for high resolution mapping of quantitative traits. Here, we present a new genetic map for an Arabidopsis Bayreuth × Shahdara recombinant inbred line (RIL) population, built on RNA-seq data. RNA-seq analysis on 160 RILs of this population identified 30,049 single-nucleotide polymorphisms (SNPs) covering the whole genome. Based on a 100-kbp window SNP binning method, 1059 bin-markers were identified, physically anchored on the genome. The total length of the RNA-seq genetic map spans 471.70 centimorgans (cM) with an average marker distance of 0.45 cM and a maximum marker distance of 4.81 cM. This high resolution genotyping revealed new recombination breakpoints in the population. To highlight the advantages of such a high-density map, we compared it to two publicly available genetic maps for the same population, comprising 69 PCR-based markers and 497 gene expression markers derived from microarray data, respectively. In this study, we show that SNP markers can effectively be derived from RNA-seq data. The new RNA-seq map closes many existing gaps in marker coverage, saturating the previously available genetic maps. Quantitative trait locus (QTL) analysis for published phenotypes using the available genetic maps showed increased QTL mapping resolution and reduced QTL confidence interval using the RNA-seq map. The new high-density map is a valuable resource that facilitates the identification of candidate genes and map-based cloning approaches. PMID:29259624
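The 100-kbp window binning mentioned in this abstract can be illustrated with a minimal sketch. It only shows the physical grouping of SNP positions into fixed-width windows; how the authors derived a genotype call per bin is not described in the abstract and is not attempted here. The function name `bin_snps` and the example coordinates are hypothetical.

```python
from collections import defaultdict

WINDOW_BP = 100_000  # 100-kbp window, as stated in the abstract


def bin_snps(snps, window_bp=WINDOW_BP):
    """Group (chromosome, position) SNPs into fixed-width physical windows.

    Returns a dict mapping (chromosome, window_index) to the sorted list of
    SNP positions that fall inside that window.
    """
    bins = defaultdict(list)
    for chrom, pos in snps:
        # Positions are treated as 1-based, so window 0 covers 1..window_bp.
        bins[(chrom, (pos - 1) // window_bp)].append(pos)
    return {key: sorted(positions) for key, positions in bins.items()}


# Hypothetical SNP coordinates, for illustration only.
example = [("Chr1", 1_200), ("Chr1", 99_999), ("Chr1", 100_001), ("Chr2", 250_000)]
binned = bin_snps(example)
```

Each key of `binned` would correspond to one candidate bin-marker anchored at a known physical interval.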
Łastowska, M; Viprey, V; Santibanez-Koref, M; Wappler, I; Peters, H; Cullinane, C; Roberts, P; Hall, A G; Tweddle, D A; Pearson, A D J; Lewis, I; Burchill, S A; Jackson, M S
2007-11-22
Identifying genes, whose expression is consistently altered by chromosomal gains or losses, is an important step in defining genes of biological relevance in a wide variety of tumour types. However, additional criteria are needed to discriminate further among the large number of candidate genes identified. This is particularly true for neuroblastoma, where multiple genomic copy number changes of proven prognostic value exist. We have used Affymetrix microarrays and a combination of fluorescent in situ hybridization and single nucleotide polymorphism (SNP) microarrays to establish expression profiles and delineate copy number alterations in 30 primary neuroblastomas. Correlation of microarray data with patient survival and analysis of expression within rodent neuroblastoma cell lines were then used to define further genes likely to be involved in the disease process. Using this approach, we identify >1000 genes within eight recurrent genomic alterations (loss of 1p, 3p, 4p, 10q and 11q, 2p gain, 17q gain, and the MYCN amplicon) whose expression is consistently altered by copy number change. Of these, 84 correlate with patient survival, with the minimal regions of 17q gain and 4p loss being enriched significantly for such genes. These include genes involved in RNA and DNA metabolism, and apoptosis. Orthologues of all but one of these genes on 17q are overexpressed in rodent neuroblastoma cell lines. A significant excess of SNPs whose copy number correlates with survival is also observed on proximal 4p in stage 4 tumours, and we find that deletion of 4p is associated with improved outcome in an extended cohort of tumours. These results define the major impact of genomic copy number alterations upon transcription within neuroblastoma, and highlight genes on distal 17q and proximal 4p for downstream analyses. 
They also suggest that integration of discriminators, such as survival and comparative gene expression, with microarray data may be useful in the identification of critical genes within regions of loss or gain in many human cancers.
Identification and characterization of nuclear genes involved in photosynthesis in Populus
2014-01-01
Background The gap between the real and potential photosynthetic rate under field conditions suggests that photosynthesis could potentially be improved. Nuclear genes provide possible targets for improving photosynthetic efficiency. Hence, genome-wide identification and characterization of the nuclear genes affecting photosynthetic traits in woody plants would provide key insights on genetic regulation of photosynthesis and identify candidate processes for improvement of photosynthesis. Results Using microarray and bulked segregant analysis strategies, we identified differentially expressed nuclear genes for photosynthesis traits in a segregating population of poplar. We identified 515 differentially expressed genes in this population (FC ≥ 2 or FC ≤ 0.5, P < 0.05), 163 up-regulated and 352 down-regulated. Real-time PCR expression analysis confirmed the microarray data. Singular Enrichment Analysis identified 48 significantly enriched GO terms for molecular functions (28), biological processes (18) and cell components (2). Furthermore, we selected six candidate genes for functional examination by a single-marker association approach, which demonstrated that 20 SNPs in five candidate genes significantly associated with photosynthetic traits, and the phenotypic variance explained by each SNP ranged from 2.3% to 12.6%. This revealed that regulation of photosynthesis by the nuclear genome mainly involves transport, metabolism and response to stimulus functions. Conclusions This study provides new genome-scale strategies for the discovery of potential candidate genes affecting photosynthesis in Populus, and for identification of the functions of genes involved in regulation of photosynthesis. This work also suggests that improving photosynthetic efficiency under field conditions will require the consideration of multiple factors, such as stress responses. PMID:24673936
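The selection thresholds quoted in this abstract (FC ≥ 2 or FC ≤ 0.5, P < 0.05) can be written as a small filter. This is a sketch under the assumption that fold change is expressed as a plain ratio rather than a log ratio; the function name `classify_gene` and its return labels are hypothetical and do not come from the study.

```python
def classify_gene(fold_change, p_value, fc_cutoff=2.0, alpha=0.05):
    """Label a gene as 'up', 'down' or 'unchanged'.

    A gene counts as differentially expressed only if it passes both the
    significance threshold (p < alpha) and one of the fold-change cutoffs.
    """
    if p_value >= alpha:
        return "unchanged"
    if fold_change >= fc_cutoff:
        return "up"
    if fold_change <= 1.0 / fc_cutoff:
        return "down"
    return "unchanged"


# Hypothetical (fold change, p-value) pairs, for illustration only.
calls = [classify_gene(fc, p) for fc, p in [(2.5, 0.01), (0.4, 0.03), (3.0, 0.20), (1.3, 0.001)]]
```

With the default cutoffs, the down-regulation boundary is 1/2 = 0.5, matching the symmetric thresholds given in the abstract.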
A molecular scheme for improved characterization of human embryonic stem cell lines
Josephson, Richard; Sykes, Gregory; Liu, Ying; Ording, Carol; Xu, Weining; Zeng, Xianmin; Shin, Soojung; Loring, Jeanne; Maitra, Anirban; Rao, Mahendra S; Auerbach, Jonathan M
2006-01-01
Background Human embryonic stem cells (hESC) offer a renewable source of a wide range of cell types for use in research and cell-based therapies to treat disease. Inspection of protein markers provides important information about the current state of the cells and data for subsequent manipulations. However, hESC must be routinely analyzed at the genomic level to guard against deleterious changes during extensive propagation, expansion, and manipulation in vitro. Results We found that short tandem repeat (STR) analysis, human leukocyte antigen (HLA) typing, single nucleotide polymorphism (SNP) genomic analysis, mitochondrial DNA sequencing, and gene expression analysis by microarray can be used to fully describe any hESC culture in terms of its identity, stability, and undifferentiated state. Conclusion Here we describe, using molecular biology alone, a comprehensive characterization of 17 different hESC lines. The use of amplified nucleic acids means that for the first time full characterization of hESC lines can be performed with little time investment and a minimum of material. The information thus gained will facilitate comparison of lines and replication of results between laboratories. PMID:16919167
Dai, Yilin; Guo, Ling; Li, Meng; Chen, Yi-Bu
2012-06-08
Microarray data analysis presents a significant challenge to researchers who are unable to use the powerful Bioconductor and its numerous tools due to their lack of knowledge of R language. Among the few existing software programs that offer a graphic user interface to Bioconductor packages, none have implemented a comprehensive strategy to address the accuracy and reliability issue of microarray data analysis due to the well known probe design problems associated with many widely used microarray chips. There is also a lack of tools that would expedite the functional analysis of microarray results. We present Microarray Я US, an R-based graphical user interface that implements over a dozen popular Bioconductor packages to offer researchers a streamlined workflow for routine differential microarray expression data analysis without the need to learn R language. In order to enable a more accurate analysis and interpretation of microarray data, we incorporated the latest custom probe re-definition and re-annotation for Affymetrix and Illumina chips. A versatile microarray results output utility tool was also implemented for easy and fast generation of input files for over 20 of the most widely used functional analysis software programs. Coupled with a well-designed user interface, Microarray Я US leverages cutting edge Bioconductor packages for researchers with no knowledge in R language. It also enables a more reliable and accurate microarray data analysis and expedites downstream functional analysis of microarray results.
Rare De Novo Copy Number Variants in Patients with Congenital Pulmonary Atresia
Xie, Li; Chen, Jin-Lan; Zhang, Wei-Zhi; Wang, Shou-Zheng; Zhao, Tian-Li; Huang, Can; Wang, Jian; Yang, Jin-Fu; Yang, Yi-Feng; Tan, Zhi-Ping
2014-01-01
Background Ongoing studies using genomic microarrays and next-generation sequencing have demonstrated that the genetic contributions to cardiovascular diseases have been significantly ignored in the past. The aim of this study was to identify rare copy number variants in individuals with congenital pulmonary atresia (PA). Methods and Results Based on the hypothesis that rare structural variants encompassing key genes play an important role in heart development in PA patients, we performed high-resolution genome-wide microarrays for copy number variations (CNVs) in 82 PA patient-parent trios and 189 controls with an Illumina SNP array platform. CNVs were identified in 17/82 patients (20.7%), and eight of these CNVs (9.8%) are considered potentially pathogenic. Five de novo CNVs occurred at two known congenital heart disease (CHD) loci (16p13.1 and 22q11.2). Two de novo CNVs that may affect folate and vitamin B12 metabolism were identified for the first time. A de novo 1-Mb deletion at 17p13.2 may represent a rare genomic disorder that involves mild intellectual disability and associated facial features. Conclusions Rare CNVs contribute to the pathogenesis of PA (9.8%), suggesting that the causes of PA are heterogeneous and pleiotropic. Together with previous data from animal models, our results might help identify a link between CHD and folate-mediated one-carbon metabolism (FOCM). With the accumulation of high-resolution SNP array data, these previously undescribed rare CNVs may help reveal critical gene(s) in CHD and may provide novel insights about CHD pathogenesis. PMID:24826987
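One way to picture how a trio design separates de novo from inherited CNVs is an interval comparison between the child's calls and the parents' calls. The sketch below is an assumption, not the study's pipeline: it flags a child CNV as a de novo candidate when no parental CNV of the same type overlaps it at all, whereas real workflows usually add overlap fractions and quality filters. The `CNV` type, function names and coordinates are hypothetical.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"


def overlaps(a, b):
    """True if two CNVs of the same type share at least one base."""
    return (a.chrom == b.chrom and a.kind == b.kind
            and a.start <= b.end and b.start <= a.end)


def de_novo_candidates(child, father, mother):
    """Return child CNVs not overlapped by any CNV call in either parent."""
    parental = list(father) + list(mother)
    return [c for c in child if not any(overlaps(c, p) for p in parental)]


# Hypothetical calls, for illustration only.
child = [CNV("chr22", 18_900_000, 21_500_000, "del"), CNV("chr16", 15_000_000, 16_300_000, "dup")]
father = [CNV("chr16", 15_100_000, 16_200_000, "dup")]
mother = []
candidates = de_novo_candidates(child, father, mother)
```

In this toy input the chr16 duplication is treated as inherited from the father, leaving only the chr22 deletion as a candidate.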
Genetic Structures of Copy Number Variants Revealed by Genotyping Single Sperm
Luo, Minjie; Cui, Xiangfeng; Fredman, David; Brookes, Anthony J.; Azaro, Marco A.; Greenawalt, Danielle M.; Hu, Guohong; Wang, Hui-Yun; Tereshchenko, Irina V.; Lin, Yong; Shentu, Yue; Gao, Richeng; Shen, Li; Li, Honghua
2009-01-01
Background Copy number variants (CNVs) occupy a significant portion of the human genome and may have important roles in meiotic recombination, human genome evolution and gene expression. Many genetic diseases may be underlain by CNVs. However, because of the presence of their multiple copies, variability in copy numbers and the diploidy of the human genome, detailed genetic structure of CNVs cannot be readily studied by available techniques. Methodology/Principal Findings Single sperm samples were used as the primary subjects for the study so that CNV haplotypes in the sperm donors could be studied individually. Forty-eight CNVs characterized in a previous study were analyzed using a microarray-based high-throughput genotyping method after multiplex amplification. Seventeen single nucleotide polymorphisms (SNPs) were also included as controls. Two single-base variants, either allelic or paralogous, could be discriminated for all markers. Microarray data were used to resolve SNP alleles and CNV haplotypes, to quantitatively assess the numbers and compositions of the paralogous segments in each CNV haplotype. Conclusions/Significance This is the first study of the genetic structure of CNVs on a large scale. Resulting information may help understand evolution of the human genome, gain insight into many genetic processes, and discriminate between CNVs and SNPs. The highly sensitive high-throughput experimental system with haploid sperm samples as subjects may be used to facilitate detailed large-scale CNV analysis. PMID:19384415
Analysis of high-order SNP barcodes in mitochondrial D-loop for chronic dialysis susceptibility.
Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh; Chang, Hsueh-Wei
2016-10-01
Positively identifying disease-associated single nucleotide polymorphism (SNP) markers in genome-wide studies entails the complex association analysis of a huge number of SNPs. Such large numbers of SNP barcodes (SNP/genotype combinations) continue to pose serious computational challenges, especially for high-dimensional data. We propose a novel method for exploring SNP barcodes based on differential evolution, termed IDE (improved differential evolution). IDE uses a "top combination strategy" to improve the ability of differential evolution to explore high-order SNP barcodes in high-dimensional data. We simulate disease data and use real chronic dialysis data to test four global optimization algorithms. In 48 simulated disease models, we show that IDE outperforms existing global optimization algorithms in terms of exploring ability and power to detect the specific SNP/genotype combinations with a maximum difference between cases and controls. In real data, we show that IDE can be used to evaluate the relative effects of each individual SNP on disease susceptibility. IDE generated significant SNP barcodes with less computational complexity than the other algorithms, making IDE ideally suited for analysis of high-order SNP barcodes.
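The quantity the abstract describes as the "difference between cases and controls" for a SNP/genotype combination can be sketched as a simple scoring function. This is only an illustration of the idea of scoring one barcode; it is not the IDE algorithm, and the exact objective used by the authors is not given in the abstract. Genotype coding (0, 1, 2), the function name and the toy data are assumptions.

```python
def barcode_difference(cases, controls, barcode):
    """Absolute difference in the number of cases and controls carrying a barcode.

    `cases` and `controls` are lists of genotype vectors (one per subject).
    `barcode` maps a SNP index to the genotype code required at that SNP.
    """
    def count(samples):
        return sum(
            all(sample[snp] == genotype for snp, genotype in barcode.items())
            for sample in samples
        )

    return abs(count(cases) - count(controls))


# Hypothetical genotypes coded 0/1/2 at three SNPs, for illustration only.
cases = [[0, 2, 1], [0, 2, 0], [1, 2, 1], [0, 2, 1]]
controls = [[0, 1, 1], [0, 2, 1], [2, 0, 1], [1, 1, 0]]
score = barcode_difference(cases, controls, {0: 0, 1: 2})
```

A search procedure such as differential evolution would then look for the barcode that maximizes a score of this kind over the space of SNP/genotype combinations.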
Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin
2018-01-01
Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139
Wang, Rongyue; Lei, Tingying; Fu, Fang; Li, Ru; Jing, Xiangyi; Yang, Xin; Liu, Juan; Li, Dongzhi; Liao, Can
2018-03-26
Chromosome microarray analysis (CMA) is currently the first-tier diagnostic assay for the evaluation of developmental delay (DD) and intellectual disability (ID) with unknown etiology. Here, we present our clinical experience in implementing whole-genome high-resolution single nucleotide polymorphism (SNP) arrays to investigate 489 patients with unexplained DD/ID in whom standard karyotyping analyses showed normal karyotypes. This study aimed to assess the usefulness of CMA for clinical diagnostic testing in the Chinese population. A total of 489 children were classified into three groups: isolated DD/ID (n = 358), DD/ID with epilepsy (n = 49), and DD/ID with other structural anomalies (n = 82). We identified 126 cases (25.8%, 126/489) of pathogenic copy number variants (CNVs) by CMA, including 89 (24.9%, 89/358) with isolated DD/ID, 13 (26.5%, 13/49) with DD/ID with epilepsy, and 24 (29.3%, 24/82) with DD/ID with other structural anomalies. Among the 126 cases of pathogenic CNVs, 79 cases were identified as microdeletion/microduplication syndromes, among which 76 cases were classified as common syndromes, and 3 cases were classified as rare syndromes, including 15q24 microdeletion syndrome, Xq28 microduplication syndrome and Lowe syndrome. Additionally, there were forty-seven cases of non-syndromic pathogenic CNVs. The ABAT, FTSJ1, DYNC1H1, and SETBP1 genes were identified as DD/ID candidate genes. Our findings suggest the necessity of CMA as a routine diagnostic test for unexplained DD/ID in South China.
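The diagnostic yields reported in this abstract can be re-derived from the counts it gives. The short check below uses only numbers stated in the abstract; the variable and function names are illustrative.

```python
# (pathogenic CNV cases, group size) as reported in the abstract.
cohort = {
    "isolated DD/ID": (89, 358),
    "DD/ID with epilepsy": (13, 49),
    "DD/ID with other structural anomalies": (24, 82),
}


def yield_percent(positive, total):
    """Diagnostic yield as a percentage, rounded to one decimal place."""
    return round(100 * positive / total, 1)


group_yields = {group: yield_percent(pos, n) for group, (pos, n) in cohort.items()}

# The group counts should add up to the overall figures in the abstract.
total_positive = sum(pos for pos, _ in cohort.values())
total_patients = sum(n for _, n in cohort.values())
overall_yield = yield_percent(total_positive, total_patients)
```

The per-group counts sum to 126 of 489 patients, so the subgroup and overall percentages are mutually consistent.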
LD2SNPing: linkage disequilibrium plotter and RFLP enzyme mining for tag SNPs
Chang, Hsueh-Wei; Chuang, Li-Yeh; Chang, Yan-Jhu; Cheng, Yu-Huei; Hung, Yu-Chen; Chen, Hsiang-Chi; Yang, Cheng-Hong
2009-01-01
Background Linkage disequilibrium (LD) mapping is commonly used to evaluate markers for genome-wide association studies. Most types of LD software focus strictly on LD analysis and visualization, but lack supporting services for genotyping. Results We developed a freeware called LD2SNPing, which provides a complete package of mining tools for genotyping and LD analysis environments. The software provides SNP ID- and gene-centric online retrievals for SNP information and tag SNP selection from dbSNP/NCBI and HapMap, respectively. Restriction fragment length polymorphism (RFLP) enzyme information for SNP genotype is available to all SNP IDs and tag SNPs. Single and multiple SNP inputs are possible in order to perform LD analysis by online retrieval from HapMap and NCBI. An LD statistics section provides D, D', r2, δQ, ρ, and the P values of the Hardy-Weinberg Equilibrium for each SNP marker, and Chi-square and likelihood-ratio tests for the pair-wise association of two SNPs in LD calculation. Finally, 2D and 3D plots, as well as plain-text output of the results, can be selected. Conclusion LD2SNPing thus provides a novel visualization environment for multiple SNP input, which facilitates SNP association studies. The software, user manual, and tutorial are freely available at . PMID:19500380
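Three of the LD statistics listed in this abstract (D, D' and r2) follow from standard textbook definitions for two biallelic loci. The sketch below computes them from one haplotype frequency and two allele frequencies; it is not LD2SNPing's code, and the function name and inputs are assumptions.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    """
    d = p_ab - p_a * p_b
    # D' normalizes D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Complete LD: allele A always travels with allele B.
complete = ld_stats(0.5, 0.5, 0.5)
# Linkage equilibrium: haplotype frequency equals the product of allele frequencies.
independent = ld_stats(0.25, 0.5, 0.5)
```

In practice the haplotype frequency has to be estimated from unphased genotypes, which this sketch does not cover.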
Analysis of High-Throughput ELISA Microarray Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, Amanda M.; Daly, Don S.; Zangar, Richard C.
Our research group develops analytical methods and software for the high-throughput analysis of quantitative enzyme-linked immunosorbent assay (ELISA) microarrays. ELISA microarrays differ from DNA microarrays in several fundamental aspects and most algorithms for analysis of DNA microarray data are not applicable to ELISA microarrays. In this review, we provide an overview of the steps involved in ELISA microarray data analysis and how the statistically sound algorithms we have developed provide an integrated software suite to address the needs of each data-processing step. The algorithms discussed are available in a set of open-source software tools (http://www.pnl.gov/statistics/ProMAT).
Zhu, Yuerong; Zhu, Yuelin; Xu, Wei
2008-01-01
Background Though microarray experiments are very popular in life science research, managing and analyzing microarray data are still challenging tasks for many biologists. Most microarray programs require users to have sophisticated knowledge of mathematics, statistics and computer skills for usage. With accumulating microarray data deposited in public databases, easy-to-use programs to re-analyze previously published microarray data are in high demand. Results EzArray is a web-based Affymetrix expression array data management and analysis system for researchers who need to organize microarray data efficiently and get data analyzed instantly. EzArray organizes microarray data into projects that can be analyzed online with predefined or custom procedures. EzArray performs data preprocessing and detection of differentially expressed genes with statistical methods. All analysis procedures are optimized and highly automated so that even novice users with limited pre-knowledge of microarray data analysis can complete initial analysis quickly. Since all input files, analysis parameters, and executed scripts can be downloaded, EzArray provides maximum reproducibility for each analysis. In addition, EzArray integrates with Gene Expression Omnibus (GEO) and allows instantaneous re-analysis of published array data. Conclusion EzArray is a novel Affymetrix expression array data analysis and sharing system. EzArray provides easy-to-use tools for re-analyzing published microarray data and will help both novice and experienced users perform initial analysis of their microarray data from the location of data storage. We believe EzArray will be a useful system for facilities with microarray services and laboratories with multiple members involved in microarray data analysis. EzArray is freely available from . PMID:18218103
HuH-7 reference genome profile: complex karyotype composed of massive loss of heterozygosity.
Kasai, Fumio; Hirayama, Noriko; Ozawa, Midori; Satoh, Motonobu; Kohara, Arihiro
2018-05-17
Human cell lines represent a valuable resource as in vitro experimental models. A hepatoma cell line, HuH-7 (JCRB0403), has been used extensively in various research fields and a number of studies using this line have been published continuously since it was established in 1982. However, an accurate genome profile, which can serve as a reliable reference, has not been available. In this study, we performed M-FISH, SNP microarray and amplicon sequencing to characterize the cell line. Single cell analysis of metaphases revealed a high level of heterogeneity with a mode of 60 chromosomes. Cytogenetic results demonstrated chromosome abnormalities involving every chromosome in addition to a massive loss of heterozygosity, which accounts for 55.3% of the genome, consistent with the homozygous variants seen in the sequence analysis. We provide empirical data that the HuH-7 cell line is composed of highly heterogeneous cell populations, suggesting that besides cell line authentication, the quality of cell lines needs to be taken into consideration in the future use of tumor cell lines.
Chemiluminescence microarrays in analytical chemistry: a critical review.
Seidel, Michael; Niessner, Reinhard
2014-09-01
Multi-analyte immunoassays on microarrays and on multiplex DNA microarrays have been described for quantitative analysis of small organic molecules (e.g., antibiotics, drugs of abuse, small molecule toxins), proteins (e.g., antibodies or protein toxins), and microorganisms, viruses, and eukaryotic cells. In analytical chemistry, multi-analyte detection by use of analytical microarrays has become an innovative research topic because of the possibility of generating several sets of quantitative data for different analyte classes in a short time. Chemiluminescence (CL) microarrays are powerful tools for rapid multiplex analysis of complex matrices. A wide range of applications for CL microarrays is described in the literature dealing with analytical microarrays. The motivation for this review is to summarize the current state of CL-based analytical microarrays. Combining analysis of different compound classes on CL microarrays reduces analysis time, cost of reagents, and use of laboratory space. Applications are discussed, with examples from food safety, water safety, environmental monitoring, diagnostics, forensics, toxicology, and biosecurity. The potential and limitations of research on multiplex analysis by use of CL microarrays are discussed in this review.
Identification of Allelic Imbalance with a Statistical Model for Subtle Genomic Mosaicism
Xia, Rui; Vattathil, Selina; Scheet, Paul
2014-01-01
Genetic heterogeneity in a mixed sample of tumor and normal DNA can confound characterization of the tumor genome. Numerous computational methods have been proposed to detect aberrations in DNA samples from tumor and normal tissue mixtures. Most of these require tumor purities to be at least 10–15%. Here, we present a statistical model to capture information, contained in the individual's germline haplotypes, about expected patterns in the B allele frequencies from SNP microarrays while fully modeling their magnitude, the first such model for SNP microarray data. Our model consists of a pair of hidden Markov models—one for the germline and one for the tumor genome—which, conditional on the observed array data and patterns of population haplotype variation, have a dependence structure induced by the relative imbalance of an individual's inherited haplotypes. Together, these hidden Markov models offer a powerful approach for dealing with mixtures of DNA where the main component represents the germline, thus suggesting natural applications for the characterization of primary clones when stromal contamination is extremely high, and for identifying lesions in rare subclones of a tumor when tumor purity is sufficient to characterize the primary lesions. Our joint model for germline haplotypes and acquired DNA aberration is flexible, allowing a large number of chromosomal alterations, including balanced and imbalanced losses and gains, copy-neutral loss-of-heterozygosity (LOH) and tetraploidy. We found our model (which we term J-LOH) to be superior for localizing rare aberrations in a simulated 3% mixture sample. More generally, our model provides a framework for full integration of the germline and tumor genomes to deal more effectively with missing or uncertain features, and thus extract maximal information from difficult scenarios where existing methods fail. PMID:25166618
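To see why a 3% mixture is a hard case, it helps to work out the expected B allele frequency (BAF) at a germline-heterozygous SNP when only a fraction f of cells carry an aberration. The sketch below is simple allele-copy arithmetic under idealized assumptions (no noise, the tumor retains the B allele); it is not the hidden Markov model described in the abstract, and the function names are illustrative.

```python
def baf_copy_neutral_loh(f):
    """Expected BAF at a heterozygous (AB) site when a fraction f of cells are BB.

    Normal cells contribute one B copy of two; aberrant cells contribute two of two.
    """
    b_copies = (1 - f) * 1 + f * 2
    total_copies = 2
    return b_copies / total_copies


def baf_hemizygous_deletion(f):
    """Expected BAF when a fraction f of cells have lost the A allele (genotype B only)."""
    b_copies = 1  # every cell keeps exactly one B copy
    total_copies = (1 - f) * 2 + f * 1
    return b_copies / total_copies


f = 0.03  # 3% aberrant cell fraction, as in the simulated mixture mentioned above
loh_baf = baf_copy_neutral_loh(f)
deletion_baf = baf_hemizygous_deletion(f)
```

Both values lie within about 0.015 of the 0.5 expected for an unaltered heterozygous site, which is why single-marker thresholds struggle and information has to be pooled along haplotypes.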
Integrative Assessment of Chlorine-Induced Acute Lung Injury in Mice
Pope-Varsalona, Hannah; Concel, Vincent J.; Liu, Pengyuan; Bein, Kiflai; Berndt, Annerose; Martin, Timothy M.; Ganguly, Koustav; Jang, An Soo; Brant, Kelly A.; Dopico, Richard A.; Upadhyay, Swapna; Di, Y. P. Peter; Hu, Zhen; Vuga, Louis J.; Medvedovic, Mario; Kaminski, Naftali; You, Ming; Alexander, Danny C.; McDunn, Jonathan E.; Prows, Daniel R.; Knoell, Daren L.
2012-01-01
The genetic basis for the underlying individual susceptibility to chlorine-induced acute lung injury is unknown. To uncover the genetic basis and pathophysiological processes that could provide additional homeostatic capacities during lung injury, 40 inbred murine strains were exposed to chlorine, and haplotype association mapping was performed. The identified single-nucleotide polymorphism (SNP) associations were evaluated through transcriptomic and metabolomic profiling. Using ≥ 10% allelic frequency and ≥ 10% phenotype explained as threshold criteria, promoter SNPs that could eliminate putative transcriptional factor recognition sites in candidate genes were assessed by determining transcript levels through microarray and reverse real-time PCR during chlorine exposure. The mean survival time varied by approximately 5-fold among strains, and SNP associations were identified for 13 candidate genes on chromosomes 1, 4, 5, 9, and 15. Microarrays revealed several differentially enriched pathways, including protein transport (decreased more in the sensitive C57BLKS/J lung) and protein catabolic process (increased more in the resistant C57BL/10J lung). Lung metabolomic profiling revealed 95 of the 280 metabolites measured were altered by chlorine exposure, and included alanine, which decreased more in the C57BLKS/J than in the C57BL/10J strain, and glutamine, which increased more in the C57BL/10J than in the C57BLKS/J strain. Genetic associations from haplotype mapping were strengthened by an integrated assessment using transcriptomic and metabolomic profiling. The leading candidate genes associated with increased susceptibility to acute lung injury in mice included Klf4, Sema7a, Tns1, Aacs, and a gene that encodes an amino acid carrier, Slc38a4. PMID:22447970
Wong, Gerard; Leckie, Christopher; Kowalczyk, Adam
2012-01-15
Feature selection is a key concept in machine learning for microarray datasets, where features represented by probesets are typically several orders of magnitude larger than the available sample size. Computational tractability is a key challenge for feature selection algorithms in handling very high-dimensional datasets beyond a hundred thousand features, such as in datasets produced on single nucleotide polymorphism microarrays. In this article, we present a novel feature set reduction approach that enables scalable feature selection on datasets with hundreds of thousands of features and beyond. Our approach enables more efficient handling of higher resolution datasets to achieve better disease subtype classification of samples for potentially more accurate diagnosis and prognosis, which allows clinicians to make more informed decisions in regards to patient treatment options. We applied our feature set reduction approach to several publicly available cancer single nucleotide polymorphism (SNP) array datasets and evaluated its performance in terms of its multiclass predictive classification accuracy over different cancer subtypes, its speedup in execution as well as its scalability with respect to sample size and array resolution. Feature Set Reduction (FSR) was able to reduce the dimensions of an SNP array dataset by more than two orders of magnitude while achieving at least equal, and in most cases superior predictive classification performance over that achieved on features selected by existing feature selection methods alone. An examination of the biological relevance of frequently selected features from FSR-reduced feature sets revealed strong enrichment in association with cancer. FSR was implemented in MATLAB R2010b and is available at http://ww2.cs.mu.oz.au/~gwong/FSR.
Gelernter, Joel; Sherva, Richard; Koesterer, Ryan; Almasy, Laura; Zhao, Hongyu; Kranzler, Henry R.; Farrer, Lindsay
2013-01-01
We report a GWAS for cocaine dependence (CD) in three sets of African- and European-American subjects (AAs and EAs, respectively), to identify pathways, genes, and alleles important in CD risk. The discovery GWAS dataset (n=5,697 subjects) was genotyped using the Illumina OmniQuad microarray (890,000 analyzed SNPs). Additional genotypes were imputed based on the 1000 Genomes reference panel. Top-ranked findings were evaluated by incorporating information from publicly available GWAS data from 4,063 subjects. Then, the most significant GWAS SNPs were genotyped in 2,549 independent subjects. We observed one genomewide-significant (GWS) result: rs7086629 at the FAM53B (“family with sequence similarity 53, member B”) locus. This was supported in both AAs and EAs; p-value (meta-analysis of all samples) = 4.28×10⁻⁸. The gene maps to the same chromosomal region as the maximum peak we observed in a previous linkage study. NCOR2 (nuclear receptor corepressor 2) SNP rs150954431 was associated with p=1.19×10⁻⁹ in the EA discovery sample. SNP rs2456778, which maps to CDK1 (“cyclin-dependent kinase 1”), was associated with cocaine-induced paranoia in AAs in the discovery sample only (p=4.68×10⁻⁸). This is the first study to identify risk variants for CD using GWAS. Our results implicate novel risk loci and provide insights into potential therapeutic and prevention strategies. PMID:23958962
Rong, E G; Yang, H; Zhang, Z W; Wang, Z P; Yan, X H; Li, H; Wang, N
2015-10-01
Methionine synthase (MTR) plays a crucial role in maintaining homeostasis of intracellular methionine, folate, and homocysteine, and its activity correlates with DNA methylation in many mammalian tissues. Our previous genomewide association study identified 1 SNP in the MTR gene that was associated with several wool production and quality traits in Chinese Merino. To confirm the potential involvement of the MTR gene in sheep wool production and quality traits, we performed sheep tissue expression profiling, SNP detection, and association analysis with sheep wool production and quality traits. The semiquantitative reverse transcription PCR analysis showed that the MTR gene was differentially expressed in skin from Merino and Kazak sheep. The sequencing analysis identified a total of 13 SNP in the MTR gene from Chinese Merino sheep. Comparison of the allele frequencies revealed that these 13 identified SNP were significantly different among the 6 tested Chinese Merino strains (P < 0.001). Linkage disequilibrium analysis showed that SNP 3 to 11 were strongly linked in a single haplotype block in the tested population. Association analysis showed that SNP 2 to 11 were significantly associated with the average wool fiber diameter and the fineness SD and that SNP 4 to 11 were significantly associated with the CV of fiber diameter trait (P < 0.05). SNP 2 and SNP 5 to 12 were weakly associated with wool crimp. Similarly, the haplotypes derived from these 13 identified SNP were also significantly associated with the average wool fiber diameter, fineness SD, and the CV of fiber diameter (P < 0.05). Our results suggest that MTR is a candidate gene for sheep wool production and quality traits, and the identified SNP might be used in sheep breeding.
Yucesoy, Berran; Kaufman, Kenneth M.; Lummus, Zana L.; Weirauch, Matthew T.; Zhang, Ge; Cartier, André; Boulet, Louis-Philippe; Sastre, Joaquin; Quirce, Santiago; Tarlo, Susan M.; Cruz, Maria-Jesus; Munoz, Xavier; Harley, John B.; Bernstein, David I.
2015-01-01
Diisocyanates, reactive chemicals used to produce polyurethane products, are the most common causes of occupational asthma. The aim of this study is to identify susceptibility gene variants that could contribute to the pathogenesis of diisocyanate asthma (DA) using a Genome-Wide Association Study (GWAS) approach. Genome-wide single nucleotide polymorphism (SNP) genotyping was performed in 74 diisocyanate-exposed workers with DA and 824 healthy controls using Omni-2.5 and Omni-5 SNP microarrays. We identified 11 SNPs that exceeded genome-wide significance; the strongest association was for the rs12913832 SNP located on chromosome 15, which has been mapped to the HERC2 gene (p = 6.94 × 10⁻¹⁴). Strong associations were also found for SNPs near the ODZ3 and CDH17 genes on chromosomes 4 and 8 (rs908084, p = 8.59 × 10⁻⁹ and rs2514805, p = 1.22 × 10⁻⁸, respectively). We also prioritized 38 SNPs with suggestive genome-wide significance (p < 1 × 10⁻⁶). Among them, 17 SNPs map to the PITPNC1, ACMSD, ZBTB16, ODZ3, and CDH17 gene loci. Functional genomics data indicate that 2 of the suggestive SNPs (rs2446823 and rs2446824) are located within putative binding sites for the CCAAT/Enhancer Binding Protein (CEBP) and Hepatocyte Nuclear Factor 4, Alpha transcription factors (TFs), respectively. This study identified SNPs mapping to the HERC2, CDH17, and ODZ3 genes as potential susceptibility loci for DA. Pathway analysis indicated that these genes are associated with antigen processing and presentation, and other immune pathways. Overlap of 2 suggestive SNPs with likely TF binding sites suggests possible roles in disruption of gene regulation. These results provide new insights into the genetic architecture of DA and serve as a basis for future functional and mechanistic studies. PMID:25918132
Bentz, Eva-Katrin; Hefler, Lukas A; Kaufmann, Ulrike; Huber, Johannes C; Kolbus, Andrea; Tempfer, Clemens B
2008-07-01
To assess the association between transsexualism and allele and genotype frequencies of the common cytochrome P450 (CYP) 17 -34 T>C single nucleotide polymorphism (SNP). Case-control study. Academic research institution. 102 male-to-female (MtF) and 49 female-to-male (FtM) transsexuals, 756 male controls, and 915 female controls. Buccal swabs and multiplex polymerase chain reaction on a microarray system. Analysis of the CYP17 -34 T>C SNP. CYP17 -34 T>C SNP allele frequencies were statistically significantly different between FtM transsexuals and female controls (CYP17 T: 55/98 [56%] and CYP17 C: 43/98 [44%] versus CYP17 T: 1253/1826 [69%] and CYP17 C: 573/1826 [31%], respectively). In accordance, genotype distributions were also different between FtM transsexuals and female controls using a recessive genotype model (CYP17 T/T+T/C: 39/49 [80%] and C/C 10/49 [20%] vs. CYP17 T/T+T/C: 821/913 [90%] and C/C 92/913 [10%], respectively). The CYP17 -34 T>C allele and genotype distributions were not statistically significantly different between MtF transsexuals and male controls. Of note, the CYP17 -34 T>C allele distribution was gender-specific among controls (CYP17 C: males; 604 of 1512 [40%] vs. females; 573 of 1826 [31%]). The MtF transsexuals had an allele distribution equivalent to male controls, whereas FtM transsexuals did not follow the gender-specific allele distribution of female controls but rather had an allele distribution equivalent to MtF transsexuals and male controls. These data support CYP17 as a candidate gene of FtM transsexualism and indicate that loss of a female-specific CYP17 T -34C allele distribution pattern is associated with FtM transsexualism.
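The allele-frequency comparison reported above can be re-derived from the quoted counts. The following is a minimal sketch, assuming a Pearson chi-square test on the 2×2 allele table without continuity correction; the abstract does not say which test the authors used, so the function name and the choice of test are illustrative assumptions.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Allele counts quoted in the abstract.
ftm_t, ftm_c = 55, 43        # FtM transsexuals: 98 alleles
ctrl_t, ctrl_c = 1253, 573   # female controls: 1826 alleles

# C-allele frequency in each group.
ftm_c_freq = ftm_c / (ftm_t + ftm_c)
ctrl_c_freq = ctrl_c / (ctrl_t + ctrl_c)

# Test statistic and the standard critical value for 1 degree of freedom
# at alpha = 0.05.
stat = chi_square_2x2(ftm_t, ftm_c, ctrl_t, ctrl_c)
CRITICAL_1DF_05 = 3.841

print(f"C allele: FtM {ftm_c_freq:.2f}, female controls {ctrl_c_freq:.2f}")
print(f"chi-square = {stat:.2f}, significant at 0.05: {stat > CRITICAL_1DF_05}")
```

Under these assumptions the statistic exceeds the 1-df critical value, which is consistent with the abstract's statement that the allele frequencies differ significantly between the two groups.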
Mismatch and G-Stack Modulated Probe Signals on SNP Microarrays
Binder, Hans; Fasold, Mario; Glomb, Torsten
2009-01-01
Background Single nucleotide polymorphism (SNP) arrays are important tools widely used for genotyping and copy number estimation. This technology utilizes the specific affinity of fragmented DNA for binding to surface-attached oligonucleotide DNA probes. We analyze the variability of the probe signals of Affymetrix GeneChip SNP arrays as a function of the probe sequence to identify relevant sequence motifs which potentially cause systematic biases of genotyping and copy number estimates. Methodology/Principal Findings The probe design of GeneChip SNP arrays enables us to disentangle different sources of intensity modulations such as the number of mismatches per duplex, matched and mismatched base pairings including nearest and next-nearest neighbors and their position along the probe sequence. The effect of probe sequence was estimated in terms of triple-motifs with central matches and mismatches which include all 256 combinations of possible base pairings. The probe/target interactions on the chip can be decomposed into nearest neighbor contributions which correlate well with free energy terms of DNA/DNA-interactions in solution. The effect of mismatches is about twice as large as that of canonical pairings. Runs of guanines (G) and the particular type of mismatched pairings formed in cross-allelic probe/target duplexes constitute sources of systematic biases of the probe signals with consequences for genotyping and copy number estimates. The poly-G effect seems to be related to the crowded arrangement of probes which facilitates complex formation of neighboring probes with at minimum three adjacent G's in their sequence. Conclusions The applied method of “triple-averaging” represents a model-free approach to estimate the mean intensity contributions of different sequence motifs which can be applied in calibration algorithms to correct signal values for sequence effects. Rules for appropriate sequence corrections are suggested. PMID:19924253
Leblanc, N; Cortey, M; Fernandez Pinero, J; Gallardo, C; Masembe, C; Okurut, A R; Heath, L; van Heerden, J; Sánchez-Vizcaino, J M; Ståhl, K; Belák, S
2013-08-01
African swine fever virus (ASFV) causes one of the most dreaded transboundary animal diseases (TADs) in Suidae. African swine fever (ASF) often causes high rates of morbidity and mortality, which can reach 100% in domestic swine. To date, serological diagnosis has the drawback of not being able to differentiate variants of this virus. Previous studies have identified 22 genotypes based on sequence variation in the C-terminal region of the p72 gene, which has become the standard for categorizing ASFVs. This article describes a genotyping assay developed using a segment of PCR-amplified genomic DNA of approximately 450 bp, which encompasses the C-terminal end of the p72 gene. Complementary paired DNA probes of 15 or 17 bp in length, which are identical except for a single nucleotide polymorphism (SNP) in the central position, were designed to differentiate between the 22 genotypes, either individually or in combination. The assay was developed using xMAP technology; probes were covalently linked to microspheres, hybridized to PCR product, labelled with a reporter and read in the Luminex 200 analyzer. Characterization of the sample was performed by comparing fluorescence of the paired SNP probes, that is, the probe with higher fluorescence in a complementary pair identified the SNP that a particular sample possessed. In the final assay, a total of 52 probes were employed, 24 SNP pairs and 4 for general detection. One or more samples from each of the 22 genotypes were tested. The assay was able to detect and distinguish all 22 genotypes. This novel assay provides a powerful tool for the simultaneous rapid diagnosis and genotypic differentiation of ASF. © 2012 Blackwell Verlag GmbH.
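The paired-probe readout described above (the probe with the higher fluorescence in a pair identifies the SNP) can be sketched as follows. This is an illustration only: the site names, signal values, and the `min_ratio` ambiguity threshold are hypothetical and are not taken from the published assay.

```python
def call_allele(signal_by_allele, min_ratio=1.5):
    """Return the allele whose probe fluoresces more strongly in a pair.

    `min_ratio` is a hypothetical guard: if the stronger signal is not at
    least this many times the weaker one, the call is treated as ambiguous
    and None is returned.
    """
    if len(signal_by_allele) != 2:
        raise ValueError("expected exactly two probes per SNP pair")
    ranked = sorted(signal_by_allele.items(), key=lambda kv: kv[1], reverse=True)
    (hi_allele, hi), (_, lo) = ranked
    if hi <= 0:
        return None  # no usable signal from either probe
    if lo > 0 and hi / lo < min_ratio:
        return None  # the two probes are too close to call
    return hi_allele


def call_sample(sample):
    """Apply the pairwise call to every SNP site measured for one sample."""
    return {site: call_allele(signals) for site, signals in sample.items()}


# Hypothetical fluorescence readings for two SNP probe pairs.
example = {
    "site_1": {"A": 5200.0, "G": 410.0},
    "site_2": {"C": 900.0, "T": 850.0},
}
calls = call_sample(example)
print(calls)
```

A genotype could then be assigned by matching the resulting pattern of per-site calls against a reference table of the known genotypes; that table is not given in the abstract and is therefore omitted here.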
Lê Cao, Kim-Anh; Boitard, Simon; Besse, Philippe
2011-06-22
Variable selection on high throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), becomes inevitable to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits. A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework. sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.
cDNA microarray analysis of esophageal cancer: discoveries and prospects.
Shimada, Yutaka; Sato, Fumiaki; Shimizu, Kazuharu; Tsujimoto, Gozoh; Tsukada, Kazuhiro
2009-07-01
Recent progress in molecular biology has revealed many genetic and epigenetic alterations that are involved in the development and progression of esophageal cancer. Microarray analysis has also revealed several genetic networks that are involved in esophageal cancer. However, clinical application of microarray techniques and use of microarray data have not yet occurred. In this review, we focus on the recent developments and problems with microarray analysis of esophageal cancer.
Zheng, Jie; Gaunt, Tom R; Day, Ian N M
2013-01-01
Genome-Wide Association Studies (GWAS) frequently incorporate meta-analysis within their framework. However, conditional analysis of individual-level data, which is an established approach for fine mapping of causal sites, is often precluded where only group-level summary data are available for analysis. Here, we present a numerical and graphical approach, "sequential sentinel SNP regional association plot" (SSS-RAP), which estimates regression coefficients (beta) with their standard errors using the meta-analysis summary results directly. Under an additive model, typical for genes with small effect, the effect for a sentinel SNP can be transformed to the predicted effect for a possibly dependent SNP through a 2×2 2-SNP haplotypes table. The approach assumes Hardy-Weinberg equilibrium for test SNPs. SSS-RAP is available as a Web-tool (http://apps.biocompute.org.uk/sssrap/sssrap.cgi). To develop and illustrate SSS-RAP we analyzed lipid and ECG traits data from the British Women's Heart and Health Study (BWHHS), evaluated a meta-analysis for ECG trait and presented several simulations. We compared results with existing approaches such as model selection methods and conditional analysis. Generally findings were consistent. SSS-RAP represents a tool for testing independence of SNP association signals using meta-analysis data, and is also a convenient approach based on biological principles for fine mapping in group level summary data. © 2012 Blackwell Publishing Ltd/University College London.
A genome-wide perspective on the evolutionary history of enigmatic wolf-like canids
vonHoldt, Bridgett M.; Pollinger, John P.; Earl, Dent A.; Knowles, James C.; Boyko, Adam R.; Parker, Heidi; Geffen, Eli; Pilot, Malgorzata; Jedrzejewski, Wlodzimierz; Jedrzejewska, Bogumila; Sidorovich, Vadim; Greco, Claudia; Randi, Ettore; Musiani, Marco; Kays, Roland; Bustamante, Carlos D.; Ostrander, Elaine A.; Novembre, John; Wayne, Robert K.
2011-01-01
High-throughput genotyping technologies developed for model species can potentially increase the resolution of demographic history and ancestry in wild relatives. We use a SNP genotyping microarray developed for the domestic dog to assay variation in over 48K loci in wolf-like species worldwide. Despite the high mobility of these large carnivores, we find distinct hierarchical population units within gray wolves and coyotes that correspond with geographic and ecologic differences among populations. Further, we test controversial theories about the ancestry of the Great Lakes wolf and red wolf using an analysis of haplotype blocks across all 38 canid autosomes. We find that these enigmatic canids are highly admixed varieties derived from gray wolves and coyotes, respectively. This divergent genomic history suggests that they do not have a shared recent ancestry as proposed by previous researchers. Interspecific hybridization, as well as the process of evolutionary divergence, may be responsible for the observed phenotypic distinction of both forms. Such admixture complicates decisions regarding endangered species restoration and protection. PMID:21566151
Clinical comparison of overlapping deletions of 19p13.3.
Risheg, Hiba; Pasion, Romela; Sacharow, Stephanie; Proud, Virginia; Immken, LaDonna; Schwartz, Stuart; Tepperberg, Jim H; Papenhausen, Peter; Tan, Tiong Y; Andrieux, Joris; Plessis, Ghislaine; Amor, David J; Keitges, Elisabeth A
2013-05-01
We present three patients with overlapping interstitial deletions of 19p13.3 identified by high resolution SNP microarray analysis. All three had a similar phenotype characterized by intellectual disability or developmental delay, structural heart abnormalities, large head relative to height and weight or macrocephaly, and minor facial anomalies. Deletion sizes ranged from 792 Kb to 1.0 Mb and included a common region arr [hg19] 19p13.3 (3,814,392-4,136,989), containing eight genes (ZFR2, ATCAY, NMRK2, DAPK3, EEF2, PIAS4, ZBTB7A, MAP2K2) and two non-coding RNAs (MIR637 and SNORDU37). The patient phenotypes were compared with three previous single patient reports with similar interstitial 19p13.3 deletions and six additional patients from the DECIPHER and ISCA databases to determine if a common haploinsufficient phenotype for the region can be established. Copyright © 2013 Wiley Periodicals, Inc.
Importing MAGE-ML format microarray data into BioConductor.
Durinck, Steffen; Allemeersch, Joke; Carey, Vincent J; Moreau, Yves; De Moor, Bart
2004-12-12
The microarray gene expression markup language (MAGE-ML) is a widely used XML (eXtensible Markup Language) standard for describing and exchanging information about microarray experiments. It can describe microarray designs, microarray experiment designs, gene expression data and data analysis results. We describe RMAGEML, a new Bioconductor package that provides a link between cDNA microarray data stored in MAGE-ML format and the Bioconductor framework for preprocessing, visualization and analysis of microarray experiments. http://www.bioconductor.org. Open Source.
A highly efficient multi-core algorithm for clustering extremely large datasets
2010-01-01
Background In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network based parallelization. Conclusions Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
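To make the idea of distributing a cluster algorithm across workers on one machine concrete, here is a minimal k-modes sketch for categorical SNP genotypes. It is not the authors' Java implementation and does not use transactional memory: it simply splits the assignment step into chunks handled by a thread pool, and all names and the toy data are assumptions. In CPython, threads illustrate the task decomposition rather than a real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def hamming(a, b):
    """Number of positions at which two genotype vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_chunk(chunk, modes):
    """Assign every row in a chunk to the index of its nearest mode."""
    return [min(range(len(modes)), key=lambda k: hamming(row, modes[k]))
            for row in chunk]


def parallel_assign(rows, modes, workers=4):
    """Split the rows into chunks and run the assignment step per chunk."""
    size = max(1, -(-len(rows) // workers))  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(assign_chunk, chunks, [modes] * len(chunks))
    return [label for part in parts for label in part]


def update_modes(rows, labels, modes):
    """Recompute each mode as the per-column most frequent category."""
    new_modes = []
    for k, old in enumerate(modes):
        members = [r for r, lab in zip(rows, labels) if lab == k]
        if not members:
            new_modes.append(old)  # keep the old mode for an empty cluster
            continue
        new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                               for col in zip(*members)))
    return new_modes


def k_modes(rows, modes, max_iter=20, workers=4):
    """Iterate assignment and update until the modes stop changing."""
    labels = []
    for _ in range(max_iter):
        labels = parallel_assign(rows, modes, workers)
        updated = update_modes(rows, labels, modes)
        if updated == modes:
            break
        modes = updated
    return labels, modes


# Toy genotype matrix (0/1/2 coding), two obvious groups.
genotypes = [(0, 0, 1, 2), (0, 0, 1, 1), (0, 1, 1, 2),
             (2, 2, 0, 0), (2, 1, 0, 0), (2, 2, 0, 1)]
labels, modes = k_modes(genotypes, modes=[genotypes[0], genotypes[3]])
print(labels, modes)
```

The assignment step is the natural unit to parallelize because each row's nearest mode can be computed independently; only the mode update needs the combined result.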
Audo, Isabelle; Bujakowska, Kinga; Mohand-Saïd, Saddek; Tronche, Sophie; Lancelot, Marie-Elise; Antonio, Aline; Germain, Aurore; Lonjou, Christine; Carpentier, Wassila; Sahel, José-Alain; Bhattacharya, Shomi; Zeitz, Christina
2011-01-01
To identify the genetic defect of a consanguineous Portuguese family with rod-cone dystrophy and varying degrees of decreased audition. A detailed ophthalmic and auditory examination was performed on a Portuguese patient with severe autosomal recessive rod-cone dystrophy. Known genetic defects were excluded by performing autosomal recessive retinitis pigmentosa (arRP) genotyping microarray analysis and by Sanger sequencing of the coding exons and flanking intronic regions of eyes shut homolog-drosophila (EYS) and chromosome 2 open reading frame 71 (C2orf71). Subsequently, genome-wide homozygosity mapping was performed in DNA samples from available family members using a 700K single nucleotide polymorphism (SNP) microarray. Candidate genes present in the significantly large homozygous regions were screened for mutations using Sanger sequencing. The largest homozygous region (~11 Mb) in the affected family members was mapped to chromosome 9, which harbors deafness, autosomal recessive 31 (DFNB31; a gene previously associated with Usher syndrome). Mutation analysis of DFNB31 in the index patient identified a novel one-base-pair deletion (c.737delC), which is predicted to lead to a truncated protein (p.Pro246HisfsX13) and co-segregated with the disease in the family. Ophthalmic examination of the index patient and the affected siblings showed severe rod-cone dystrophy. Pure tone audiometry revealed a moderate hearing loss in the index patient, whereas the affected siblings were reported with more profound and early onset hearing impairment. We report a novel truncating mutation in DFNB31 associated with severe rod-cone dystrophy and varying degrees of hearing impairment in a consanguineous family of Portuguese origin. This is the second report of DFNB31 implication in Usher type 2.
High-density fiber optic biosensor arrays
NASA Astrophysics Data System (ADS)
Epstein, Jason R.; Walt, David R.
2002-02-01
Novel approaches are required to coordinate the immense amounts of information derived from diverse genomes. This concept has influenced the expanded role of high-throughput DNA detection and analysis in the biological sciences. A high-density fiber optic DNA biosensor was developed consisting of oligonucleotide-functionalized, 3.1 µm diameter microspheres deposited into the etched wells on the distal face of a 500 µm imaging fiber bundle. Imaging fiber bundles containing thousands of optical fibers, each associated with a unique oligonucleotide probe sequence, were the foundation for an optically connected, individually addressable DNA detection platform. Different oligonucleotide-functionalized microspheres were combined in a stock solution, and randomly dispersed into the etched wells. Microsphere positions were registered from optical dyes incorporated onto the microspheres. The distribution process provided an inherent redundancy that increases the signal-to-noise ratio as the square root of the number of sensors examined. The representative amount of each probe-type in the array was dependent on their initial stock solution concentration, and as other sequences of interest arise, new microsphere elements can be added to arrays without altering the existing detection capabilities. The oligonucleotide probe sequences hybridize to fluorescently-labeled, complementary DNA target solutions. Fiber optic DNA microarray research has included DNA-protein interaction profiles, microbial strain differentiation, non-labeled target interrogation with molecular beacons, and single cell-based assays. This biosensor array is proficient in DNA detection linked to specific disease states, single nucleotide polymorphism (SNP) discrimination, and gene expression analysis. This array platform permits multiple detection formats, provides smaller feature sizes, and enables sensor design flexibility.
High-density fiber optic microarray biosensors provide a fast, reversible format with the detection limit of a few hundred molecules.
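The square-root scaling of signal-to-noise ratio with the number of redundant sensors, mentioned in the abstract above, can be illustrated with a short calculation. This sketch assumes each replicate microsphere contributes the same signal with independent noise, so that averaging n sensors improves the ratio by a factor of sqrt(n); the function names and numbers are illustrative only.

```python
import math


def averaged_snr(single_sensor_snr, n_sensors):
    """SNR after averaging n replicate sensors with independent noise."""
    if n_sensors < 1:
        raise ValueError("n_sensors must be at least 1")
    return single_sensor_snr * math.sqrt(n_sensors)


def sensors_needed(single_sensor_snr, target_snr):
    """Smallest number of replicate sensors that reaches the target SNR."""
    return max(1, math.ceil((target_snr / single_sensor_snr) ** 2))


# Hypothetical example: one bead gives an SNR of 2.
print(averaged_snr(2.0, 25))
print(sensors_needed(2.0, 10.0))
```

The quadratic cost is the practical point: doubling the ratio requires four times as many replicate beads of the same probe type.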
Killion, Patrick J; Sherlock, Gavin; Iyer, Vishwanath R
2003-01-01
Background The power of microarray analysis can be realized only if data is systematically archived and linked to biological annotations as well as analysis algorithms. Description The Longhorn Array Database (LAD) is a MIAME compliant microarray database that operates on PostgreSQL and Linux. It is a fully open source version of the Stanford Microarray Database (SMD), one of the largest microarray databases. LAD is available at Conclusions Our development of LAD provides a simple, free, open, reliable and proven solution for storage and analysis of two-color microarray data. PMID:12930545
Guo, Liyuan; Wang, Jing
2018-01-04
Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation databases rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element-target gene pairs (E-G pairs), so that it can provide SNP-based gene regulatory networks. (ii) Web function was modified according to data content and a new network search module is provided in rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data query for detailed information (related elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (iii) The type of regulatory elements was modified and enriched. To the best of our knowledge, the updated rSNPBase 3.0 is the first data tool that supports SNP functional analysis from a regulatory network perspective; it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. PMID:29140525
Two Novel SNPs of PPARγ Significantly Affect Weaning Growth Traits of Nanyang Cattle.
Huang, Jieping; Chen, Ningbo; Li, Xin; An, Shanshan; Zhao, Minghui; Sun, Taihong; Hao, Ruijie; Ma, Yun
2018-01-02
Peroxisome-proliferator-activated receptor gamma (PPARγ) is a key transcription factor that controls adipocyte differentiation and energy in mammals. Therefore, PPARγ is a potential factor influencing animal growth traits. This study primarily evaluates PPARγ as a candidate gene for growth traits of cattle and identifies potential molecular markers for cattle breeding. Consistent with previous studies, PPARγ mRNA was mainly expressed at extremely high levels in adipose tissues as shown by quantitative real-time polymerase chain reaction analysis. Three novel SNPs of the bovine PPARγ gene were identified in 514 individuals from six Chinese cattle breeds: SNP1 (AC_000179.1 g.57386668 C > G) in intron 2 and SNP2 (AC_000179.1 g.57431964 C > T) and SNP3 (AC_000179.1 g.57431994 T > C) in exon 7. The present study also investigated genetic characteristics of these SNP loci in six populations. Association analysis showed that the SNP1 and SNP3 loci significantly affect weaning growth traits, especially body weight of Nanyang cattle. These results revealed that SNP1 and SNP3 are potential molecular markers for cattle breeding.
The Microarray Revolution: Perspectives from Educators
ERIC Educational Resources Information Center
Brewster, Jay L.; Beason, K. Beth; Eckdahl, Todd T.; Evans, Irene M.
2004-01-01
In recent years, microarray analysis has become a key experimental tool, enabling the analysis of genome-wide patterns of gene expression. This review approaches the microarray revolution with a focus upon four topics: 1) the early development of this technology and its application to cancer diagnostics; 2) a primer of microarray research,…
Barton, G; Abbott, J; Chiba, N; Huang, DW; Huang, Y; Krznaric, M; Mack-Smith, J; Saleem, A; Sherman, BT; Tiwari, B; Tomlinson, C; Aitman, T; Darlington, J; Game, L; Sternberg, MJE; Butcher, SA
2008-01-01
Background Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large data sets. This, together with the proliferation of tools and techniques for microarray data analysis, makes it very challenging for a laboratory scientist to keep up-to-date with the latest developments in this field. Our aim was to develop a distributed e-support system for microarray data analysis and management. Results EMAAS (Extensible MicroArray Analysis System) is a multi-user rich internet application (RIA) providing simple, robust access to up-to-date resources for microarray data storage and analysis, combined with integrated tools to optimise real time user support and training. The system leverages the power of distributed computing to perform microarray analyses, and provides seamless access to resources located at various remote facilities. The EMAAS framework allows users to import microarray data from several sources to an underlying database, to pre-process, quality assess and analyse the data, to perform functional analyses, and to track data analysis steps, all through a single easy to use web portal. This interface offers distance support to users both in the form of video tutorials and via live screen feeds using the web conferencing tool EVO. A number of analysis packages, including R-Bioconductor and Affymetrix Power Tools have been integrated on the server side and are available programmatically through the Postgres-PLR library or on grid compute clusters. Integrated distributed resources include the functional annotation tool DAVID, GeneCards and the microarray data repositories GEO, CELSIUS and MiMiR. EMAAS currently supports analysis of Affymetrix 3' and Exon expression arrays, and the system is extensible to cater for other microarray and transcriptomic platforms. 
Conclusion EMAAS enables users to track and perform microarray data management and analysis tasks through a single easy-to-use web application. The system architecture is flexible and scalable to allow new array types, analysis algorithms and tools to be added with relative ease and to cope with large increases in data volume. PMID:19032776
An Introduction to MAMA (Meta-Analysis of MicroArray data) System.
Zhang, Zhe; Fenstermacher, David
2005-01-01
Analyzing microarray data across multiple experiments has been proven advantageous. To support this kind of analysis, we are developing a software system called MAMA (Meta-Analysis of MicroArray data). MAMA utilizes a client-server architecture with a relational database on the server-side for the storage of microarray datasets collected from various resources. The client-side is an application running on the end user's computer that allows the user to manipulate microarray data and analytical results locally. MAMA implementation will integrate several analytical methods, including meta-analysis within an open-source framework offering other developers the flexibility to plug in additional statistical algorithms.
GrigoraSNPs: Optimized Analysis of SNPs for DNA Forensics.
Ricke, Darrell O; Shcherbina, Anna; Michaleas, Adam; Fremont-Smith, Philip
2018-04-16
High-throughput sequencing (HTS) of single nucleotide polymorphisms (SNPs) enables additional DNA forensic capabilities not attainable using traditional STR panels. However, the inclusion of sets of loci selected for mixture analysis, extended kinship, phenotype, biogeographic ancestry prediction, etc., can result in large panel sizes that are difficult to analyze in a rapid fashion. GrigoraSNPs was developed to address the allele-calling bottleneck that was encountered when analyzing SNP panels with more than 5000 loci using HTS. GrigoraSNPs uses MapReduce parallel data processing on multiple computational threads plus a novel locus-identification hashing strategy leveraging target sequence tags. This tool optimizes the SNP calling module of the DNA analysis pipeline with runtimes that scale linearly with the number of HTS reads. Results are compared with SNP analysis pipelines implemented with SAMtools and GATK. GrigoraSNPs removes a computational bottleneck for processing forensic samples with large HTS SNP panels. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.
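The abstract does not spell out the hashing strategy, so the following is only a minimal sketch of the general idea of tag-based locus identification combined with a map/reduce pattern. The tag table `TAGS`, the tag length, and the assumption that the SNP base immediately follows its tag are all hypothetical choices made for illustration, not details taken from GrigoraSNPs.

```python
from collections import Counter
from functools import reduce

# Hypothetical panel: each locus is identified by a short sequence tag that
# sits immediately upstream of the SNP position (an illustrative assumption).
TAGS = {"ACGTAC": "locus1", "TTGACA": "locus2"}
TAG_LEN = 6


def map_read(read):
    """Map step: count (locus, allele) observations found in a single read."""
    counts = Counter()
    # Slide a window over the read; stop early enough that the base
    # following the tag (the allele) still lies inside the read.
    for i in range(len(read) - TAG_LEN):
        locus = TAGS.get(read[i:i + TAG_LEN])  # constant-time hash lookup
        if locus is not None:
            counts[(locus, read[i + TAG_LEN])] += 1
    return counts


def call_alleles(reads):
    """Reduce step: merge the per-read counters into panel-wide counts."""
    return reduce(lambda a, b: a + b, map(map_read, reads), Counter())


reads = ["GGACGTACAGG", "ACGTACGTT", "CCTTGACATC"]
allele_counts = call_alleles(reads)
```

Because each read is processed independently in the map step, the built-in `map` here could be swapped for a thread or process pool, and the work per read is bounded by read length, which is consistent with runtimes growing linearly in the number of reads.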
Aberrant estrogen regulation of PEMT results in choline deficiency-associated liver dysfunction.
Resseguie, Mary E; da Costa, Kerry-Ann; Galanko, Joseph A; Patel, Mukund; Davis, Ian J; Zeisel, Steven H
2011-01-14
When dietary choline is restricted, most men and postmenopausal women develop multiorgan dysfunction marked by hepatic steatosis (choline deficiency syndrome (CDS)). However, a significant subset of premenopausal women is protected from CDS. Because hepatic PEMT (phosphatidylethanolamine N-methyltransferase) catalyzes de novo biosynthesis of choline and this gene is under estrogenic control, we hypothesized that there are SNPs in PEMT that disrupt the hormonal regulation of PEMT and thereby put women at risk for CDS. In this study, we performed transcript-specific gene expression analysis, which revealed that estrogen regulates PEMT in an isoform-specific fashion. Locus-wide SNP analysis identified a risk-associated haplotype that was selectively associated with loss of hormonal activation. Chromatin immunoprecipitation, analyzed by locus-wide microarray studies, comprehensively identified regions of estrogen receptor binding in PEMT. The polymorphism (rs12325817) most highly linked with the development of CDS (p < 0.00006) was located within 1 kb of the critical estrogen response element. The risk allele failed to bind either the estrogen receptor or the pioneer factor FOXA1. These data demonstrate that allele-specific ablation of estrogen receptor-DNA interaction in the PEMT locus prevents hormone-inducible PEMT expression, conferring risk of CDS in women.
Uehara, Yuriko; Oda, Katsutoshi; Ikeda, Yuji; Koso, Takahiro; Tsuji, Shingo; Yamamoto, Shogo; Asada, Kayo; Sone, Kenbun; Kurikawa, Reiko; Makii, Chinami; Hagiwara, Otoe; Tanikawa, Michihiro; Maeda, Daichi; Hasegawa, Kosei; Nakagawa, Shunsuke; Wada-Hiraike, Osamu; Kawana, Kei; Fukayama, Masashi; Fujiwara, Keiichi; Yano, Tetsu; Osuga, Yutaka; Fujii, Tomoyuki; Aburatani, Hiroyuki
2015-01-01
Ovarian clear cell carcinoma (CCC) is generally associated with chemoresistance and poor clinical outcome, even with early diagnosis; whereas high-grade serous carcinomas (SCs) and endometrioid carcinomas (ECs) are commonly chemosensitive at advanced stages. Although an integrated genomic analysis of SC has been performed, conclusive views on copy number and expression profiles for CCC are still limited. In this study, we performed single nucleotide polymorphism analysis with 57 epithelial ovarian cancers (31 CCCs, 14 SCs, and 12 ECs) and microarray expression analysis with 55 cancers (25 CCCs, 16 SCs, and 14 ECs). We then evaluated PIK3CA mutations and ARID1A expression in CCCs. SNP array analysis classified 13% of CCCs into a cluster with high frequency and focal range of copy number alterations (CNAs), significantly lower than for SCs (93%, P < 0.01) and ECs (50%, P = 0.017). The ratio of whole-arm to all CNAs was higher in CCCs (46.9%) than SCs (21.7%; P < 0.0001). SCs with loss of heterozygosity (LOH) of BRCA1 (85%) also had LOH of NF1 and TP53, and LOH of BRCA2 (62%) coexisted with LOH of RB1 and TP53. Microarray analysis classified CCCs into three clusters. One cluster (CCC-2, n = 10) showed more favorable prognosis than the CCC-1 and CCC-3 clusters (P = 0.041). Coexistent alterations of PIK3CA and ARID1A were more common in CCC-1 and CCC-3 (7/11, 64%) than in CCC-2 (0/10, 0%; P < 0.01). Being in cluster CCC-2 was an independent favorable prognostic factor in CCC. In conclusion, CCC was characterized by a high ratio of whole-arm CNAs; whereas CNAs in SC were mainly focal, but preferentially caused LOH of well-known tumor suppressor genes. As such, expression profiles might be useful for sub-classification of CCC, and might provide useful information on prognosis. PMID:26043110
Novel applications of array comparative genomic hybridization in molecular diagnostics.
Cheung, Sau W; Bi, Weimin
2018-05-31
In 2004, the implementation of array comparative genomic hybridization (array CGH) into clinical practice marked a new milestone for genetic diagnosis. Array CGH and single-nucleotide polymorphism (SNP) arrays enable genome-wide detection of copy number changes at high resolution, and therefore microarray has been recognized as the first-tier test for patients with intellectual disability or multiple congenital anomalies, and has also been applied prenatally for detection of clinically relevant copy number variations in the fetus. Area covered: In this review, the authors summarize the evolution of array CGH technology from their diagnostic laboratory, highlighting exonic SNP arrays developed in the past decade which detect small intragenic copy number changes as well as large DNA segments for the region of heterozygosity. The applications of array CGH to human diseases with different modes of inheritance with the emphasis on autosomal recessive disorders are discussed. Expert commentary: An exonic array is a powerful and most efficient clinical tool in detecting genome wide small copy number variants in both dominant and recessive disorders. However, whole-genome sequencing may become the single integrated platform for detection of copy number changes, single-nucleotide changes as well as balanced chromosomal rearrangements in the near future.
Integration of QTL and bioinformatic tools to identify candidate genes for triglycerides in mice
Leduc, Magalie S.; Hageman, Rachael S.; Verdugo, Ricardo A.; Tsaih, Shirng-Wern; Walsh, Kenneth; Churchill, Gary A.; Paigen, Beverly
2011-01-01
To identify genetic loci influencing lipid levels, we performed quantitative trait loci (QTL) analysis between inbred mouse strains MRL/MpJ and SM/J, measuring triglyceride levels at 8 weeks of age in F2 mice fed a chow diet. We identified one significant QTL on chromosome (Chr) 15 and three suggestive QTL on Chrs 2, 7, and 17. We also carried out microarray analysis on the livers of parental strains of 282 F2 mice and used these data to find cis-regulated expression QTL. We then narrowed the list of candidate genes under significant QTL using a “toolbox” of bioinformatic resources, including haplotype analysis; parental strain comparison for gene expression differences and nonsynonymous coding single nucleotide polymorphisms (SNP); cis-regulated eQTL in livers of F2 mice; correlation between gene expression and phenotype; and conditioning of expression on the phenotype. We suggest Slc25a7 as a candidate gene for the Chr 7 QTL and, based on expression differences, five genes (Polr3h, Cyp2d22, Cyp2d26, Tspo, and Ttll12) as candidate genes for Chr 15 QTL. This study shows how bioinformatics can be used effectively to reduce candidate gene lists for QTL related to complex traits. PMID:21622629
Current genetic methodologies in the identification of disaster victims and in forensic analysis.
Ziętkiewicz, Ewa; Witt, Magdalena; Daca, Patrycja; Zebracka-Gala, Jadwiga; Goniewicz, Mariusz; Jarząb, Barbara; Witt, Michał
2012-02-01
This review presents the basic problems and currently available molecular techniques used for genetic profiling in disaster victim identification (DVI). The environmental conditions of a mass disaster often result in severe fragmentation, decomposition and intermixing of the remains of victims. In such cases, traditional identification based on the anthropological and physical characteristics of the victims is frequently inconclusive. This is the reason why DNA profiling became the gold standard for victim identification in mass-casualty incidents (MCIs) or any forensic cases where human remains are highly fragmented and/or degraded beyond recognition. The review provides general information about the sources of genetic material for DNA profiling, the genetic markers routinely used during genetic profiling (STR markers, mtDNA and single-nucleotide polymorphisms [SNP]) and the basic statistical approaches used in DNA-based disaster victim identification. Automated technological platforms that allow the simultaneous analysis of a multitude of genetic markers used in genetic identification (oligonucleotide microarray techniques and next-generation sequencing) are also presented. Forensic and population databases containing information on human variability, routinely used for statistical analyses, are discussed. The final part of this review is focused on recent developments, which offer particularly promising tools for forensic applications (mRNA analysis, transcriptome variation in individuals/populations and genetic profiling of specific cells separated from mixtures).
Irvin, Marguerite R; Sitlani, Colleen M; Noordam, Raymond; Avery, Christie L; Bis, Joshua C; Floyd, James S; Li, Jin; Limdi, Nita A; Srinivasasainagendra, Vinodh; Stewart, James; de Mutsert, Renée; Mook-Kanamori, Dennis O; Lipovich, Leonard; Kleinbrink, Erica L; Smith, Albert; Bartz, Traci M; Whitsel, Eric A; Uitterlinden, Andre G; Wiggins, Kerri L; Wilson, James G; Zhi, Degui; Stricker, Bruno H; Rotter, Jerome I; Arnett, Donna K; Psaty, Bruce M; Lange, Leslie A
2018-06-01
We evaluated interactions of SNP-by-ACE-I/ARB and SNP-by-TD on serum potassium (K+) among users of antihypertensive treatments (anti-HTN). Our study included seven European-ancestry (EA) (N = 4835) and four African-ancestry (AA) cohorts (N = 2016). We performed race-stratified, fixed-effect, inverse-variance-weighted meta-analyses of 2.5 million SNP-by-drug interaction estimates; race-combined meta-analysis; and trans-ethnic fine-mapping. Among EAs, we identified 11 significant SNPs (P < 5 × 10⁻⁸) for SNP-ACE-I/ARB interactions on serum K+ that were located between NR2F1-AS1 and ARRDC3-AS1 on chromosome 5 (top SNP rs6878413 P = 1.7 × 10⁻⁸; ratio of serum K+ in ACE-I/ARB exposed compared to unexposed is 1.0476, 1.0280, 1.0088 for the TT, AT, and AA genotypes, respectively). Trans-ethnic fine mapping identified the same group of SNPs on chromosome 5 as genome-wide significant for the ACE-I/ARB analysis. In conclusion, SNP-by-ACE-I/ARB interaction analyses uncovered loci that, if replicated, could have future implications for the prevention of arrhythmias due to anti-HTN treatment-related hyperkalemia. Before these loci can be identified as clinically relevant, future validation studies of equal or greater size in comparison to our discovery effort are needed.
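For readers unfamiliar with the method named in the abstract, a fixed-effect, inverse-variance-weighted meta-analysis combines per-cohort estimates by weighting each one by the reciprocal of its squared standard error. The sketch below shows the standard textbook formula for a single SNP; the function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def ivw_fixed_effect(betas, ses):
    """Combine per-cohort effect estimates with inverse-variance weights.

    betas: per-cohort interaction estimates for one SNP
    ses:   their standard errors (same order)
    Returns the pooled estimate, its standard error, the z statistic,
    and a two-sided p-value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return beta, se, z, p


# Hypothetical estimates from two cohorts with equal precision.
pooled_beta, pooled_se, z_stat, p_value = ivw_fixed_effect([0.10, 0.20], [0.05, 0.05])
```

With equal standard errors the pooled estimate is simply the mean of the inputs, and the pooled standard error shrinks by a factor of the square root of the number of cohorts. A genome-wide analysis would repeat this calculation per SNP and compare each p-value against the 5 × 10⁻⁸ threshold quoted above.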
Mendoza Lopez, Pablo; Golby, Paul; Wooff, Esen; Garcia, Javier Nunez; Garcia Pelayo, M. Carmen; Conlon, Kevin; Gema Camacho, Ana; Hewinson, R. Glyn; Polaina, Julio; Suárez García, Antonio; Gordon, Stephen V.
2010-01-01
A number of single-nucleotide polymorphisms (SNPs) have been identified in the genome of Mycobacterium bovis BCG Pasteur compared with the sequenced strain M. bovis 2122/97. The functional consequences of many of these mutations remain to be described; however, mutations in genes encoding regulators may be particularly relevant to global phenotypic changes such as loss of virulence, since alteration of a regulator's function will affect the expression of a wide range of genes. One such SNP falls in bcg3145, encoding a member of the AfsR/DnrI/SARP class of global transcriptional regulators, that replaces a highly conserved glutamic acid residue at position 159 (E159G) with glycine in a tetratricopeptide repeat (TPR) located in the bacterial transcriptional activation (BTA) domain of BCG3145. TPR domains are associated with protein–protein interactions, and a conserved core (helices T1–T7) of the BTA domain seems to be required for proper function of SARP-family proteins. Structural modelling predicted that the E159G mutation perturbs the third α-helix of the BTA domain and could therefore have functional consequences. The E159G SNP was found to be present in all BCG strains, but absent from virulent M. bovis and Mycobacterium tuberculosis strains. By overexpressing BCG3145 and Rv3124 in BCG and H37Rv and monitoring transcriptome changes using microarrays, we determined that BCG3145/Rv3124 acts as a positive transcriptional regulator of the molybdopterin biosynthesis moa1 locus, and we suggest that rv3124 be renamed moaR1. The SNP in bcg3145 was found to have a subtle effect on the activity of MoaR1, suggesting that this mutation is not a key event in the attenuation of BCG. PMID:20378651
Li, Dongmei; Le Pape, Marc A; Parikh, Nisha I; Chen, Will X; Dye, Timothy D
2013-01-01
Microarrays are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. Multiple testing methods in microarray data analysis aim at controlling both Type I and Type II error rates; however, real microarray data do not always fit their distribution assumptions. Smyth's ubiquitous parametric method, for example, inadequately accommodates violations of normality assumptions, resulting in inflated Type I error rates. The Significance Analysis of Microarrays, another widely used microarray data analysis method, is based on a permutation test and is robust to non-normally distributed data; however, the Significance Analysis of Microarrays method fold change criteria are problematic, and can critically alter the conclusion of a study, as a result of compositional changes of the control data set in the analysis. We propose a novel approach, combining resampling with empirical Bayes methods: the Resampling-based empirical Bayes Methods. This approach not only reduces false discovery rates for non-normally distributed microarray data, but it is also impervious to the fold change threshold since no control data set selection is needed. Through simulation studies, sensitivities, specificities, total rejections, and false discovery rates are compared across Smyth's parametric method, the Significance Analysis of Microarrays, and the Resampling-based empirical Bayes Methods. Differences in false discovery rate control among the approaches are illustrated through a preterm delivery methylation study. The results show that the Resampling-based empirical Bayes Methods offer significantly higher specificity and lower false discovery rates compared to Smyth's parametric method when data are not normally distributed. 
The Resampling-based empirical Bayes Methods also offers higher statistical power than the Significance Analysis of Microarrays method when the proportion of significantly differentially expressed genes is large for both normally and non-normally distributed data. Finally, the Resampling-based empirical Bayes Methods are generalizable to next generation sequencing RNA-seq data analysis.
Transfection microarray and the applications.
Miyake, Masato; Yoshikawa, Tomohiro; Fujita, Satoshi; Miyake, Jun
2009-05-01
Microarray transfection has been extensively studied for high-throughput functional analysis of mammalian cells. However, control of efficiency and reproducibility are the critical issues for practical use. By using solid-phase transfection accelerators and nano-scaffold, we provide a highly efficient and reproducible microarray-transfection device, "transfection microarray". The device would be applied to the limited number of available primary cells and stem cells not only for large-scale functional analysis but also reporter-based time-lapse cellular event analysis.
Prioritizing individual genetic variants after kernel machine testing using variable selection.
He, Qianchuan; Cai, Tianxi; Liu, Yang; Zhao, Ni; Harmon, Quaker E; Almli, Lynn M; Binder, Elisabeth B; Engel, Stephanie M; Ressler, Kerry J; Conneely, Karen N; Lin, Xihong; Wu, Michael C
2016-12-01
Kernel machine learning methods, such as the SNP-set kernel association test (SKAT), have been widely used to test associations between traits and genetic polymorphisms. In contrast to traditional single-SNP analysis methods, these methods are designed to examine the joint effect of a set of related SNPs (such as a group of SNPs within a gene or a pathway) and are able to identify sets of SNPs that are associated with the trait of interest. However, as with many multi-SNP testing approaches, kernel machine testing can draw conclusions only at the SNP-set level, and does not directly indicate which SNP(s) in the identified set actually drive the associations. A recently proposed procedure, KerNel Iterative Feature Extraction (KNIFE), provides a general framework for incorporating variable selection into kernel machine methods. In this article, we focus on quantitative traits and relatively common SNPs, and adapt the KNIFE procedure to genetic association studies and propose an approach to identify driver SNPs after the application of SKAT to gene set analysis. Our approach accommodates several kernels that are widely used in SNP analysis, such as the linear kernel and the Identity by State (IBS) kernel. The proposed approach provides practically useful utilities to prioritize SNPs, and fills the gap between SNP set analysis and biological functional studies. Both simulation studies and real data application are used to demonstrate the proposed approach. © 2016 WILEY PERIODICALS, INC.
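The two kernels named in the abstract can be illustrated with a short sketch. It assumes genotypes are coded as minor-allele counts (0, 1 or 2) and uses the common unweighted definitions: the linear kernel is the inner product of two genotype vectors, and the IBS kernel is the average proportion of alleles shared identical by state across SNPs. The function names are hypothetical, and any SNP weighting used in the article is omitted here.

```python
def linear_kernel(genotypes):
    """K[i][j] = sum over SNPs of g_i * g_j (genotypes coded 0/1/2)."""
    return [[sum(a * b for a, b in zip(gi, gj)) for gj in genotypes]
            for gi in genotypes]


def ibs_kernel(genotypes):
    """K[i][j] = sum over SNPs of (2 - |g_i - g_j|), divided by 2 * n_snps."""
    n_snps = len(genotypes[0])
    return [[sum(2 - abs(a - b) for a, b in zip(gi, gj)) / (2.0 * n_snps)
             for gj in genotypes]
            for gi in genotypes]


# Three hypothetical subjects genotyped at three SNPs.
G = [[0, 1, 2],
     [0, 1, 2],
     [2, 1, 0]]
K_lin = linear_kernel(G)
K_ibs = ibs_kernel(G)
```

Subjects 1 and 2 have identical genotypes, so their IBS similarity is 1.0, while subjects 1 and 3 share alleles only at the heterozygous SNP and score 1/3. Either matrix could then be passed to a kernel machine test as the similarity between subjects.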
Cho, Young-Il; Ahn, Yul-Kyun; Tripathi, Swati; Kim, Jeong-Ho; Lee, Hye-Eun; Kim, Do-Sun
2015-01-01
Numerous studies using single nucleotide polymorphisms (SNPs) have been conducted in humans and other animals, and in major crops, including rice, soybean, and Chinese cabbage. However, the number of SNP studies in cabbage is limited. In the present study, we evaluated whether 7,645 SNPs previously identified as molecular markers linked to disease resistance in the Brassica rapa genome could be applied to B. oleracea. In a BLAST analysis using the SNP sequences of B. rapa and B. oleracea genomic sequence data registered in the NCBI database, 256 genes for which SNPs had been identified in B. rapa were found in B. oleracea. These genes were classified into three functional groups: molecular function (64 genes), biological process (96 genes), and cellular component (96 genes). A total of 693 SNP markers, including 145 SNP markers [BRH: developed from the B. rapa genome for high-resolution melt (HRM) analysis], 425 SNP markers (BRP: based on the B. rapa genome that could be applied to B. oleracea), and 123 new SNP markers (BRS: derived from BRP and designed for HRM analysis), were investigated for their ability to amplify sequences from cabbage genomic DNA. In total, 425 of the SNP markers (BRP: based on the B. rapa genome), selected from 7,645 SNPs, were successfully applied to B. oleracea. Using PCR, 108 of 145 BRH (74.5%), 415 of 425 BRP (97.6%), and 118 of 123 BRS (95.9%) showed amplification, suggesting that it is possible to apply SNP markers developed based on the B. rapa genome to B. oleracea. These results provide valuable information that can be utilized in cabbage genetics and breeding programs using molecular markers derived from other Brassica species. PMID:25790283
Zhang, Zhaowei; Li, Peiwu; Hu, Xiaofeng; Zhang, Qi; Ding, Xiaoxia; Zhang, Wen
2012-01-01
Chemical contaminants in food have caused serious health issues in both humans and animals. Microarray technology is an advanced technique suitable for the analysis of chemical contaminants. In particular, the immuno-microarray approach is one of the most promising methods for chemical contaminant analysis. The use of microarrays for the analysis of chemical contaminants is the subject of this review. Fabrication strategies and detection methods for chemical contaminants are discussed in detail. Application to the analysis of mycotoxins, biotoxins, pesticide residues, and pharmaceutical residues is also described. Finally, future challenges and opportunities are discussed.
Liu, Kaihua; Zhang, Bin; Teng, Zhaochun; Wang, Youtao; Dong, Guodong; Xu, Cong; Qin, Bo; Song, Chunlian; Chai, Jun; Li, Yang; Shi, Xianwei; Shu, Xianghua; Zhang, Yifang
2017-03-01
We investigated the associations between SLC11A1 polymorphisms and susceptibility to tuberculosis (TB) in Chinese Holstein cattle, using a case-control study of 136 animals that had positive reactions to TB tests and showed symptoms and 96 animals that had negative reactions to tests and showed no symptoms. Polymerase chain reaction (PCR) sequencing and the restriction fragment length polymorphism (RFLP) technique were used to detect and determine SLC11A1 polymorphisms. Association analysis identified significant correlations between SLC11A1 polymorphisms and susceptibility/resistance to TB, and two genetic markers for SLC11A1 were established using PCR-RFLP. Sequence alignment of SLC11A1 revealed seven single-nucleotide polymorphisms (SNPs). This is the first report of MaeII PCR-RFLP markers for the SLC11A1-SNP3 site and PstI PCR-RFLP markers for the SLC11A1-SNP5 and SLC11A1-SNP6 sites in Chinese Holstein cattle. Logistic regression analysis indicated that SLC11A1-SNP1, SLC11A1-SNP3, and SLC11A1-SNP5 were significantly associated with susceptibility/resistance to TB. Two genotypes of SLC11A1-SNP3 were susceptible to TB, whereas one genotype of SLC11A1-SNP1 and two genotypes of SLC11A1-SNP5 were resistant. Haplotype analysis showed that nine haplotypes were potentially resistant to TB. After Bonferroni correction, three of the haplotypes remained significantly associated with TB resistance. SLC11A1 is a useful candidate gene related to TB in Chinese Holstein cattle. Copyright © 2016 Elsevier Ltd. All rights reserved.
Laing, Chad R; Buchanan, Cody; Taboada, Eduardo N; Zhang, Yongxiang; Karmali, Mohamed A; Thomas, James E; Gannon, Victor Pj
2009-06-29
Many approaches have been used to study the evolution, population structure and genetic diversity of Escherichia coli O157:H7; however, observations made with different genotyping systems are not easily relatable to each other. Three genetic lineages of E. coli O157:H7 designated I, II and I/II have been identified using octamer-based genome scanning and microarray comparative genomic hybridization (mCGH). Each lineage contains significant phenotypic differences, with lineage I strains being the most commonly associated with human infections. Similarly, a clade of hyper-virulent O157:H7 strains implicated in the 2006 spinach and lettuce outbreaks has been defined using single-nucleotide polymorphism (SNP) typing. In this study an in silico comparison of six different genotyping approaches was performed on 19 E. coli genome sequences from 17 O157:H7 strains and single O145:NM and K12 MG1655 strains to provide an overall picture of diversity of the E. coli O157:H7 population, and to compare genotyping methods for O157:H7 strains. In silico determination of lineage, Shiga-toxin bacteriophage integration site, comparative genomic fingerprint, mCGH profile, novel region distribution profile, SNP type and multi-locus variable number tandem repeat analysis type was performed and a supernetwork based on the combination of these methods was produced. This supernetwork showed three distinct clusters of strains that were O157:H7 lineage-specific, with the SNP-based hyper-virulent clade 8 synonymous with O157:H7 lineage I/II. Lineage I/II/clade 8 strains clustered closest on the supernetwork to E. coli K12 and E. coli O55:H7, O145:NM and sorbitol-fermenting O157 strains. The results of this study highlight the similarities in relationships derived from multi-locus genome sampling methods and suggest a "common genotyping language" may be devised for population genetics and epidemiological studies. 
Future genotyping methods should provide data that can be stored centrally and accessed locally in an easily transferable, informative and extensible format based on comparative genomic analyses.
Xu, Qing; Mei, Gui; Sun, Dongxiao; Zhang, Qin; Zhang, Yuan; Yin, Cengceng; Chen, Huiyong; Ding, Xiangdong; Liu, Jianfeng
2012-11-02
We previously localized a quantitative trait locus (QTL) on bovine chromosome 6 affecting milk production traits to a 1.5-Mb region between BMS483 and MNB-209 via genome scanning followed by fine mapping. In total, 15 genes were mapped within this linkage region through bioinformatic analysis of the cattle-human comparative map and bovine genome assembly. Of them, the UDP-glucose dehydrogenase (UGDH) gene was suggested as a potential positional candidate gene for milk production traits based on its corresponding physiological and biochemical functions and genetic effects. By sequencing all the coding exons and the untranslated regions in UGDH with pooled DNA of 8 sires representing the separate families detected in our previous studies, a total of ten SNPs were identified and genotyped in 1417 Holstein cows from the 8 families. Individual SNP-based association analysis revealed 4 significant associations of SNP Ex1-1, SNP Int3-1, SNP Int5-1, and SNP Ex12-3 with milk yield (P < 0.05), and 2 significant associations of SNP Ex1-1 and SNP Ex12-3 with protein yield (P < 0.05). Furthermore, our haplotype-based association analyses indicated that haplotypes G-C-C, formed by SNP Ex12-2-SNP Int11-1-SNP Ex11-1, T-G, formed by SNP Int9-3-SNP Int9-2, and C-C, formed by SNP Int5-1-SNP Int3-1, are significantly associated with protein percentage (F=4.15; P=0.0418) and fat percentage (F=5.18~7.25; P=0.0072~0.0231). Finally, by using an in vitro expression assay, we demonstrated that the A allele of SNP Ex1-1 and the T allele of SNP Ex11-1 of UGDH significantly decrease the expression of UGDH by 68.0% at the RNA level and 50.1% at the protein level, suggesting that SNP Ex1-1 and Ex11-1 represent two functional polymorphisms affecting expression of UGDH and may have partly contributed to the observed association of the gene with milk production traits in our samples. 
Taken together, our findings strongly indicate that the UGDH gene could be involved in genetic variation underlying the QTL for milk production traits.
Contributions to Statistical Problems Related to Microarray Data
ERIC Educational Resources Information Center
Hong, Feng
2009-01-01
Microarray is a high throughput technology to measure gene expression. Analysis of microarray data brings many interesting and challenging problems. This thesis consists of three studies related to microarray data. First, we propose a Bayesian model for microarray data and use Bayes Factors to identify differentially expressed genes. Second, we…
Joint Identification of Genetic Variants for Physical Activity in Korean Population
Kim, Jayoun; Kim, Jaehee; Min, Haesook; Oh, Sohee; Kim, Yeonjung; Lee, Andy H.; Park, Taesung
2014-01-01
There has been limited research on genome-wide association with physical activity (PA). This study ascertained genetic associations between PA and 344,893 single nucleotide polymorphism (SNP) markers in 8842 Korean samples. PA data were obtained from a validated questionnaire that included information on PA intensity and duration. Metabolic equivalent of tasks were calculated to estimate the total daily PA level for each individual. In addition to single- and multiple-SNP association tests, a pathway enrichment analysis was performed to identify the biological significance of SNP markers. Although no significant SNP was found at genome-wide significance level via single-SNP association tests, 59 genetic variants mapped to 76 genes were identified via a multiple SNP approach using a bootstrap selection stability measure. Pathway analysis for these 59 variants showed that maturity onset diabetes of the young (MODY) was enriched. Joint identification of SNPs could enable the identification of multiple SNPs with good predictive power for PA and a pathway enriched for PA. PMID:25026172
Clevert, Djork-Arné; Mitterecker, Andreas; Mayr, Andreas; Klambauer, Günter; Tuefferd, Marianne; De Bondt, An; Talloen, Willem; Göhlmann, Hinrich; Hochreiter, Sepp
2011-07-01
Cost-effective oligonucleotide genotyping arrays like the Affymetrix SNP 6.0 are still the predominant technique to measure DNA copy number variations (CNVs). However, CNV detection methods for microarrays overestimate both the number and the size of CNV regions and, consequently, suffer from a high false discovery rate (FDR). A high FDR means that many CNVs are wrongly detected and therefore not associated with a disease in a clinical study, though correction for multiple testing takes them into account and thereby decreases the study's discovery power. For controlling the FDR, we propose a probabilistic latent variable model, 'cn.FARMS', which is optimized by a Bayesian maximum a posteriori approach. cn.FARMS controls the FDR through the information gain of the posterior over the prior. The prior represents the null hypothesis of copy number 2 for all samples from which the posterior can only deviate by strong and consistent signals in the data. On HapMap data, cn.FARMS clearly outperformed the two most prevalent methods with respect to sensitivity and FDR. The software cn.FARMS is publicly available as an R package at http://www.bioinf.jku.at/software/cnfarms/cnfarms.html.
snpTree--a web-server to identify and construct SNP trees from whole genome sequence data.
Leekitcharoenphon, Pimlapas; Kaas, Rolf S; Thomsen, Martin Christen Frølund; Friis, Carsten; Rasmussen, Simon; Aarestrup, Frank M
2012-01-01
Advances in whole genome sequencing (WGS), and its decreasing cost, will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the successful and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs including various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct a phylogenetic tree from WGS data. Here we introduce snpTree, a server for automatic online SNP analysis. This tool is composed of different SNP analysis suites and Perl and Python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on position on the reference genome and a tree is constructed from concatenated SNPs using FastTree and a Perl script. The online server was implemented using HTML, Java and Python scripts. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. In the latter case the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree. The snpTree server is an easy to use option for rapid standardised and automatic SNP analysis in epidemiological studies also for users with limited bioinformatic experience. The web server is freely accessible at http://www.cbs.dtu.dk/services/snpTree-1.0/.
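The step in which SNPs are concatenated by reference position can be pictured with a small sketch. This is not the snpTree code: the function name, the 0-based position convention, and the choice to fill in the reference base where an isolate has no SNP call are assumptions made for illustration, since the abstract does not say how such positions are handled.

```python
def concatenate_snps(reference, calls):
    """Build one pseudo-sequence per isolate over all variable positions.

    reference: reference genome sequence as a string
    calls:     {isolate_name: {position (0-based): alternate base}}
    Returns the sorted list of variable positions and a dict mapping each
    isolate to its concatenated SNP string. Positions where an isolate has
    no call are filled with the reference base (an illustrative assumption).
    """
    positions = sorted({pos for snps in calls.values() for pos in snps})
    alignment = {
        name: "".join(snps.get(pos, reference[pos]) for pos in positions)
        for name, snps in calls.items()
    }
    return positions, alignment


# Hypothetical reference and SNP calls for three isolates.
reference = "ACGTACGT"
calls = {
    "iso1": {1: "T", 5: "A"},
    "iso2": {5: "A"},
    "iso3": {3: "C"},
}
positions, alignment = concatenate_snps(reference, calls)
```

The resulting equal-length strings form a SNP alignment that could be written out in FASTA format and passed to a tree-building program such as FastTree, which is the tool the abstract names for tree construction.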
Microarray platform for omics analysis
NASA Astrophysics Data System (ADS)
Mecklenburg, Michael; Xie, Bin
2001-09-01
Microarray technology has revolutionized genetic analysis. However, limitations in genome analysis have led to renewed interest in establishing 'omic' strategies. As we enter the post-genomic era, new microarray technologies are needed to address these new classes of 'omic' targets, such as proteins, as well as lipids and carbohydrates. We have developed a microarray platform that combines self-assembling monolayers with the biotin-streptavidin system to provide a robust, versatile immobilization scheme. A hydrophobic film is patterned on the surface, creating an array of tension wells that eliminates evaporation effects, thereby reducing the shear stress to which biomolecules are exposed during immobilization. The streptavidin linker layer makes it possible to adapt and/or develop microarray-based assays using virtually any class of biomolecules, including carbohydrates, peptides, antibodies and receptors, as well as the more traditional DNA-based arrays. Our microarray technology is designed to furnish seamless compatibility across the various 'omic' platforms by providing a common blueprint for fabricating and analyzing arrays. The prototype microarray uses a microscope slide footprint patterned with 2 by 96 flat wells. Data on the microarray platform will be presented.
Microarrays in brain research: the good, the bad and the ugly.
Mirnics, K
2001-06-01
Making sense of microarray data is a complex process, in which the interpretation of findings will depend on the overall experimental design and judgement of the investigator performing the analysis. As a result, differences in tissue harvesting, microarray types, sample labelling and data analysis procedures make post hoc sharing of microarray data a great challenge. To ensure rapid and meaningful data exchange, we need to create some order out of the existing chaos. In these ground-breaking microarray standardization and data sharing efforts, NIH agencies should take a leading role.
Trivedi, Prinal; Edwards, Jode W; Wang, Jelai; Gadbury, Gary L; Srinivasasainagendra, Vinodh; Zakharkin, Stanislav O; Kim, Kyoungmi; Mehta, Tapan; Brand, Jacob P L; Patki, Amit; Page, Grier P; Allison, David B
2005-04-06
Many efforts in microarray data analysis are focused on providing tools and methods for the qualitative analysis of microarray data. HDBStat! (High-Dimensional Biology-Statistics) is a software package designed for analysis of high dimensional biology data such as microarray data. It was initially developed for the analysis of microarray gene expression data, but it can also be used for some applications in proteomics and other aspects of genomics. HDBStat! provides statisticians and biologists a flexible and easy-to-use interface to analyze complex microarray data using a variety of methods for data preprocessing, quality control analysis and hypothesis testing. Results generated from data preprocessing methods, quality control analysis and hypothesis testing methods are output in the form of Excel CSV tables, graphs and an Html report summarizing data analysis. HDBStat! is a platform-independent software that is freely available to academic institutions and non-profit organizations. It can be downloaded from our website http://www.soph.uab.edu/ssg_content.asp?id=1164.
Biological relevance of CNV calling methods using familial relatedness including monozygotic twins.
Castellani, Christina A; Melka, Melkaye G; Wishart, Andrea E; Locke, M Elizabeth O; Awamleh, Zain; O'Reilly, Richard L; Singh, Shiva M
2014-04-21
Studies involving the analysis of structural variation including Copy Number Variation (CNV) have recently exploded in the literature. Furthermore, CNVs have been associated with a number of complex diseases and neurodevelopmental disorders. Common methods for CNV detection use SNP, CNV, or CGH arrays, where the signal intensities of consecutive probes are used to define the number of copies associated with a given genomic region. These practices pose a number of challenges that interfere with the ability of available methods to accurately call CNVs. It has, therefore, become necessary to develop experimental protocols to test the reliability of CNV calling methods from microarray data so that researchers can properly discriminate biologically relevant data from noise. We have developed a workflow for the integration of data from multiple CNV calling algorithms using the same array results. It uses four CNV calling programs: PennCNV (PC), Affymetrix® Genotyping Console™ (AGC), Partek® Genomics Suite™ (PGS) and Golden Helix SVS™ (GH) to analyze CEL files from the Affymetrix® Human SNP 6.0 Array™. To assess the relative suitability of each program, we used individuals of known genetic relationships. We found significant differences in CNV calls obtained by different CNV calling programs. Although the programs showed variable patterns of CNVs in the same individuals, their distribution in individuals of different degrees of genetic relatedness has allowed us to offer two suggestions. The first involves the use of multiple algorithms for the detection of the largest possible number of CNVs, and the second suggests the use of PennCNV over all other methods when the use of only one software program is desirable.
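One way to compare calls from several CNV programs on the same array is to count how many programs support each call. The sketch below is illustrative only: the abstract does not describe the integration rule, so the 50% reciprocal-overlap threshold, the tuple layout `(chrom, start, end, kind)` and the function names are assumptions.

```python
from typing import Dict, List, Tuple

Call = Tuple[str, int, int, str]  # (chromosome, start, end, "del" or "dup")


def reciprocal_overlap(a: Tuple[int, int], b: Tuple[int, int]) -> float:
    """Smaller of the two fractions of each interval covered by the intersection."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    if inter <= 0:
        return 0.0
    return min(inter / (a[1] - a[0]), inter / (b[1] - b[0]))


def support_counts(calls_by_program: Dict[str, List[Call]],
                   min_overlap: float = 0.5) -> List[Tuple[str, Call, int]]:
    """For every call, count the programs (including its own) that report a matching CNV."""
    results = []
    for program, calls in calls_by_program.items():
        for call in calls:
            supporters = {program}
            for other, other_calls in calls_by_program.items():
                if other == program:
                    continue
                # A match needs the same chromosome, same CNV type and enough overlap.
                if any(c[0] == call[0] and c[3] == call[3]
                       and reciprocal_overlap(call[1:3], c[1:3]) >= min_overlap
                       for c in other_calls):
                    supporters.add(other)
            results.append((program, call, len(supporters)))
    return results


calls = {
    "PC": [("chr1", 100, 200, "del")],
    "AGC": [("chr1", 120, 210, "del")],
    "GH": [("chr1", 500, 600, "dup")],
}
support = support_counts(calls)
print(support)
```

Calls with a support count of 1 are seen by a single program only; whether to keep them (to maximize the number of detected CNVs) or drop them (to favor agreement) is a choice left to the analyst.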
SNP-RFLPing 2: an updated and integrated PCR-RFLP tool for SNP genotyping.
Chang, Hsueh-Wei; Cheng, Yu-Huei; Chuang, Li-Yeh; Yang, Cheng-Hong
2010-04-08
PCR-restriction fragment length polymorphism (RFLP) assay is a cost-effective method for SNP genotyping and mutation detection, but the manual mining for restriction enzyme sites is challenging and cumbersome. Three years after we constructed SNP-RFLPing, a freely accessible database and analysis tool for restriction enzyme mining of SNPs, significant improvements over the 2006 version have been made and incorporated into the latest version, SNP-RFLPing 2. The primary aim of SNP-RFLPing 2 is to provide comprehensive PCR-RFLP information with multiple functionality about SNPs, such as SNP retrieval to multiple species, different polymorphism types (bi-allelic, tri-allelic, tetra-allelic or indels), gene-centric searching, HapMap tagSNPs, gene ontology-based searching, miRNAs, and SNP500Cancer. The RFLP restriction enzymes and the corresponding PCR primers for the natural and mutagenic types of each SNP are simultaneously analyzed. All the RFLP restriction enzyme prices are also provided to aid selection. Furthermore, the previously encountered updating problems for most SNP related databases are resolved by an on-line retrieval system. The user interfaces for functional SNP analyses have been substantially improved and integrated. SNP-RFLPing 2 offers a new and user-friendly interface for RFLP genotyping that can be used in association studies and is freely available at http://bio.kuas.edu.tw/snp-rflping2.
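The core idea of restriction enzyme mining for a SNP can be sketched as follows: an enzyme is informative when its recognition site occurs a different number of times depending on the allele. This is a simplified illustration, not the SNP-RFLPing 2 algorithm. It assumes non-degenerate, palindromic recognition sites, ignores mutagenic primers, and the function name and input layout are hypothetical.

```python
from typing import Dict, Iterable


def discriminating_enzymes(flank5: str, alleles: Iterable[str], flank3: str,
                           enzymes: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Return enzymes whose site count differs between alleles of one SNP.

    `enzymes` maps an enzyme name to its recognition sequence.
    """
    informative = {}
    for name, site in enzymes.items():
        # Count recognition sites in the amplicon carrying each allele.
        counts = {a: (flank5 + a + flank3).upper().count(site.upper()) for a in alleles}
        if len(set(counts.values())) > 1:
            informative[name] = counts
    return informative


# EcoRI cuts at GAATTC and HindIII at AAGCTT.
enzymes = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}
hits = discriminating_enzymes("GGAATT", ["C", "T"], "AGG", enzymes)
print(hits)
```

Here only the C allele completes the EcoRI site, so digestion of the PCR product would distinguish the two alleles by fragment pattern.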
ArrayNinja: An Open Source Platform for Unified Planning and Analysis of Microarray Experiments.
Dickson, B M; Cornett, E M; Ramjan, Z; Rothbart, S B
2016-01-01
Microarray-based proteomic platforms have emerged as valuable tools for studying various aspects of protein function, particularly in the field of chromatin biochemistry. Microarray technology itself is largely unrestricted in regard to printable material and platform design, and efficient multidimensional optimization of assay parameters requires fluidity in the design and analysis of custom print layouts. This motivates the need for streamlined software infrastructure that facilitates the combined planning and analysis of custom microarray experiments. To this end, we have developed ArrayNinja as a portable, open source, and interactive application that unifies the planning and visualization of microarray experiments and provides maximum flexibility to end users. Array experiments can be planned, stored to a private database, and merged with the imaged results for a level of data interaction and centralization that is not currently attainable with available microarray informatics tools. © 2016 Elsevier Inc. All rights reserved.
SNP-Based Typing: A Useful Tool to Study Bordetella pertussis Populations
van der Heide, Han G. J.; Heuvelman, Kees J.; Kallonen, Teemu; He, Qiushui; Mertsola, Jussi; Advani, Abdolreza; Hallander, Hans O.; Janssens, Koen; Hermans, Peter W.; Mooi, Frits R.
2011-01-01
To monitor changes in Bordetella pertussis populations, mainly two typing methods are used; Pulsed-Field Gel Electrophoresis (PFGE) and Multiple-Locus Variable-Number Tandem Repeat Analysis (MLVA). In this study, a single nucleotide polymorphism (SNP) typing method, based on 87 SNPs, was developed and compared with PFGE and MLVA. The discriminatory indices of SNP typing, PFGE and MLVA were found to be 0.85, 0.95 and 0.83, respectively. Phylogenetic analysis, using SNP typing as Gold Standard, revealed false homoplasies in the PFGE and MLVA trees. Further, in contrast to the SNP-based tree, the PFGE- and MLVA-based trees did not reveal a positive correlation between root-to-tip distance and the isolation year of strains. Thus PFGE and MLVA do not allow an estimation of the relative age of the selected strains. In conclusion, SNP typing was found to be phylogenetically more informative than PFGE and more discriminative than MLVA. Further, in contrast to PFGE, it is readily standardized allowing interlaboratory comparisons. We applied SNP typing to study strains with a novel allele for the pertussis toxin promoter, ptxP3, which have a worldwide distribution and which have replaced the resident ptxP1 strains in the last 20 years. Previously, we showed that ptxP3 strains showed increased pertussis toxin expression and that their emergence was associated with increased notification in the Netherlands. SNP typing showed that the ptxP3 strains isolated in the Americas, Asia, Australia and Europe formed a monophyletic branch which recently diverged from ptxP1 strains. Two predominant ptxP3 SNP types were identified which spread worldwide. The widespread use of SNP typing will enhance our understanding of the evolution and global epidemiology of B. pertussis. PMID:21647370
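A discriminatory index of the kind reported above is commonly computed with the Hunter-Gaston formulation of Simpson's index of diversity, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of strains of type j and N the total number of strains. The abstract does not state which formulation was used, so the sketch below is an assumption, and the example type labels are made up.

```python
from collections import Counter
from typing import Hashable, Sequence


def discriminatory_index(types: Sequence[Hashable]) -> float:
    """Probability that two strains drawn without replacement have different types."""
    n = len(types)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(types)
    # Number of ordered pairs of strains that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


example = ["A", "A", "B", "C"]  # hypothetical type assignments
print(round(discriminatory_index(example), 4))
```

A value of 0 means every strain has the same type, and 1 means every strain has a unique type.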
WebArray: an online platform for microarray data analysis
Xia, Xiaoqin; McClelland, Michael; Wang, Yipeng
2005-01-01
Background Many cutting-edge microarray analysis tools and algorithms, including commonly used limma and affy packages in Bioconductor, need sophisticated knowledge of mathematics, statistics and computer skills for implementation. Commercially available software can provide a user-friendly interface at considerable cost. To facilitate the use of these tools for microarray data analysis on an open platform we developed an online microarray data analysis platform, WebArray, for bench biologists to utilize these tools to explore data from single/dual color microarray experiments. Results The currently implemented functions were based on limma and affy package from Bioconductor, the spacings LOESS histogram (SPLOSH) method, PCA-assisted normalization method and genome mapping method. WebArray incorporates these packages and provides a user-friendly interface for accessing a wide range of key functions of limma and others, such as spot quality weight, background correction, graphical plotting, normalization, linear modeling, empirical bayes statistical analysis, false discovery rate (FDR) estimation, chromosomal mapping for genome comparison. Conclusion WebArray offers a convenient platform for bench biologists to access several cutting-edge microarray data analysis tools. The website is freely available at . It runs on a Linux server with Apache and MySQL. PMID:16371165
Phetsuksiri, Benjawan; Srisungngam, Sopa; Rudeeaneksin, Janisara; Bunchoo, Supranee; Lukebua, Atchariya; Wongtrungkapun, Ruch; Paitoon, Soontara; Sakamuri, Rama Murthy; Brennan, Patrick J; Vissa, Varalakshmi
2012-01-01
Based on the discovery of three single nucleotide polymorphisms (SNPs) in Mycobacterium leprae, it has been previously reported that there are four major SNP types associated with different geographic regions around the world. Another typing system for global differentiation of M. leprae is the analysis of the variable number of short tandem repeats within the rpoT gene. To expand the analysis of geographic distribution of M. leprae, classified by SNP and rpoT gene polymorphisms, we studied 85 clinical isolates from Thai patients and compared the findings with those reported from Asian isolates. SNP genotyping by PCR amplification and sequencing revealed that all strains like those in Myanmar were SNP type 1 and 3, with the former being predominant, while in Japan, Korea, and Indonesia, the SNP type 3 was found to be more frequent. The pattern of M. leprae distribution in Thailand and Myanmar is quite similar, except that SNP type 2 was not found in Thailand. In addition, the 3-copy hexamer genotype in the rpoT gene is shared among the isolates from these two neighboring countries. On the basis of these two markers, we postulate that M. leprae in leprosy patients from Myanmar and Thailand has a common historical origin. Further differentiation among Thai isolates was possible by assessing copy numbers of the TTC sequence, a more polymorphic microsatellite locus.
Sabiel, Salih A I; Huang, Sisi; Hu, Xin; Ren, Xifeng; Fu, Chunjie; Peng, Junhua; Sun, Dongfa
2017-03-01
In the present study, 150 accessions of durum wheat germplasm (Triticum turgidum ssp. durum) of worldwide origin were observed for major seedling traits and their growth. The accessions were evaluated for major seedling traits under controlled hydroponic conditions at the 13th, 20th, 27th and 34th day after germination. Biomass traits were measured at the 34th day after germination. Correlation analysis was conducted among the seedling traits and three field traits at maturity (plant height, grain weight and 1000-grain weight) observed in four consecutive years. Associations of the measured seedling traits and SNP markers were analyzed based on the mixed linear model (MLM). The results indicated that highly significant genetic variation and robust heritability were found for the seedling and mature field traits. In total, 259 significant associations were detected for all the traits and four growth stages. The phenotypic variation explained (R²) by a single SNP marker is higher than 10% for most (84%) of the significant SNP markers. Forty-six SNP markers were associated with multiple traits, indicating non-negligible pleiotropy at the seedling stage. The associated SNP markers could be helpful for genetic analysis of seedling traits, and marker-assisted breeding of new wheat varieties with strong seedling vigor.
SNPConvert: SNP Array Standardization and Integration in Livestock Species.
Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra
2016-06-09
One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to what is observed in whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present the SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git.
Nielsen, Peter B; Petersen, Maja S; Ystaas, Viviana; Andersen, Rolf V; Hansen, Karin M; Blaabjerg, Vibeke; Refstrup, Mette
2012-10-01
Classical hereditary hemochromatosis involves the HFE gene and diagnostic analysis of the DNA variants HFE p.C282Y (c.845G>A; rs1800562) and HFE p.H63D (c.187C>G; rs1799945). The affected protein alters iron homeostasis, resulting in iron overload in various tissues. The aim of this study was to validate the TaqMan-based Sample-to-SNP protocol for the analysis of the HFE p.C282Y and p.H63D variants with regard to accuracy, usefulness and reproducibility compared to an existing SNP protocol. The Sample-to-SNP protocol uses an approach where the DNA template is made accessible from a cell lysate, followed by TaqMan analysis. Besides the HFE SNPs, eight other SNPs were used as well. These SNPs were: Coagulation factor II-gene F2 c.20210G>A, Coagulation factor V-gene F5 p.R506Q (c.1517G>A; rs121917732), Mitochondria SNP: mt7028 G>A, Mitochondria SNP: mt12308 A>G, Proprotein convertase subtilisin/kexin type 9-gene PCSK9 p.R46L (c.137G>T), Glutathione S-transferase pi 1-gene GSTP1 p.I105V (c.313A>G; rs1695), LXR g.-171 A>G, ZNF202 g.-118 G>T. In conclusion, the Sample-to-SNP kit proved to be an accurate, reliable, robust, easy-to-use and rapid TaqMan-based SNP detection protocol, which could be quickly implemented in a routine diagnostic or research facility. Copyright © 2012. Published by Elsevier B.V.
SNPhylo: a pipeline to construct a phylogenetic tree from huge SNP data.
Lee, Tae-Ho; Guo, Hui; Wang, Xiyin; Kim, Changsoo; Paterson, Andrew H
2014-02-26
Phylogenetic trees are widely used for genetic and evolutionary studies in various organisms. Advanced sequencing technology has dramatically enriched data available for constructing phylogenetic trees based on single nucleotide polymorphisms (SNPs). However, massive SNP data makes it difficult to perform reliable analysis, and there has been no ready-to-use pipeline to generate phylogenetic trees from these data. We developed a new pipeline, SNPhylo, to construct phylogenetic trees based on large SNP datasets. The pipeline may enable users to construct a phylogenetic tree from three representative SNP data file formats. In addition, in order to increase reliability of a tree, the pipeline has steps such as removing low quality data and considering linkage disequilibrium. A maximum likelihood method for the inference of phylogeny is also adopted in generation of a tree in our pipeline. Using SNPhylo, users can easily produce a reliable phylogenetic tree from a large SNP data file. Thus, this pipeline can help a researcher focus more on interpretation of the results of analysis of voluminous data sets, rather than manipulations necessary to accomplish the analysis.
Singh, Amit Kumar; Kumar, Sundeep; Srinivasan, Kalyani; Tyagi, R. K.; Singh, N. K.; Singh, Rakesh
2013-01-01
Simple sequence repeat (SSR) and Single Nucleotide Polymorphic (SNP), the two most robust markers for identifying rice varieties were compared for assessment of genetic diversity and population structure. Total 375 varieties of rice from various regions of India archived at the Indian National GeneBank, NBPGR, New Delhi, were analyzed using thirty six genetic markers, each of hypervariable SSR (HvSSR) and SNP which were distributed across 12 rice chromosomes. A total of 80 alleles were amplified with the SSR markers with an average of 2.22 alleles per locus whereas, 72 alleles were amplified with SNP markers. Polymorphic information content (PIC) values for HvSSR ranged from 0.04 to 0.5 with an average of 0.25. In the case of SNP markers, PIC values ranged from 0.03 to 0.37 with an average of 0.23. Genetic relatedness among the varieties was studied; utilizing an unrooted tree all the genotypes were grouped into three major clusters with both SSR and SNP markers. Analysis of molecular variance (AMOVA) indicated that maximum diversity was partitioned between and within individual level but not between populations. Principal coordinate analysis (PCoA) with SSR markers showed that genotypes were uniformly distributed across the two axes with 13.33% of cumulative variation whereas, in case of SNP markers varieties were grouped into three broad groups across two axes with 45.20% of cumulative variation. Population structure were tested using K values from 1 to 20, but there was no clear population structure, therefore Ln(PD) derived Δk was plotted against the K to determine the number of populations. In case of SSR maximum Δk was at K=5 whereas, in case of SNP maximum Δk was found at K=15, suggesting that resolution of population was higher with SNP markers, but SSR were more efficient for diversity analysis. PMID:24367635
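The polymorphic information content (PIC) values quoted above can be illustrated with the widely used formula PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², where p_i are the allele frequencies at a locus. The abstract does not say which formula the authors applied, so this is an assumption; the function name is hypothetical.

```python
from itertools import combinations
from typing import Sequence


def pic(freqs: Sequence[float]) -> float:
    """Polymorphic information content of one locus from its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity.
    homozygosity = sum(p * p for p in freqs)
    # Correction for matings in which offspring genotypes are uninformative.
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


print(pic([0.5, 0.5]))   # biallelic locus with equal frequencies
print(pic([0.9, 0.1]))   # biallelic locus with a rare allele
```

Under this formula a biallelic marker cannot exceed 0.375, which is consistent with the upper SNP PIC value of 0.37 reported in the abstract, whereas multi-allelic markers can score higher.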
Liu, X; Guo, X Y; Xu, X Z; Wu, M; Zhang, X; Li, Q; Ma, P P; Zhang, Y; Wang, C Y; Geng, F J; Qin, C H; Liu, L; Shi, W H; Wang, Y C; Yu, Y
2012-08-16
DNA methylation is essential for adipose deposition in mammals. We screened SNPs of the bovine DNA methyltransferase 3b (DNMT3b) gene in Snow Dragon beef, a commercial beef cattle population in China. Nine SNPs were found in the population and three of six novel SNPs were chosen for genotyping and analyzing a possible association with 16 meat quality traits. The frequencies of the alleles and genotypes of the three SNPs in Snow Dragon beef were similar to those in their terminal-paternal breed, Wagyu. Association analysis disclosed that SNP1 was not associated with any of the traits; SNP2 was significantly associated with lean meat color score and chuck short rib score, and SNP3 had a significant effect on dressing percentage and back-fat thickness in the beef population. The individuals with genotype GG for SNP2 had a 25.7% increase in lean meat color score and a 146% increase in chuck short rib score, compared with genotype AA. The cattle with genotype AG for SNP3 had 35.7 and 24% increases in dressing percentage and 28.8 and 29.2% increases in back-fat thickness, compared with genotypes GG and AA, respectively. Genotypic combination analysis revealed significant interactions between SNP1 and SNP2 and between SNP2 and SNP3 for the traits rib-eye area and live weight. We conclude that there is considerable evidence that DNMT3b is a determiner of beef quality traits.
Issues in the analysis of oligonucleotide tiling microarrays for transcript mapping
NASA Technical Reports Server (NTRS)
Royce, Thomas E.; Rozowsky, Joel S.; Bertone, Paul; Samanta, Manoj; Stolc, Viktor; Weissman, Sherman; Snyder, Michael; Gerstein, Mark
2005-01-01
Traditional microarrays use probes complementary to known genes to quantitate the differential gene expression between two or more conditions. Genomic tiling microarray experiments differ in that probes that span a genomic region at regular intervals are used to detect the presence or absence of transcription. This difference means the same sets of biases and the methods for addressing them are unlikely to be relevant to both types of experiment. We introduce the informatics challenges arising in the analysis of tiling microarray experiments as open problems to the scientific community and present initial approaches for the analysis of this nascent technology.
Sharma, Pankaj; Gupta, Neerja; Chowdhury, Madhumita Roy; Sapra, Savita; Ghosh, Manju; Gulati, Sheffali; Kabra, Madhulika
2016-09-15
Intellectual disability (ID)/Global developmental delay (GDD) is a diverse group of disorders in terms of cognitive and non-cognitive functions and can occur with or without associated co-morbidities. It affects 1-3% of individuals globally and in at least 30-50% of cases the etiology remains unexplained. The widespread use of chromosomal microarray analysis (CMA) in a clinical setting has allowed the identification of submicroscopic copy number variations (CNVs), throughout the genome, associated with neurodevelopmental phenotypes including ID/GDD. In this study we investigated the utility of CMA in the detection of CNVs in 106 patients with unexplained ID/GDD and dysmorphism, with or without multiple congenital anomalies (MCA). The CMA study was carried out using Agilent 8×60K chips and Illumina Human CytoSNP-12 chips. Pathogenic CNVs were found in 15 (14.2%) patients. Among these, CNVs on a single chromosome were detected in 10 patients, while 5 patients showed co-occurring CNVs on two chromosomes. The size of these CNVs ranged from 322 kb to 13 Mb. The yield of pathogenic CNVs was similar for both mild and severe ID/GDD cases. One patient described in this paper is considered to harbour a likely pathogenic CNV with a deletion in the 17q22 region. Only a few cases of 17q22 deletion have been described in the literature, and the patient reported here was found to have an atypical deletion in the 17q22 region (Case 90). This study reaffirms the view that CMA is a powerful diagnostic tool in the evaluation of idiopathic ID/GDD patients irrespective of the degree of severity. Identifying pathogenic CNVs helps in counseling and prenatal diagnosis if desired. Copyright © 2016 Elsevier B.V. All rights reserved.
Gardner, Shea N.; Hall, Barry G.
2013-01-01
Effective use of rapid and inexpensive whole genome sequencing for microbes requires fast, memory efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole genome data. kSNP v2 has numerous improvements over kSNP v1 including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and user-friendly use; and a detailed User Guide. SNP discovery is based on k-mer analysis, and requires no multiple sequence alignment or the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The SNP-based trees that result are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein coding and other information (UTRs, etc.) from Genbank file(s). We demonstrate application of kSNP v2 on sets of viral and bacterial genomes, and discuss in detail analysis of a set of 68 finished E. coli and Shigella genomes and a set of the same genomes to which have been added 47 assemblies and four “raw read” genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS), and caused at least 50 deaths. PMID:24349125
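The alignment-free idea behind k-mer based SNP discovery can be sketched in a few lines: for an odd k, two genomes that share the same flanking bases around the central position of a k-mer but differ at that central base define a candidate SNP. This is a toy illustration under stated assumptions, not the kSNP v2 implementation: it ignores reverse complements, sequencing errors and memory efficiency, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple


def find_kmer_snps(genomes: Dict[str, str], k: int) -> Dict[Tuple[str, str], Dict[str, str]]:
    """Map (left flank, right flank) to {genome: central base} for variable loci."""
    if k % 2 == 0:
        raise ValueError("k must be odd so that the k-mer has a central base")
    half = k // 2
    loci = defaultdict(dict)  # (left, right) -> {genome: set of central bases}
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            key = (kmer[:half], kmer[half + 1:])
            loci[key].setdefault(name, set()).add(kmer[half])
    snps = {}
    for key, per_genome in loci.items():
        # Skip contexts that are ambiguous (more than one central base) within a genome.
        if any(len(bases) > 1 for bases in per_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(set(alleles.values())) > 1:
            snps[key] = alleles
    return snps


genomes = {"g1": "AACGTTG", "g2": "AACCTTG"}  # made-up sequences
snps = find_kmer_snps(genomes, k=5)
print(snps)
```

Because each locus is identified by its flanking context rather than by a coordinate, no reference genome or multiple sequence alignment is needed, which matches the property highlighted in the abstract.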
Vallejo, Roger L; Silva, Rafael M O; Evenhuis, Jason P; Gao, Guangtu; Liu, Sixin; Parsons, James E; Martin, Kyle E; Wiens, Gregory D; Lourenco, Daniela A L; Leeds, Timothy D; Palti, Yniv
2018-06-05
Previously, accurate genomic predictions for bacterial cold water disease (BCWD) resistance in rainbow trout were obtained using a medium-density single nucleotide polymorphism (SNP) array. Here, the impact of lower-density SNP panels on the accuracy of genomic predictions was investigated in a commercial rainbow trout breeding population. Using progeny performance data, the accuracy of genomic breeding values (GEBV) using 35K, 10K, 3K, 1K, 500, 300 and 200 SNP panels, as well as a panel with 70 quantitative trait loci (QTL)-flanking SNP, was compared. The GEBVs were estimated using the Bayesian method BayesB, single-step GBLUP (ssGBLUP) and weighted ssGBLUP (wssGBLUP). The accuracy of GEBVs remained high despite the sharp reductions in SNP density, and even with 500 SNP accuracy was higher than the pedigree-based prediction (0.50-0.56 versus 0.36). Furthermore, the prediction accuracy with the 70 QTL-flanking SNP (0.65-0.72) was similar to the panel with 35K SNP (0.65-0.71). Genomewide linkage disequilibrium (LD) analysis revealed strong LD (r² ≥ 0.25) spanning on average over 1 Mb across the rainbow trout genome. This long-range LD likely contributed to the accurate genomic predictions with the low-density SNP panels. Population structure analysis supported the hypothesis that long-range LD in this population may be caused by admixture. Results suggest that lower-cost, low-density SNP panels can be used for implementing genomic selection for BCWD resistance in rainbow trout breeding programs. © 2018 The Authors. This article is a U.S. Government work and is in the public domain in the USA. Journal of Animal Breeding and Genetics published by Blackwell Verlag GmbH.
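The LD statistic r² mentioned above can be approximated for unphased data as the squared Pearson correlation between the genotype dosages (0, 1 or 2 copies of one allele) at two SNPs. The abstract does not specify how r² was estimated, so the dosage-correlation approach, the function name and the example genotypes are assumptions made for illustration.

```python
import math
from typing import Sequence


def ld_r2(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between genotype dosages at two SNPs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long genotype vectors")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # r2 is undefined when either SNP is monomorphic in the sample.
        return math.nan
    return (sxy * sxy) / (sxx * syy)


snp_a = [0, 1, 2, 1]  # hypothetical dosages for four fish
snp_b = [0, 1, 2, 1]
snp_c = [0, 2, 0, 2]
print(ld_r2(snp_a, snp_b), ld_r2([0, 0, 2, 2], snp_c))
```

With such a function, a pair of SNPs would count as being in strong LD under the abstract's criterion when the returned value is at least 0.25.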
Zhao, Lan-Juan; Guo, Yan-Fang; Xiong, Dong-Hai; Xiao, Peng; Recker, Robert R; Deng, Hong-Wen
2006-11-01
In light of findings that osteoporosis and obesity may share some common genetic determination and previous reports that RANK (receptor activator of nuclear factor-kappaB) is expressed in skeletal muscles which are important for energy metabolism, we hypothesize that RANK, a gene essential for osteoclastogenesis, is also important for obesity. In order to test the hypothesis with solid data we first performed a linkage analysis around the RANK gene in 4,102 Caucasian subjects from 434 pedigrees, then we genotyped 19 SNPs in or around the RANK gene. A family-based association test (FBAT) was performed with both a quantitative measure of obesity [fat mass, lean mass, body mass index (BMI), and percentage fat mass (PFM)] and a dichotomously defined obesity phenotype, OB (OB if BMI ≥ 30 kg/m²). In the linkage analysis, an empirical P = 0.004 was achieved at the location of the RANK gene for BMI. Family-based association analysis revealed significant associations of eight SNPs with at least one obesity-related phenotype (P < 0.05). Evidence of association was obtained at SNP10 (P = 0.002) and SNP16 (P = 0.001) with OB; SNP1 with fat mass (P = 0.003); SNP1 (P = 0.003) and SNP7 (P = 0.003) with lean mass; SNP1 (P = 0.002) and SNP7 (P = 0.002) with BMI; SNP1 (P = 0.003), SNP4 (P = 0.007), and SNP7 (P = 0.002) with PFM. In order to deal with the complex multiple testing issues, we performed FBAT multi-marker test (FBAT-MM) to evaluate the association between all the 18 SNPs and each obesity phenotype. The P value is 0.126 for OB, 0.033 for fat mass, 0.021 for lean mass, 0.016 for BMI, and 0.006 for PFM. The haplotype data analyses provide further association evidence. In conclusion, for the first time, our results suggest that RANK is a novel candidate for determination of obesity.
Gong, Wei; He, Kun; Covington, Mike; Dinesh-Kumar, S. P.; Snyder, Michael; Harmer, Stacey L.; Zhu, Yu-Xian; Deng, Xing Wang
2009-01-01
We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to construct protein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and protein-protein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock-regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which can be validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale. PMID:19802365
Karyotype versus microarray testing for genetic abnormalities after stillbirth.
Reddy, Uma M; Page, Grier P; Saade, George R; Silver, Robert M; Thorsten, Vanessa R; Parker, Corette B; Pinar, Halit; Willinger, Marian; Stoll, Barbara J; Heim-Hall, Josefine; Varner, Michael W; Goldenberg, Robert L; Bukowski, Radek; Wapner, Ronald J; Drews-Botsch, Carolyn D; O'Brien, Barbara M; Dudley, Donald J; Levy, Brynn
2012-12-06
Genetic abnormalities have been associated with 6 to 13% of stillbirths, but the true prevalence may be higher. Unlike karyotype analysis, microarray analysis does not require live cells, and it detects small deletions and duplications called copy-number variants. The Stillbirth Collaborative Research Network conducted a population-based study of stillbirth in five geographic catchment areas. Standardized postmortem examinations and karyotype analyses were performed. A single-nucleotide polymorphism array was used to detect copy-number variants of at least 500 kb in placental or fetal tissue. Variants that were not identified in any of three databases of apparently unaffected persons were then classified into three groups: probably benign, clinical significance unknown, or pathogenic. We compared the results of karyotype and microarray analyses of samples obtained after delivery. In our analysis of samples from 532 stillbirths, microarray analysis yielded results more often than did karyotype analysis (87.4% vs. 70.5%, P<0.001) and provided better detection of genetic abnormalities (aneuploidy or pathogenic copy-number variants, 8.3% vs. 5.8%; P=0.007). Microarray analysis also identified more genetic abnormalities among 443 antepartum stillbirths (8.8% vs. 6.5%, P=0.02) and 67 stillbirths with congenital anomalies (29.9% vs. 19.4%, P=0.008). As compared with karyotype analysis, microarray analysis provided a relative increase in the diagnosis of genetic abnormalities of 41.9% in all stillbirths, 34.5% in antepartum stillbirths, and 53.8% in stillbirths with anomalies. Microarray analysis is more likely than karyotype analysis to provide a genetic diagnosis, primarily because of its success with nonviable tissue, and is especially valuable in analyses of stillbirths with congenital anomalies or in cases in which karyotype results cannot be obtained. (Funded by the Eunice Kennedy Shriver National Institute of Child Health and Human Development.).
Lathi, Ruth B; Gustin, Stephanie L F; Keller, Jennifer; Maisenbacher, Melissa K; Sigurjonsson, Styrmir; Tao, Rosina; Demko, Zach
2014-01-01
To examine the rate of maternal contamination in miscarriage specimens. Retrospective review of 1,222 miscarriage specimens submitted for chromosome testing with detection of maternal cell contamination (MCC). Referral centers requesting genetic testing of miscarriage specimens at a single reference laboratory. Women with pregnancy loss who desire complete chromosome analysis of the pregnancy tissue. Analysis of miscarriage specimens using single-nucleotide polymorphism (SNP) microarray technology with bioinformatics program to detect maternal cell contamination. Chromosome content of miscarriages and incidence of 46,XX results due to MCC. Of the 1,222 samples analyzed, 592 had numeric chromosomal abnormalities, and 630 were normal 46,XX or 46,XY (456 and 187, respectively). In 269 of the 46,XX specimens, MCC with no embryonic component was found. With the exclusion of maternal 46,XX results, the chromosomal abnormality rate increased from 48% to 62%, and the ratio for XX to XY results dropped from 2.6 to 1.0. Over half of the normal 46,XX results in miscarriage specimens were due to MCC. The use of SNPs in MCC testing allows for precise identification of chromosomal abnormalities in miscarriage as well as MCC, improving the accuracy of products of conception testing. Copyright © 2014 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
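The change in abnormality rate reported above follows from the sample counts in the abstract. A minimal sketch of that arithmetic, with illustrative variable names only:

```python
# Counts taken from the abstract.
total_specimens = 1222        # miscarriage specimens analyzed
abnormal = 592                # specimens with numeric chromosomal abnormalities
maternal_only = 269           # 46,XX results attributed to maternal cell contamination

# Rate when maternal-only results are counted as normal fetal results.
rate_before = abnormal / total_specimens

# Rate after excluding specimens with no embryonic component from the denominator.
rate_after = abnormal / (total_specimens - maternal_only)

print(f"before exclusion: {rate_before:.0%}, after exclusion: {rate_after:.0%}")
```

The two values round to the 48% and 62% quoted in the abstract.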
Nacheva, Elizabeth; Mokretar, Katya; Soenmez, Aynur; Pittman, Alan M; Grace, Colin; Valli, Roberto; Ejaz, Ayesha; Vattathil, Selina; Maserati, Emanuela; Houlden, Henry; Taanman, Jan-Willem; Schapira, Anthony H; Proukakis, Christos
2017-01-01
Potential bias introduced during DNA isolation is inadequately explored, although it could have significant impact on downstream analysis. To investigate this in human brain, we isolated DNA from cerebellum and frontal cortex using spin columns under different conditions, and salting-out. We first analysed DNA using array CGH, which revealed a striking wave pattern suggesting primarily GC-rich cerebellar losses, even against matched frontal cortex DNA, with a similar pattern on a SNP array. The aCGH changes varied with the isolation protocol. Droplet digital PCR of two genes also showed protocol-dependent losses. Whole genome sequencing showed GC-dependent variation in coverage with spin column isolation from cerebellum. We also extracted and sequenced DNA from substantia nigra using salting-out and phenol / chloroform. The mtDNA copy number, assessed by reads mapping to the mitochondrial genome, was higher in substantia nigra when using phenol / chloroform. We thus provide evidence for significant method-dependent bias in DNA isolation from human brain, as reported in rat tissues. This may contribute to array “waves”, and could affect copy number determination, particularly if mosaicism is being sought, and sequencing coverage. Variations in isolation protocol may also affect apparent mtDNA abundance. PMID:28683077
Analysis of FOXF1 and the FOX gene cluster in patients with VACTERL association
Agochukwu, Nneamaka B.; Pineda-Alvarez, Daniel E.; Keaton, Amelia A.; Warren-Mora, Nicole; Raam, Manu S.; Kamat, Aparna; Chandrasekharappa, Settara C.; Solomon, Benjamin D.
2011-01-01
VACTERL association, a relatively common condition with an incidence of approximately 1 in 20,000 – 35,000 births, is a non-random association of birth defects that includes vertebral defects (V), anal atresia (A), cardiac defects (C), tracheo-esophageal fistula (TE), renal anomalies (R) and limb malformations (L). Although the etiology is unknown in the majority of patients, there is evidence that it is causally heterogeneous. Several studies have shown evidence for inheritance in VACTERL, implying a role for genetic loci. Recently, patients with component features of VACTERL and a lethal developmental pulmonary disorder, alveolar capillary dysplasia with misalignment of pulmonary veins (ACD/MPV), were found to harbor deletions or mutations affecting FOXF1 and the FOX gene cluster on chromosome 16q24. We investigated this gene through direct sequencing and high-density SNP microarray in 12 patients with VACTERL association but without ACD/MPV. Our mutational analysis of FOXF1 showed normal sequences and no genomic imbalances affecting the FOX gene cluster on chromosome 16q24 in the studied patients. Possible explanations for these results include the etiologic and clinical heterogeneity of VACTERL association, the possibility that mutations affecting this gene may occur only in more severely affected individuals, and insufficient study sample size. PMID:21315191
Mokhtar, Siti Shuhada; Marshall, Christian R.; Phipps, Maude E.; Thiruvahindrapuram, Bhooma; Lionel, Anath C.; Scherer, Stephen W.; Peng, Hoh Boon
2014-01-01
Copy number variation (CNV) has been recognized as a major contributor to human genome diversity. It plays an important role in determining phenotypes and has been associated with a number of common and complex diseases. However CNV data from diverse populations is still limited. Here we report the first investigation of CNV in the indigenous populations from Peninsular Malaysia. We genotyped 34 Negrito genomes from Peninsular Malaysia using the Affymetrix SNP 6.0 microarray and identified 48 putative novel CNVs, consisting of 24 gains and 24 losses, of which 5 were identified in at least 2 unrelated samples. These CNVs appear unique to the Negrito population and were absent in the DGV, HapMap3 and Singapore Genome Variation Project (SGVP) datasets. Analysis of gene ontology revealed that genes within these CNVs were enriched in the immune system (GO:0002376), response to stimulus mechanisms (GO:0050896), the metabolic pathways (GO:0001852), as well as regulation of transcription (GO:0006355). Copy number gains in CNV regions (CNVRs) enriched with genes were significantly higher than the losses (P value <0.001). In view of the small population size, relative isolation and semi-nomadic lifestyles of this community, we speculate that these CNVs may be attributed to recent local adaptation of Negritos from Peninsular Malaysia. PMID:24956385
Jiang, Ni; Zhu, Xishan; Zhang, Hongmei; Wang, Xiaoli; Zhou, Xinna; Gu, Jiezhun; Chen, Baoan; Ren, Jun
2014-01-01
Methylenetetrahydrofolate reductase (MTHFR) is the key enzyme for folate metabolism. Previous studies suggest a relationship between its single nucleotide polymorphisms (SNPs) C677T and A1298C and susceptibility to a variety of tumors, including hematological malignancy. SNP frequency distribution in different ethnic populations might lead to differences in disease susceptibility. There has been little research in Chinese people on MTHFR SNPs and susceptibility to hematological malignancy. Therefore, this study investigated the relationship between MTHFR SNPs and hematological malignancy in Jiangsu province in China. Gene microarray was used to detect MTHFR C677T and A1298C single nucleotide polymorphism loci in 157 healthy controls and 127 patients from Jiangsu province with hematological malignancies (30 with multiple myeloma, 28 with non-Hodgkin's lymphoma, 22 with acute lymphoblastic leukemia, 40 with acute myeloid leukemia, and seven with chronic myeloid leukemia). The allele frequency of 677T was 41.3% in patients and 33.1% in controls, a significant difference (χ² = 4.08, p = 0.043); the 677TT genotype was associated with high susceptibility to hematological malignancy (OR 1.96, 95% CI 1.01 - 4.45, p = 0.041). In subgroup analyses, the genotypes 677TT and 1298CC were associated with significantly increased multiple myeloma risk (TT vs. CC: OR 8.92, 95% CI 1.06 - 75.24, p = 0.006; CC vs. AA: OR = 4.80, 95% CI 1.56 - 14.73, p = 0.044). No associations were found between polymorphisms and susceptibilities to acute lymphoblastic leukemia, acute myeloid leukemia, or non-Hodgkin's lymphoma. MTHFR C677T polymorphisms influence the risk of hematological malignancy among the population in Jiangsu province. Both MTHFR 677TT and MTHFR 1298CC genotypes increase susceptibility to myeloid leukemia.
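The allele-frequency comparison in this abstract can be approximated with a 2×2 Pearson chi-square test. The sketch below makes two assumptions. The allele counts are not given in the abstract and are back-calculated from the reported frequencies: 127 patients give 254 alleles, of which about 105 are T, and 157 controls give 314 alleles, of which about 104 are T. It is also assumed that the test was run without a continuity correction. This is not necessarily the authors' exact procedure.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Assumed allele counts, reconstructed from the reported percentages.
patients_t, patients_c = 105, 254 - 105   # 105 / 254 is about 41.3% T
controls_t, controls_c = 104, 314 - 104   # 104 / 314 is about 33.1% T

chi2 = chi_square_2x2(patients_t, patients_c, controls_t, controls_c)
p = p_value_df1(chi2)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under these assumed counts the statistic rounds to the reported 4.08 and the p-value falls near the reported 0.043. That agreement suggests the reconstruction is plausible, but it does not confirm it.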
Emerging Use of Gene Expression Microarrays in Plant Physiology
Wullschleger, Stan D.; Difazio, Stephen P.
2003-01-01
Microarrays have become an important technology for the global analysis of gene expression in humans, animals, plants, and microbes. Implemented in the context of a well-designed experiment, cDNA and oligonucleotide arrays can provide high-throughput, simultaneous analysis of transcript abundance for hundreds, if not thousands, of genes. However, despite widespread acceptance, the use of microarrays as a tool to better understand processes of interest to the plant physiologist is still being explored. To help illustrate current uses of microarrays in the plant sciences, several case studies that we believe demonstrate the emerging application of gene expression arrays in plant physiology were selected from among the many posters and presentations at the 2003 Plant and Animal Genome XI Conference. Based on this survey, microarrays are being used to assess gene expression in plants exposed to the experimental manipulation of air temperature, soil water content and aluminium concentration in the root zone. Analysis often includes characterizing transcript profiles for multiple post-treatment sampling periods and categorizing genes with common patterns of response using hierarchical clustering techniques. In addition, microarrays are also providing insights into developmental changes in gene expression associated with fibre and root elongation in cotton and maize, respectively. Technical and analytical limitations of microarrays are discussed and projects attempting to advance areas of microarray design and data analysis are highlighted. Finally, although much work remains, we conclude that microarrays are a valuable tool for the plant physiologist interested in the characterization and identification of individual genes and gene families with potential application in the fields of agriculture, horticulture and forestry.
A genome-wide 20 K citrus microarray for gene expression analysis
Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose
2008-01-01
Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-wide cDNA microarray that includes 21,081 putative unigenes of citrus. As a functional companion to the microarray, a web-browsable database [1] was created and populated with information about the unigenes represented in the microarray, including cDNA libraries, isolated clones, raw and processed nucleotide and protein sequences, and results of all the structural and functional annotation of the unigenes, such as general description, BLAST hits, putative Arabidopsis orthologs, microsatellites, putative SNPs, GO classification and PFAM domains. We have performed a Gene Ontology comparison with the full set of Arabidopsis proteins to estimate the genome coverage of the microarray. We have also performed microarray hybridizations to check its usability. Conclusion This new cDNA microarray replaces the first 7K microarray generated two years ago and allows gene expression analysis at a more global scale. We have followed a rational design to minimize cross-hybridization while maintaining its utility for different citrus species. Furthermore, we also provide access to a website with full structural and functional annotation of the unigenes represented in the microarray, along with the ability to use this site to directly perform gene expression analysis using standard tools at different publicly available servers.
Furthermore, we show how this microarray offers a good representation of the citrus genome and present the usefulness of this genomic tool for global studies in citrus by using it to catalogue genes expressed in citrus globular embryos. PMID:18598343
A Java-based tool for the design of classification microarrays.
Meng, Da; Broschat, Shira L; Call, Douglas R
2008-08-04
Classification microarrays are used for purposes such as identifying strains of bacteria and determining genetic relationships to understand the epidemiology of an infectious disease. For these cases, mixed microarrays, which are composed of DNA from more than one organism, are more effective than conventional microarrays composed of DNA from a single organism. Selection of probes is a key factor in designing successful mixed microarrays because redundant sequences are inefficient and limited representation of diversity can restrict application of the microarray. We have developed a Java-based software tool, called PLASMID, for use in selecting the minimum set of probe sequences needed to classify different groups of plasmids or bacteria. The software program was successfully applied to several different sets of data. The utility of PLASMID was illustrated using existing mixed-plasmid microarray data as well as data from a virtual mixed-genome microarray constructed from different strains of Streptococcus. Moreover, use of data from expression microarray experiments demonstrated the generality of PLASMID. In this paper we describe a new software tool for selecting a set of probes for a classification microarray. While the tool was developed for the design of mixed microarrays-and mixed-plasmid microarrays in particular-it can also be used to design expression arrays. The user can choose from several clustering methods (including hierarchical, non-hierarchical, and a model-based genetic algorithm), several probe ranking methods, and several different display methods. A novel approach is used for probe redundancy reduction, and probe selection is accomplished via stepwise discriminant analysis. Data can be entered in different formats (including Excel and comma-delimited text), and dendrogram, heat map, and scatter plot images can be saved in several different formats (including jpeg and tiff). 
Weights generated using stepwise discriminant analysis can be stored for analysis of subsequent experimental data. Additionally, PLASMID can be used to construct virtual microarrays with genomes from public databases, which can then be used to identify an optimal set of probes.
Tra, Yolande V; Evans, Irene M
2010-01-01
BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course. PMID:20810954
Chromosomal Microarray versus Karyotyping for Prenatal Diagnosis
Wapner, Ronald J.; Martin, Christa Lese; Levy, Brynn; Ballif, Blake C.; Eng, Christine M.; Zachary, Julia M.; Savage, Melissa; Platt, Lawrence D.; Saltzman, Daniel; Grobman, William A.; Klugman, Susan; Scholl, Thomas; Simpson, Joe Leigh; McCall, Kimberly; Aggarwal, Vimla S.; Bunke, Brian; Nahum, Odelia; Patel, Ankita; Lamb, Allen N.; Thom, Elizabeth A.; Beaudet, Arthur L.; Ledbetter, David H.; Shaffer, Lisa G.; Jackson, Laird
2013-01-01
Background Chromosomal microarray analysis has emerged as a primary diagnostic tool for the evaluation of developmental delay and structural malformations in children. We aimed to evaluate the accuracy, efficacy, and incremental yield of chromosomal microarray analysis as compared with karyotyping for routine prenatal diagnosis. Methods Samples from women undergoing prenatal diagnosis at 29 centers were sent to a central karyotyping laboratory. Each sample was split in two; standard karyotyping was performed on one portion and the other was sent to one of four laboratories for chromosomal microarray. Results We enrolled a total of 4406 women. Indications for prenatal diagnosis were advanced maternal age (46.6%), abnormal result on Down’s syndrome screening (18.8%), structural anomalies on ultrasonography (25.2%), and other indications (9.4%). In 4340 (98.8%) of the fetal samples, microarray analysis was successful; 87.9% of samples could be used without tissue culture. Microarray analysis of the 4282 nonmosaic samples identified all the aneuploidies and unbalanced rearrangements identified on karyotyping but did not identify balanced translocations and fetal triploidy. In samples with a normal karyotype, microarray analysis revealed clinically relevant deletions or duplications in 6.0% with a structural anomaly and in 1.7% of those whose indications were advanced maternal age or positive screening results. Conclusions In the context of prenatal diagnostic testing, chromosomal microarray analysis identified additional, clinically significant cytogenetic information as compared with karyotyping and was equally efficacious in identifying aneuploidies and unbalanced rearrangements but did not identify balanced translocations and triploidies. (Funded by the Eunice Kennedy Shriver National Institute of Child Health and Human Development and others; ClinicalTrials.gov number, NCT01279733.) PMID:23215555
Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar
2016-04-01
Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generate an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as it keeps changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. They often contain a large amount of expression data, but only a fraction of it comprises genes that are significantly expressed. The precise identification of genes of interest that are responsible for causing cancer is imperative in microarray data analysis. Most existing schemes employ a two-phase process such as feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.
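The general idea behind a MapReduce-style K-nearest-neighbor classifier can be sketched in a few lines. This is an illustration of the common pattern, not the authors' mrKNN or their Hadoop code. The pattern assumed here is that each map task returns the k nearest training samples within its own data partition, and a reduce step merges those candidates and takes a majority vote. The function names, the toy two-feature samples and the labels are all hypothetical.

```python
from collections import Counter
import heapq
import math


def map_partition(partition, query, k):
    """Map step: return the k nearest (distance, label) pairs within one partition."""
    scored = ((math.dist(features, query), label) for features, label in partition)
    return heapq.nsmallest(k, scored)


def reduce_candidates(candidate_lists, k):
    """Reduce step: merge local candidates, keep the global k nearest, majority vote."""
    merged = heapq.nsmallest(k, (c for lst in candidate_lists for c in lst))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


# Toy expression profiles (two selected features per sample), split into two
# partitions as if stored on different nodes. Values are made up.
partitions = [
    [((1.0, 1.2), "tumor"), ((0.9, 1.0), "tumor")],
    [((5.0, 5.1), "normal"), ((4.8, 5.3), "normal"), ((1.1, 0.8), "tumor")],
]
query = (1.0, 1.0)
k = 3

local_results = [map_partition(p, query, k) for p in partitions]  # parallel in a real cluster
prediction = reduce_candidates(local_results, k)
print(prediction)
```

Because each partition forwards only k candidates, the reduce step handles at most k times the number of partitions, which is what makes the pattern suitable for large datasets.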
Wang, Yi-Ting; Sung, Pei-Yuan; Lin, Peng-Lin; Yu, Ya-Wen; Chung, Ren-Hua
2015-05-15
Genome-wide association studies (GWAS) have become a common approach to identifying single nucleotide polymorphisms (SNPs) associated with complex diseases. As complex diseases are caused by the joint effects of multiple genes, while the effect of an individual gene or SNP is modest, a method considering the joint effects of multiple SNPs can be more powerful than testing individual SNPs. The multi-SNP analysis aims to test association based on a SNP set, usually defined based on biological knowledge such as gene or pathway, which may contain only a portion of SNPs with effects on the disease. Therefore, a challenge for the multi-SNP analysis is how to effectively select a subset of SNPs with promising association signals from the SNP set. We developed the Optimal P-value Threshold Pedigree Disequilibrium Test (OPTPDT). The OPTPDT uses general nuclear families. A variable p-value threshold algorithm is used to determine an optimal p-value threshold for selecting a subset of SNPs. A permutation procedure is used to assess the significance of the test. We used simulations to verify that the OPTPDT has correct type I error rates. Our power studies showed that the OPTPDT can be more powerful than the set-based test in PLINK, the multi-SNP FBAT test, and the p-value based test GATES. We applied the OPTPDT to a family-based autism GWAS dataset for gene-based association analysis and identified MACROD2-AS1 with genome-wide significance (p-value = 2.5×10⁻⁶). Our simulation results suggested that the OPTPDT is a valid and powerful test. The OPTPDT will be helpful for gene-based or pathway association analysis. The method is ideal for the secondary analysis of existing GWAS datasets, which may identify a set of SNPs with joint effects on the disease.
Kumar, V; Yadav, S K
2012-03-01
Green synthesis of nanoparticles is one of the crucial requirements in today's climate change scenario all over the world. In view of this, leaf extract (LE) of Bauhinia variegata L., which possesses strong antidiabetic and antibacterial properties, has been used to synthesise silver nanoparticles (SNP) in a controlled manner. SNP of various sizes (20-120 nm) were synthesised by varying incubation temperature, silver nitrate and LE concentrations. The rate of SNP synthesis and their size increased with increase in AgNO₃ concentration up to 4 mM. With increase in LE concentration, size and aggregation of SNP increased. The size and aggregation of SNP also increased at temperatures above and below 40°C. This suggests that size and dispersion of SNP can be controlled by varying reaction components and conditions. Polarity-based fractionation of B. variegata LE suggested that only the water-soluble fraction is responsible for SNP synthesis. Fourier transform infrared spectroscopy analysis revealed the attachment of polyphenolic and carbohydrate moieties to SNP. The synthesised SNP were found to be stable in double distilled water, BSA and phosphate buffer (pH 7.4). On the contrary, incubation of SNP with NaCl induced aggregation. This suggests the safe use of SNP for various in vivo applications.
snpGeneSets: An R Package for Genome-Wide Study Annotation
Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian
2016-01-01
Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048
Babushok, Daria V.; Xie, Hongbo M.; Roth, Jacquelyn J.; Perdigones, Nieves; Olson, Timothy S.; Cockroft, Joshua D.; Gai, Xiaowu; Perin, Juan C.; Li, Yimei; Paessler, Michele E.; Hakonarson, Hakon; Podsakoff, Gregory M.; Mason, Philip J.; Biegel, Jaclyn A.; Bessler, Monica
2013-01-01
Summary The bone marrow failure syndromes (BMFS) are a heterogeneous group of rare blood disorders characterized by inadequate haematopoiesis, clonal evolution, and increased risk of leukaemia. Single nucleotide polymorphism arrays (SNP-A) have been proposed as a tool for surveillance of clonal evolution in BMFS. To better understand the natural history of BMFS and to assess the clinical utility of SNP-A in these disorders, we analysed 124 SNP-A from a comprehensively characterized cohort of 91 patients at our BMFS centre. SNP-A were correlated with medical histories, haematopathology, cytogenetic and molecular data. To assess clonal evolution, longitudinal analysis of SNP-A was performed in 25 patients. We found that acquired copy number-neutral loss of heterozygosity (CN-LOH) was significantly more frequent in acquired aplastic anaemia (aAA) than in other BMFS (odds ratio 12.2, p<0.01). Homozygosity by descent was most common in congenital BMFS, frequently unmasking autosomal recessive mutations. Copy number variants (CNVs) were frequently polymorphic, and we identified CNVs enriched in neutropenia and aAA. Our results suggest that acquired CN-LOH is a general phenomenon in aAA that is probably mechanistically and prognostically distinct from typical CN-LOH of myeloid malignancies. Our analysis of clinical utility of SNP-A shows the highest yield of detecting new clonal haematopoiesis at diagnosis and at relapse. PMID:24116929
Babushok, Daria V; Xie, Hongbo M; Roth, Jacquelyn J; Perdigones, Nieves; Olson, Timothy S; Cockroft, Joshua D; Gai, Xiaowu; Perin, Juan C; Li, Yimei; Paessler, Michele E; Hakonarson, Hakon; Podsakoff, Gregory M; Mason, Philip J; Biegel, Jaclyn A; Bessler, Monica
2014-01-01
The bone marrow failure syndromes (BMFS) are a heterogeneous group of rare blood disorders characterized by inadequate haematopoiesis, clonal evolution, and increased risk of leukaemia. Single nucleotide polymorphism arrays (SNP-A) have been proposed as a tool for surveillance of clonal evolution in BMFS. To better understand the natural history of BMFS and to assess the clinical utility of SNP-A in these disorders, we analysed 124 SNP-A from a comprehensively characterized cohort of 91 patients at our BMFS centre. SNP-A were correlated with medical histories, haematopathology, cytogenetic and molecular data. To assess clonal evolution, longitudinal analysis of SNP-A was performed in 25 patients. We found that acquired copy number-neutral loss of heterozygosity (CN-LOH) was significantly more frequent in acquired aplastic anaemia (aAA) than in other BMFS (odds ratio 12·2, P < 0·01). Homozygosity by descent was most common in congenital BMFS, frequently unmasking autosomal recessive mutations. Copy number variants (CNVs) were frequently polymorphic, and we identified CNVs enriched in neutropenia and aAA. Our results suggest that acquired CN-LOH is a general phenomenon in aAA that is probably mechanistically and prognostically distinct from typical CN-LOH of myeloid malignancies. Our analysis of clinical utility of SNP-A shows the highest yield of detecting new clonal haematopoiesis at diagnosis and at relapse. © 2013 John Wiley & Sons Ltd.
Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Bellato, Cláudia M; Motilal, Lambert; Zhang, Dapeng
2014-01-15
Cacao (Theobroma cacao L.), the source of cocoa, is an economically important tropical crop. One problem with the premium cacao market is contamination with off-types adulterating raw premium material. Accurate determination of the genetic identity of single cacao beans is essential for ensuring cocoa authentication. Using nanofluidic single nucleotide polymorphism (SNP) genotyping with 48 SNP markers, we generated SNP fingerprints for small quantities of DNA extracted from the seed coat of single cacao beans. On the basis of the SNP profiles, we identified an assumed adulterant variety, which was unambiguously distinguished from the authentic beans by multilocus matching. Assignment tests based on both Bayesian clustering analysis and allele frequency clearly separated all 30 authentic samples from the non-authentic samples. Distance-based principal coordinate analysis further supported these results. The nanofluidic SNP protocol, together with forensic statistical tools, is sufficiently robust to establish authentication and to verify gourmet cacao varieties. This method shows significant potential for practical application.
Loughlin, J; Sinsheimer, J S; Mustafa, Z; Carr, A J; Clipsham, K; Bloomfield, V A; Chitnavis, J; Bailey, A; Sykes, B; Chapman, K
2000-03-01
Evidence has accumulated supporting a role for genes in the etiology of osteoarthritis (OA). Several candidates have been targeted as potential susceptibility loci including genes that are involved in the regulation of bone density. Genetic association analysis has suggested a role for the vitamin D receptor gene (VDR) and the estrogen receptor gene (ER) in susceptibility. Such findings must be tested in additional independent cohorts. We tested for association of these 2 genes, plus a third gene implicated in bone density, COL1A1, with idiopathic OA. A case-control cohort of 371 affected probands and 369 unaffected spouses was used. Association was tested using 4 intragenic single nucleotide polymorphisms (SNP), one each for the VDR and COL1A1 genes, and 2 for the ER gene. The VDR and ER SNP are the same SNP that have been associated with OA. All 4 SNP affect restriction enzyme sites and were genotyped using polymerase chain reaction and enzyme digestion. Allele and genotype distributions for each SNP were compared between cases and controls and analyzed using Fisher's exact test. There was no evidence of association of the VDR or the ER gene SNP to OA. There was weak evidence of association of the COL1A1 SNP in female cases (p = 0.017), reflected by a difference in the distribution of genotypes at this SNP between female cases and controls (p = 0.027). However, when corrected for multiple testing, these results were not significant. If the VDR, ER, or COL1A1 genes do encode predisposition to OA then the 4 SNP tested are not associated with major susceptibility alleles at these 3 loci.
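The case-control comparison described above rests on Fisher's exact test applied to allele or genotype counts. A minimal sketch of the two-sided test for a 2x2 allele-count table follows; the function name and the counts in the usage lines are hypothetical and are not taken from the study, which reports only its cohort sizes and p-values.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical allele counts (rows: cases, controls; columns: allele 1, allele 2).
# 371 probands and 369 spouses give 742 and 738 alleles respectively.
p_value = fisher_exact_two_sided(150, 592, 120, 618)
print(f"two-sided p = {p_value:.4f}")
```

When several SNPs and subgroups are tested, as in the abstract, the resulting p-values still need a multiple-testing correction before being interpreted.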
2010-01-01
Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788
Genetic and clinical risk factors of root resorption associated with orthodontic treatment.
Guo, Yujiao; He, Shushu; Gu, Tian; Liu, Yi; Chen, Song
2016-08-01
External apical root resorption (EARR) is a common complication in orthodontic treatment. Despite many studies on EARR, great controversies remain with regard to its risk factors. The objective of this study was to explore the relationship among sex, root movement, IL-1RN single nucleotide polymorphism (SNP) rs419598, IL-6 SNP rs1800796, and EARR associated with orthodontic treatment. Altogether 174 patients (with 174 maxillary left central incisors) were selected for this study. Cone-beam computed tomography was performed before the start of the treatment and at the end of the treatment. Cone-beam computed tomography data were used to reconstruct a 3-dimensional image of each tooth; the volume and the root resorption volume of each tooth were calculated. Three-dimensional matching was used to measure the amount of movement of each root. Genomic DNA was extracted from buccal swabs, and genotypes of SNP rs419598 and SNP rs1800796 of each subject were determined using TaqMan polymerase chain reaction genotyping (Applied Biosystems, Foster City, Calif). The data were analyzed with multiple linear regression analysis. The statistical analysis indicated no relationship between sex, tooth movement amount, and IL-1RN SNP rs419598 with EARR. The IL-6 SNP rs1800796 GC was associated with EARR, and root resorption differed significantly between SNP rs1800796 GC and CC. IL-6 SNP rs1800796 GC is a risk factor for EARR. The amount of root movement, IL-1RN SNP rs419598, and sex as risk factors for EARR need further study. Copyright © 2016 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
Yokoyama, Eiji; Hirai, Shinichiro; Ishige, Taichiro; Murakami, Satoshi
2018-01-02
Seventeen clusters of Shiga toxin-producing Escherichia coli O157:H7/- (O157) strains, determined by cluster analysis of pulsed-field gel electrophoresis patterns, were analyzed using whole genome sequence (WGS) data to investigate this pathogen's molecular epidemiology. The 17 clusters included 136 strains containing strains from nine outbreaks, with each outbreak caused by a single source contaminated with the organism, as shown by epidemiological contact surveys. WGS data of these strains were used to identify single nucleotide polymorphisms (SNPs) by two methods: short read data were directly mapped to a reference genome (mapping derived SNPs) and common SNPs between the mapping derived SNPs and SNPs in assembled data of short read data (common SNPs). Among both SNPs, those that were detected in genes with a gap were excluded to remove ambiguous SNPs from further analysis. The effectiveness of both SNPs was investigated among all the concatenated SNPs that were detected (whole SNP set); SNPs were divided into three categories based on the genes in which they were located (i.e., backbone SNP set, O-island SNP set, and mobile element SNP set); and SNPs in non-coding regions (intergenic region SNP set). When SNPs from strains isolated from the nine single source derived outbreaks were analyzed using an unweighted pair group method with arithmetic mean tree (UPGMA) and a minimum spanning tree (MST), the maximum pair-wise distances of the backbone SNP set of the mapping derived SNPs were significantly smaller than those of the whole and intergenic region SNP set on both UPGMAs and MSTs. This significant difference was also observed when the backbone SNP set of the common SNPs were examined (Steel-Dwass test, P≤0.01). When the maximum pair-wise distances were compared between the mapping derived and common SNPs, significant differences were observed in those of the whole, mobile element, and intergenic region SNP set (Wilcoxon signed rank test, P≤0.01). 
When all the strains included in one complex on an MST or one cluster on a UPGMA were designated as the same genotype, the values of the Hunter-Gaston Discriminatory Power Index for the backbone SNP set of the mapping derived and common SNPs were higher than those of other SNP sets. In contrast, the mobile element SNP set could not robustly subdivide lineage I strains of tested O157 strains using both the mapping derived and common SNPs. These results suggested that the backbone SNP set were the most effective for analysis of WGS data for O157 in enabling an appropriation of its molecular epidemiology. Copyright © 2017 Elsevier B.V. All rights reserved.
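The Hunter-Gaston Discriminatory Power Index mentioned above is the probability that two strains drawn at random without replacement fall into different genotypes. A minimal sketch, assuming each strain has already been assigned a genotype label (for example its MST complex or UPGMA cluster); the labels in the usage lines are hypothetical.

```python
from collections import Counter


def hunter_gaston_index(genotypes):
    """Hunter-Gaston index: D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two strains")
    counts = Counter(genotypes).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Hypothetical genotype assignments for six strains.
labels = ["g1", "g1", "g2", "g3", "g3", "g3"]
print(round(hunter_gaston_index(labels), 3))
```

A value of 1 means every strain has its own genotype; a value of 0 means the typing scheme cannot separate any of them.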
CsSNP: A Web-Based Tool for the Detecting of Comparative Segments SNPs.
Wang, Yi; Wang, Shuangshuang; Zhou, Dongjie; Yang, Shuai; Xu, Yongchao; Yang, Chao; Yang, Long
2016-07-01
SNP (single nucleotide polymorphism) is a popular tool for the study of genetic diversity, evolution, and other areas. Therefore, it is necessary to develop a convenient, useful, robust, rapid, and open-source SNP detection tool for all researchers. Because SNP detection requires special software and a series of steps, including alignment, detection, analysis, and presentation, the study of SNPs is limited for nonprofessional users. CsSNP (Comparative segments SNP, http://biodb.sdau.edu.cn/cssnp/ ) is a freely available web tool based on the Blat, Blast, and Perl programs that detects comparative segments SNPs and shows detailed information about them. The results are filtered and presented in a statistics figure and a Gbrowse map. This platform contains the reference genomic sequences and coding sequences of 60 plant species, and also provides new opportunities for users to detect SNPs easily. CsSNP provides a convenient tool for nonprofessional users to find comparative segments SNPs in their own sequences, gives users information on and analysis of the SNPs, and displays these data in a dynamic map. It provides a new method to detect SNPs and may accelerate related studies.
Direct labeling of serum proteins by fluorescent dye for antibody microarray.
Klimushina, M V; Gumanova, N G; Metelskaya, V A
2017-05-06
Analysis of serum proteome by antibody microarray is used to identify novel biomarkers and to study signaling pathways including protein phosphorylation and protein-protein interactions. Labeling of serum proteins is important for optimal performance of the antibody microarray. Proper choice of fluorescent label and optimal concentration of protein loaded on the microarray ensure good quality of imaging that can be reliably scanned and processed by the software. We have optimized direct serum protein labeling using fluorescent dye Arrayit Green 540 (Arrayit Corporation, USA) for antibody microarray. Optimized procedure produces high quality images that can be readily scanned and used for statistical analysis of protein composition of the serum. Copyright © 2017 Elsevier Inc. All rights reserved.
Bias due to two-stage residual-outcome regression analysis in genetic association studies.
Demissie, Serkalem; Cupples, L Adrienne
2011-11-01
Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outcome and the SNP is evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared-correlation between the SNP and the covariate (r²). For example, for r² = 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR under r² = 0, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided. © 2011 Wiley Periodicals, Inc.
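The attenuation figures quoted above (0, 10, and 50%) can be reproduced with a small simulation. The sketch below is illustrative only and is not the authors' code: it assumes a standardized continuous stand-in for the SNP (real genotypes are coded 0/1/2), a single covariate, and hypothetical effect sizes `beta` and `gamma`. The MLR coefficient is obtained by residualizing the SNP on the covariate first (the Frisch-Waugh-Lovell identity), which avoids any matrix library.

```python
import random


def _mean(v):
    return sum(v) / len(v)


def _slope(y, x):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def _residuals(y, x):
    """Residuals of y after least-squares regression on x (intercept included)."""
    b, mx, my = _slope(y, x), _mean(x), _mean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]


def compare_estimators(r2, beta=1.0, gamma=1.0, n=50_000, seed=1):
    """Return (two-stage estimate, MLR estimate) of the SNP effect beta."""
    rng = random.Random(seed)
    r = r2 ** 0.5
    c = [rng.gauss(0, 1) for _ in range(n)]                        # covariate
    x = [r * ci + (1 - r2) ** 0.5 * rng.gauss(0, 1) for ci in c]   # SNP stand-in
    y = [beta * xi + gamma * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]
    # Two-stage: adjust only the outcome for the covariate, then regress on x.
    two_stage = _slope(_residuals(y, c), x)
    # MLR coefficient of x, via regression of y on x adjusted for the covariate.
    mlr = _slope(y, _residuals(x, c))
    return two_stage, mlr


for r2 in (0.0, 0.1, 0.5):
    ts, mlr = compare_estimators(r2)
    print(f"r2={r2:.1f}  two-stage={ts:.3f}  MLR={mlr:.3f}")
```

Under these assumptions the two-stage estimate converges to beta × (1 − r²) while the MLR estimate converges to beta, because the two-stage procedure removes the covariate from the outcome but leaves it in the SNP.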
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, Amanda M.; Daly, Don S.; Willse, Alan R.
The Automated Microarray Image Analysis (AMIA) Toolbox for MATLAB is a flexible, open-source microarray image analysis tool that allows the user to customize analysis of sets of microarray images. This tool provides several methods of identifying and quantifying spot statistics, as well as extensive diagnostic statistics and images to identify poor data quality or processing. The open nature of this software allows researchers to understand the algorithms used to provide intensity estimates and to modify them easily if desired.
Standardization of PCR-RFLP analysis of nsSNP rs1468384 of NPC1L1 gene
Balgir, Praveen P.; Khanna, Divya; Kaur, Gurlovleen
2008-01-01
Niemann-Pick C1-like 1 (NPC1L1) protein is a newly identified sterol influx transporter located at the apical membrane of the enterocyte, which may actively facilitate the uptake of cholesterol by promoting the passage of sterols across the brush border membrane of the enterocyte. It affects intestinal cholesterol absorption and intracellular transport and as such is an integral part of the complex process of cholesterol homeostasis. The study of population data for the distribution of single nucleotide polymorphisms (SNP) of NPC1L1 has led to the identification of six non-synonymous single nucleotide polymorphisms (nsSNP). The in vitro analysis using the software MuPro and StructureSNP shows that nsSNP M510I (rs1468384), which involves an A→G base pair change, leads to a decrease in the stability of the protein. A reproducible and cost-effective PCR-RFLP based assay was developed to screen for the SNP among population data. This SNP has been studied in Caucasian, Asian, and African American populations. To date, no data are available on the Indian population. The distribution of the M510I NPC1L1 genotype was estimated in the North Western Indian population as a test case. The allele distribution in the Indian population differs significantly from that of other populations. The methodology thus proved to be robust enough to bring out these differences. PMID:20300301
ELISA-BASE: An Integrated Bioinformatics Tool for Analyzing and Tracking ELISA Microarray Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, Amanda M.; Collett, James L.; Seurynck-Servoss, Shannon L.
ELISA-BASE is an open-source database for capturing, organizing and analyzing protein enzyme-linked immunosorbent assay (ELISA) microarray data. ELISA-BASE is an extension of the BioArray Software Environment (BASE) database system, which was developed for DNA microarrays. In order to make BASE suitable for protein microarray experiments, we developed several plugins for importing and analyzing quantitative ELISA microarray data. Most notably, our Protein Microarray Analysis Tool (ProMAT) for processing quantitative ELISA data is now available as a plugin to the database.
2010-01-01
Background Recent developments in high-throughput methods of analyzing transcriptomic profiles are promising for many areas of biology, including ecophysiology. However, although commercial microarrays are available for most common laboratory models, transcriptome analysis in non-traditional model species still remains a challenge. Indeed, the signal resulting from heterologous hybridization is low and difficult to interpret because of the weak complementarity between probe and target sequences, especially when no microarray dedicated to a genetically close species is available. Results We show here that transcriptome analysis in a species genetically distant from laboratory models is made possible by using MAXRS, a new method of analyzing heterologous hybridization on microarrays. This method takes advantage of the design of several commercial microarrays, with different probes targeting the same transcript. To illustrate and test this method, we analyzed the transcriptome of king penguin pectoralis muscle hybridized to Affymetrix chicken microarrays, two organisms separated by an evolutionary distance of approximately 100 million years. The differential gene expression observed between different physiological situations computed by MAXRS was confirmed by real-time PCR on 10 genes out of 11 tested. Conclusions MAXRS appears to be an appropriate method for gene expression analysis under heterologous hybridization conditions. PMID:20509979
Wang, Zi-nian; Cai, Han-fang; Li, Ming-xun; Cao, Xiu-kai; Lan, Xian-yong; Lei, Chu-zhao; Chen, Hong
2016-01-10
Patatin-like phospholipase domain-containing protein 3 (PNPLA3), a member of the patatin-like phospholipase domain-containing (PNPLA) family, plays an important role in energy balance, fat metabolism regulation, glucose metabolism and fatty liver disease. Tetra-primer amplification refractory mutation system PCR (T-ARMS-PCR) is a new method offering fast detection and extreme simplicity at a negligible cost for SNP genotyping. In this paper, we investigated the genetic variations at different ages of 660 Chinese indigenous cattle belonging to three breeds (QC, NY, JX) and applied T-ARMS-PCR and PCR-RFLP methods to genotype four SNPs, SNP1: g.A2980G, SNP2: g.A2996T, SNP3: g.A36718G, SNP4: g.G36850A. The statistical analyses indicated that these 4 SNPs affected growth traits markedly (P<0.05) in the QC population, whereas combined haplotypes did not (P>0.05). The qPCR (quantitative PCR) indicated that the bovine PNPLA3 gene was exclusively expressed in fat tissues. Besides, the analysis between SNP and mRNA expression revealed that, in SNP1, the expression of AG was much higher than that of AA and GG (P<0.05), which was in accordance with the results of the growth traits association analysis, while the results for SNP4 were not. These results supported the high potential of SNPs of the bovine PNPLA3 gene as genetic markers in marker-assisted selection (MAS) for Chinese cattle breeding programs. Copyright © 2015 Elsevier B.V. All rights reserved.
A Human Lectin Microarray for Sperm Surface Glycosylation Analysis
Sun, Yangyang; Cheng, Li; Gu, Yihua; Xin, Aijie; Wu, Bin; Zhou, Shumin; Guo, Shujuan; Liu, Yin; Diao, Hua; Shi, Huijuan; Wang, Guangyu; Tao, Sheng-ce
2016-01-01
Glycosylation is one of the most abundant and functionally important protein post-translational modifications. As such, technology for efficient glycosylation analysis is in high demand. Lectin microarrays are a powerful tool for such investigations and have been successfully applied for a variety of glycobiological studies. However, most of the current lectin microarrays are primarily constructed from plant lectins, which are not well suited for studies of human glycosylation because of the extreme complexity of human glycans. Herein, we constructed a human lectin microarray with 60 human lectin and lectin-like proteins. All of the lectins and lectin-like proteins were purified from yeast, and most showed binding to human glycans. To demonstrate the applicability of the human lectin microarray, human sperm were probed on the microarray and strong bindings were observed for several lectins, including galectin-1, 7, 8, GalNAc-T6, and ERGIC-53 (LMAN1). These bindings were validated by flow cytometry and fluorescence immunostaining. Further, mass spectrometry analysis showed that galectin-1 binds several membrane-associated proteins including heat shock protein 90. Finally, functional assays showed that binding of galectin-8 could significantly enhance the acrosome reaction within human sperms. To our knowledge, this is the first construction of a human lectin microarray, and we anticipate it will find wide use for a range of human or mammalian studies, alone or in combination with plant lectin microarrays. PMID:27364157
Telfer, Emily J; Stovold, Grahame T; Li, Yongjun; Silva-Junior, Orzenil B; Grattapaglia, Dario G; Dungey, Heidi S
2015-01-01
Pedigree reconstruction using molecular markers enables efficient management of inbreeding in open-pollinated breeding strategies, replacing expensive and time-consuming controlled pollination. This is particularly useful in preferentially outcrossed, insect pollinated Eucalypts known to suffer considerable inbreeding depression from related matings. A single nucleotide polymorphism (SNP) marker panel consisting of 106 markers was selected for pedigree reconstruction from the recently developed high-density Eucalyptus Infinium SNP chip (EuCHIP60K). The performance of this SNP panel for pedigree reconstruction in open-pollinated progenies of two Eucalyptus nitens seed orchards was compared with that of two microsatellite panels with 13 and 16 markers respectively. The SNP marker panel out-performed one of the microsatellite panels in the resolution power to reconstruct pedigrees and out-performed both panels with respect to data quality. Parentage of all but one offspring in each clonal seed orchard was correctly matched to the expected seed parent using the SNP marker panel, whereas parentage assignment to less than a third of the expected seed parents were supported using the 13-microsatellite panel. The 16-microsatellite panel supported all but one of the recorded seed parents, one better than the SNP panel, although there was still a considerable level of missing and inconsistent data. SNP marker data was considerably superior to microsatellite data in accuracy, reproducibility and robustness. Although microsatellites and SNPs data provide equivalent resolution for pedigree reconstruction, microsatellite analysis requires more time and experience to deal with the uncertainties of allele calling and faces challenges for data transferability across labs and over time. 
While microsatellite analysis will continue to be useful for some breeding tasks due to the high information content, existing infrastructure and low operating costs, the multi-species SNP resource available with the EuCHIP60k opens a whole new array of opportunities for high-throughput, genome-wide or targeted genotyping in species of Eucalyptus.
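One simple way to check a recorded seed parent against an offspring with biallelic SNPs is to count opposing homozygotes, loci where parent and offspring are homozygous for different alleles, which a true parent-offspring pair cannot show in the absence of genotyping error. The sketch below illustrates that exclusion idea only; it is not the pedigree reconstruction procedure used in the study, and the genotype coding, function names, and sample data are assumptions.

```python
def opposing_homozygotes(parent, offspring):
    """Count loci where parent and offspring are opposite homozygotes.

    Genotypes are coded as the number of copies of one allele (0, 1 or 2);
    None marks a missing call and is skipped.
    """
    return sum(
        1
        for p, o in zip(parent, offspring)
        if p is not None and o is not None and {p, o} == {0, 2}
    )


def compatible_parents(candidates, offspring, max_mismatches=0):
    """Names of candidate parents with at most max_mismatches exclusions."""
    return [
        name
        for name, genotype in candidates.items()
        if opposing_homozygotes(genotype, offspring) <= max_mismatches
    ]


# Hypothetical genotypes at five SNPs.
candidates = {
    "clone_A": [0, 1, 2, 2, None],
    "clone_B": [2, 1, 0, 2, 1],
}
offspring = [0, 1, 1, 2, 0]
print(compatible_parents(candidates, offspring))
```

Allowing `max_mismatches` above zero is a common way to tolerate occasional genotyping errors, at the cost of weaker exclusion power.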
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gentry, T.; Schadt, C.; Zhou, J.
Microarray technology has the unparalleled potential to simultaneously determine the dynamics and/or activities of most, if not all, of the microbial populations in complex environments such as soils and sediments. Researchers have developed several types of arrays that characterize the microbial populations in these samples based on their phylogenetic relatedness or functional genomic content. Several recent studies have used these microarrays to investigate ecological issues; however, most have only analyzed a limited number of samples with relatively few experiments utilizing the full high-throughput potential of microarray analysis. This is due in part to the unique analytical challenges that these samples present with regard to sensitivity, specificity, quantitation, and data analysis. This review discusses specific applications of microarrays to microbial ecology research along with some of the latest studies addressing the difficulties encountered during analysis of complex microbial communities within environmental samples. With continued development, microarray technology may ultimately achieve its potential for comprehensive, high-throughput characterization of microbial populations in near real-time.
Kerr, Jonathan R; Kaushik, Narendra; Fear, David; Baldwin, Don A; Nuwaysir, Emile F; Adcock, Ian M
2005-07-15
This study was undertaken to further examine the role of the host response to parvovirus B19 in the development of symptoms and consequences of viral persistence. Genomic DNA from 42 patients with symptomatic B19 infection was analyzed using the HuSNP assay (Affymetrix), and the results were compared with those from analysis of 53 healthy control individuals. Fifty-seven single-nucleotide polymorphisms were identified that were significantly associated with symptomatic infection. Total RNA from peripheral blood mononuclear cells from 57 B19-seropositive and 13 B19-seronegative donors was analyzed by hybridization to a single-color microarray representing 9522 human genes. Ninety-two genes were shown to be differentially expressed. Differential expression was confirmed in 6 of 38 genes (SKIP, MACF1, SPAG7, FLOT1, c6orf48, and RASSF5) tested using real-time quantitative polymerase chain reaction in a different group of healthy subjects. Genes identified in both studies play a functional role in the cytoskeleton, integrin signaling, and oncosuppression, themes that have been shown to be important in parvovirus infections.
SNP_tools: A compact tool package for analysis and conversion of genotype data for MS-Excel
Chen, Bowang; Wilkening, Stefan; Drechsel, Marion; Hemminki, Kari
2009-01-01
Background Single nucleotide polymorphism (SNP) genotyping is a major activity in biomedical research. Scientists prefer to have easy access to the results, which may require conversions between data formats. First-hand SNP data are often entered or saved in the MS-Excel format, but this software lacks genetic and epidemiological functions. A general tool for basic genetic and epidemiological analysis and data conversion in MS-Excel is needed. Findings The SNP_tools package is prepared as an add-in for MS-Excel. The code is written in Visual Basic for Applications, embedded in the Microsoft Office package. This add-in is an easy-to-use tool for users with basic computer knowledge (and requirements for basic statistical analysis). Conclusion Our implementation for Microsoft Excel 2000-2007 in Microsoft Windows 2000, XP, Vista and Windows 7 beta can handle files in different formats and convert them into other formats. It is free software. PMID:19852806
SNP_tools: A compact tool package for analysis and conversion of genotype data for MS-Excel.
Chen, Bowang; Wilkening, Stefan; Drechsel, Marion; Hemminki, Kari
2009-10-23
Single nucleotide polymorphism (SNP) genotyping is a major activity in biomedical research. Scientists prefer to have easy access to the results, which may require conversions between data formats. First-hand SNP data are often entered or saved in the MS-Excel format, but this software lacks genetic and epidemiological functions. A general tool for basic genetic and epidemiological analysis and data conversion in MS-Excel is needed. The SNP_tools package is prepared as an add-in for MS-Excel. The code is written in Visual Basic for Applications, embedded in the Microsoft Office package. This add-in is an easy-to-use tool for users with basic computer knowledge (and requirements for basic statistical analysis). Our implementation for Microsoft Excel 2000-2007 in Microsoft Windows 2000, XP, Vista and Windows 7 beta can handle files in different formats and convert them into other formats. It is free software.
Sheela, Shekaraiah; Aithal, Venkataraja U; Rajashekhar, Bellur; Lewis, Melissa Glenda
2016-01-01
Tracheoesophageal (TE) prosthetic voice is one of the voice restoration options for individuals who have undergone a total laryngectomy. Aerodynamic analysis of the TE voice provides insight into the physiological changes that occur at the level of the neoglottis with voice prosthesis in situ. The present study is a systematic review and meta-analysis of sub-neoglottic pressure (SNP) measurement in TE speakers by direct and indirect methods. The screening of abstracts and titles was carried out for inclusion of articles using 10 electronic databases spanning the period from 1979 to 2016. Ten articles which met the inclusion criteria were considered for meta-analysis with a pooled age range of 40-83 years. The pooled mean SNP obtained from the direct measurement method was 53.80 cm H2O with a 95% confidence interval of 21.14-86.46 cm H2O, while for the indirect measurement method, the mean SNP was 23.55 cm H2O with a 95% confidence interval of 19.23-27.87 cm H2O. Based on the literature review, the various procedures followed for direct and indirect measurements of SNP contributed to a range of differences in outcome measures. The meta-analysis revealed that the "interpolation method" for indirect estimation of SNP was the most acceptable and valid method in TE speakers. © 2017 S. Karger AG, Basel.
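The pooled estimates above can be sanity-checked with simple arithmetic: a symmetric 95% confidence interval is centred on the pooled mean, and its half-width divided by a critical value gives the implied standard error. The sketch below assumes a normal-approximation interval with z = 1.96, which the abstract does not state; the function name is hypothetical.

```python
def summarize_ci(lower, upper, z=1.96):
    """Return (midpoint, implied standard error) of a symmetric interval."""
    midpoint = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return midpoint, half_width / z  # z = 1.96 is an assumption


# Confidence limits reported in the abstract, in cm H2O.
for label, lo, hi in (("direct", 21.14, 86.46), ("indirect", 19.23, 27.87)):
    mid, se = summarize_ci(lo, hi)
    print(f"{label}: midpoint={mid:.2f}, implied SE={se:.2f}")
```

The midpoints agree with the reported pooled means (53.80 and 23.55 cm H2O), and the much larger implied standard error for the direct method reflects the wide spread among the direct-measurement studies.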
PRACTICAL STRATEGIES FOR PROCESSING AND ANALYZING SPOTTED OLIGONUCLEOTIDE MICROARRAY DATA
Thoughtful data analysis is as important as experimental design, biological sample quality, and appropriate experimental procedures for making microarrays a useful supplement to traditional toxicology. In the present study, spotted oligonucleotide microarrays were used to profile...
Rice SNP-seek database update: new SNPs, indels, and queries.
Mansueto, Locedie; Fuentes, Roven Rommel; Borja, Frances Nikki; Detras, Jeffery; Abriol-Santos, Juan Miguel; Chebotarov, Dmytro; Sanciangco, Millicent; Palis, Kevin; Copetti, Dario; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Wing, Rod A; Hamilton, Ruaraidh Sackville; Mauleon, Ramil; McNally, Kenneth L; Alexandrov, Nickolai
2017-01-04
We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Sengupta Chattopadhyay, Amrita; Hsiao, Ching-Lin; Chang, Chien Ching; Lian, Ie-Bin; Fann, Cathy S J
2014-01-01
Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detect disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods (Gini, absolute probability difference (APD), and entropy) to develop two novel summary scores, namely principal component score (PCS) and Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and the multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no non-parametric method outperformed all others under a variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS and SSS displayed controlled type-I-errors (<0.05) compared to GS, APDS, ES (>0.05). A real data study using the genetic-analysis-workshop 16 (GAW 16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions. © 2013 Elsevier B.V. All rights reserved.
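To make the idea of combining non-parametric scores more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each SNP pair is summarized as case and control counts per genotype-combination cell, that Gini and entropy are the usual weighted impurity measures over those cells, and that a "Z-sum" is obtained by standardizing each score across SNP pairs and adding the z-values. The function names (`gini`, `entropy`, `z_sum`) and these definitions are assumptions for illustration; the abstract does not give the exact formulas.

```python
import math

def gini(cases, controls):
    """Weighted Gini impurity over genotype-combination cells (assumed definition)."""
    total = sum(cases) + sum(controls)
    score = 0.0
    for a, b in zip(cases, controls):
        n = a + b
        if n:
            p = a / n  # proportion of cases in this cell
            score += (n / total) * 2 * p * (1 - p)
    return score

def entropy(cases, controls):
    """Weighted Shannon entropy in bits over the same cells (assumed definition)."""
    total = sum(cases) + sum(controls)
    score = 0.0
    for a, b in zip(cases, controls):
        n = a + b
        for k in (a, b):
            if k:
                score -= (n / total) * (k / n) * math.log2(k / n)
    return score

def z_sum(score_lists):
    """Standardize each score across SNP pairs, then add the z-values per pair."""
    n_pairs = len(score_lists[0])
    totals = [0.0] * n_pairs
    for scores in score_lists:
        mean = sum(scores) / n_pairs
        sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n_pairs - 1))
        for i, s in enumerate(scores):
            totals[i] += (s - mean) / sd
    return totals

# Hypothetical counts for three SNP pairs, two cells each.
pairs = [([10, 0], [0, 10]), ([8, 2], [2, 8]), ([5, 5], [5, 5])]
gini_scores = [gini(c, k) for c, k in pairs]
entropy_scores = [entropy(c, k) for c, k in pairs]
print(z_sum([gini_scores, entropy_scores]))
```

In this toy setup a lower combined value indicates cells that separate cases from controls more cleanly; a real analysis would use all nine genotype combinations per SNP pair and a permutation or simulation step to assess significance.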
Silver nano fabrication using leaf disc of Passiflora foetida Linn
NASA Astrophysics Data System (ADS)
Lade, Bipin D.; Patil, Anita S.
2017-06-01
The main purpose of the experiment is to develop greener, low-cost silver nanoparticle (SNP) fabrication steps using the secondary metabolites of Passiflora leaves. Here, the leaf extraction process is omitted; instead, a leaf disc was used for stable SNP fabrication, optimizing parameters such as a circular leaf disc of 2 cm (1, 2, 3, 4, 5) in place of leaf extract, and pH (7, 8, 9, 11). The SNP synthesis reaction was tried at room temperature and under sun, UV and dark conditions. The leaf disc preparation steps are also discussed in detail. The SNP obtained using (1 mM: 100 ml AgNO3 + single leaf disc: pH 9, 11) was evaluated under the room temperature and sun conditions. UV spectroscopic analysis confirms that SNP synthesized under sun rays yields stable nanoparticles. FTIR analysis confirms that a large number of functional groups, such as alkanes, alkynes, amines, aliphatic amines, carboxylic acids, nitro compounds, alcohols, saturated aldehydes and phenols, are involved in the reduction of the silver salt to zero-valent ions. Leaf disc mediated synthesis of silver nanoparticles minimizes the leaf extract preparation step and is suitable for stable SNP synthesis. The nanoparticles synthesized within 10 min by the sun and room temperature methods would certainly be of use for antimicrobial activity.
Duellman, Tyler; Warren, Christopher; Yang, Jay
2014-01-01
Microribonucleic acids (miRNAs) work with exquisite specificity and are able to distinguish a target from a non-target based on a single nucleotide mismatch in the core nucleotide domain. We questioned whether miRNA regulation of gene expression could occur in a single nucleotide polymorphism (SNP)-specific manner, manifesting as a post-transcriptional control of expression of genetic polymorphisms. In our recent study of the functional consequences of matrix metalloproteinase (MMP)-9 SNPs, we discovered that expression of a coding exon SNP in the pro-domain of the protein resulted in a profound decrease in the secreted protein. This missense SNP results in the N38S amino acid change and a loss of an N-glycosylation site. A systematic study demonstrated that the loss of secreted protein was due not to the loss of an N-glycosylation site, but rather to SNP-specific targeting by miR-671-3p and miR-657. Bioinformatics analysis identified 41 SNP-specific miRNAs targeting MMP-9 SNPs, mostly in the coding exon, and an extension of the analysis to chromosome 20, where the MMP-9 gene is located, suggested that SNP-specific miRNAs targeting the coding exon are prevalent. This selective post-transcriptional regulation of a target messenger RNA harboring genetic polymorphisms by miRNAs offers an SNP-dependent post-transcriptional regulatory mechanism, allowing for polymorphic-specific differential gene regulation. PMID:24627221
Lavania, M; Jadhav, R S; Turankar, R P; Chaitanya, V S; Singh, M; Sengupta, U
2013-11-01
Earlier studies indicate that genotyping of Mycobacterium leprae based on single-nucleotide polymorphisms (SNPs) is useful for analysis of the global spread of leprosy. In the present study, we investigated the diversity of M. leprae at eight SNP loci using 180 clinical isolates obtained from patients with leprosy residing mainly in Delhi and Purulia (West Bengal) regions. It was observed that the frequency of SNP type 1 and subtype D was most predominant in the Indian population. Further, the SNP type 2 subtype E was noted only from East Delhi region and SNP type 2 subtype G was noted only from the nearby areas of Hoogly district of West Bengal. These results indicate the occurrence of focal transmission of M. leprae infection and demonstrate that analysis by SNP typing has great potential to help researchers in understanding the transmission of M. leprae infection in the community. © 2013 The Authors Clinical Microbiology and Infection © 2013 European Society of Clinical Microbiology and Infectious Diseases.
Bungartz, Annemarie; Klaus, Marius; Mathew, Boby; Léon, Jens; Naz, Ali Ahmad
2016-03-01
The aim of the present study was to develop a new cost-effective PCR-based CAPS marker set using the advantages of high-throughput SNP genotyping. Initially, an SNP survey was made using 20 diverse barley genotypes via 9k iSelect array genotyping, which resulted in 6334 polymorphic SNP markers. Principal component analysis using this marker data showed fine differentiation of the diverse barley gene pool. To this end, we developed 200 SNP-derived CAPS markers distributed across the genome, covering around 991cM with an average marker density of 5.09cM. Further, we genotyped 68 CAPS markers in an F2 population (Cheri×ICB181160) segregating for seed color variation in barley. Genetic mapping of seed color revealed putative linkage of a single nuclear gene on chromosome 1H. These findings provide proof of concept for the development and utility of a newer cost-effective genomic tool kit to analyze broader genetic resources of barley worldwide. Copyright © 2016 Elsevier Inc. All rights reserved.
Dar, Sajad Ahmad; Akhter, Naseem; Haque, Shafiul; Singh, Taru; Mandal, Raju Kumar; Ramachandran, Vishnampettai Ganapathysubramanian; Bhattacharya, Sambit Nath; Banerjee, Basu Dev; Das, Shukla
2016-01-01
Pemphigus is an autoimmune blistering disorder of skin and/or mucosal surfaces characterized by intraepithelial lesions and immunoglobulin-G autoantibodies against desmogleins (proteins critical in cell-to-cell adhesion). Genetic, immunological, hormonal, and environmental factors are known to contribute to its etiology. Tumor necrosis factor-alpha (TNF-α), which plays a key role in the pathogenesis of many infectious and inflammatory diseases, has been found in high levels in lesional skin and sera of pemphigus patients. However, studies on the association of the single nucleotide polymorphism (SNP) in the promoter region of TNF-α at position -308, affecting a G to A transition, with pemphigus have been scarce. This study was conducted to evaluate the TNF-α -308G/A SNP distribution in a North Indian cohort, and to define the association between the TNF-α -308G/A SNP distribution and pemphigus, globally, by means of meta-analysis. TNF-α -308G/A SNP in pemphigus patients was investigated by cytokine genotyping using genomic DNA by PCR with sequence-specific primers. Meta-analysis of the data, including four previously published studies from other populations, was performed to generate a meaningful relationship. The results of our case-control study indicate non-significant differences between patients and controls in TNF-α -308G/A SNP. The meta-analysis also revealed that TNF-α -308G/A SNP is not associated with pemphigus risk in the population at large; however, it may be contributing towards the autoimmune phenomenon in pemphigus by being a part of its multi-factorial etiology. This study provides evidence that the TNF-α -308G/A polymorphism is not associated with overall pemphigus susceptibility. Nevertheless, further studies on specific ethnicity and pemphigus variants are necessary to validate the findings.
Soler, Stephan; Rittore, Cécile; Touitou, Isabelle; Philibert, Laurent
2011-02-20
From the wide range of methods currently available for genotyping, we wished to identify a quick, reliable and affordable approach for routine use in our laboratory for LTA+252 C>T SNP screening. We set up and compared three genotyping methods for SNP detection: restriction fragment length polymorphism (RFLP), tetra primer amplification refractory mutation system PCR (TPAP) and unlabeled probe melting analysis (UPMA). The SNP model used was LTA+252 C>T, a cytokine gene polymorphism that has been associated with response to treatment in rheumatoid arthritis. The study was performed using 46 samples from healthy Caucasian volunteers. Allele and genotype distribution was similar to that previously described in the same population. All three genotyping methods showed good reproducibility and are suitable for a medium scale throughput molecular platform. UPMA was the most cost effective, reliable and safe method since it required the shortest technician time, could be performed in a single closed tube and involved automatic data analysis. This work is the first to compare these three genotyping techniques and provides evidence for UPMA being the method of choice for LTA+252 C>T SNP genotyping. Copyright © 2010 Elsevier B.V. All rights reserved.
GeneXplorer: an interactive web application for microarray data visualization and analysis.
Rees, Christian A; Demeter, Janos; Matese, John C; Botstein, David; Sherlock, Gavin
2004-10-01
When publishing large-scale microarray datasets, it is of great value to create supplemental websites where either the full data, or selected subsets corresponding to figures within the paper, can be browsed. We set out to create a CGI application containing many of the features of some of the existing standalone software for the visualization of clustered microarray data. We present GeneXplorer, a web application for interactive microarray data visualization and analysis in a web environment. GeneXplorer allows users to browse a microarray dataset in an intuitive fashion. It provides simple access to microarray data over the Internet and uses only HTML and JavaScript to display graphic and annotation information. It provides radar and zoom views of the data, allows display of the nearest neighbors to a gene expression vector based on their Pearson correlations and provides the ability to search gene annotation fields. The software is released under the permissive MIT Open Source license, and the complete documentation and the entire source code are freely available for download from CPAN http://search.cpan.org/dist/Microarray-GeneXplorer/.
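The nearest-neighbor display described above ranks genes by the Pearson correlation of their expression vectors. The following is a minimal Python sketch of that ranking, offered as an illustration only: it is not GeneXplorer code, and the function names and the toy `profiles` data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def nearest_neighbors(query, profiles, k=2):
    """Return the k gene names whose profiles correlate best with `query`."""
    scores = [(pearson(profiles[query], vec), name)
              for name, vec in profiles.items() if name != query]
    scores.sort(reverse=True)  # highest correlation first
    return [name for _, name in scores[:k]]

# Hypothetical toy data: log-ratios for four genes across four arrays.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # mirror image of geneA
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
print(nearest_neighbors("geneA", profiles, k=2))
```

Because the Pearson correlation ignores scale and offset, geneB ranks first even though its values are roughly twice those of geneA, while the anti-correlated geneC ranks last.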
Nanotechnology: moving from microarrays toward nanoarrays.
Chen, Hua; Li, Jun
2007-01-01
Microarrays are important tools for high-throughput analysis of biomolecules. The use of microarrays for parallel screening of nucleic acid and protein profiles has become an industry standard. A few limitations of microarrays are the requirement for relatively large sample volumes and elongated incubation time, as well as the limit of detection. In addition, traditional microarrays make use of bulky instrumentation for the detection, and sample amplification and labeling are quite laborious, which increases analysis cost and delays the time for obtaining results. These problems limit microarray techniques from point-of-care and field applications. One strategy for overcoming these problems is to develop nanoarrays, particularly electronics-based nanoarrays. With further miniaturization, higher sensitivity, and simplified sample preparation, nanoarrays could potentially be employed for biomolecular analysis in personal healthcare and monitoring of trace pathogens. In this chapter, it is intended to introduce the concept and advantage of nanotechnology and then describe current methods and protocols for novel nanoarrays in three aspects: (1) label-free nucleic acids analysis using nanoarrays, (2) nanoarrays for protein detection by conventional optical fluorescence microscopy as well as by novel label-free methods such as atomic force microscopy, and (3) nanoarray for enzymatic-based assay. These nanoarrays will have significant applications in drug discovery, medical diagnosis, genetic testing, environmental monitoring, and food safety inspection.
A meta-data based method for DNA microarray imputation.
Jörnsten, Rebecka; Ouyang, Ming; Wang, Hui-Yu
2007-03-29
DNA microarray experiments are conducted in logical sets, such as time course profiling after a treatment is applied to the samples, or comparisons of the samples under two or more conditions. Due to cost and design constraints of spotted cDNA microarray experiments, each logical set commonly includes only a small number of replicates per condition. Despite the vast improvement of the microarray technology in recent years, missing values are prevalent. Intuitively, imputation of missing values is best done using many replicates within the same logical set. In practice, there are few replicates and thus reliable imputation within logical sets is difficult. However, it is in the case of few replicates that the presence of missing values, and how they are imputed, can have the most profound impact on the outcome of downstream analyses (e.g. significance analysis and clustering). This study explores the feasibility of imputation across logical sets, using the vast amount of publicly available microarray data to improve imputation reliability in the small sample size setting. We download all cDNA microarray data of Saccharomyces cerevisiae, Arabidopsis thaliana, and Caenorhabditis elegans from the Stanford Microarray Database. Through cross-validation and simulation, we find that, for all three species, our proposed imputation using data from public databases is far superior to imputation within a logical set, sometimes to an astonishing degree. Furthermore, the imputation root mean square error for significant genes is generally a lot less than that of non-significant ones. Since downstream analysis of significant genes, such as clustering and network analysis, can be very sensitive to small perturbations of estimated gene effects, it is highly recommended that researchers apply reliable data imputation prior to further analysis. Our method can also be applied to cDNA microarray experiments from other species, provided good reference data are available.
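The cross-validation idea used to compare imputation strategies can be sketched as follows: hide entries whose true values are known, impute them, and report the root mean square error (RMSE) on the hidden entries. This Python sketch is an illustration under stated assumptions; the row-mean imputer is a simple stand-in, not the meta-data based method of the paper, and the function names are hypothetical.

```python
import math

def row_mean_impute(matrix):
    """Fill None entries with the mean of the observed values in that row."""
    filled = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        mean = sum(observed) / len(observed)
        filled.append([mean if v is None else v for v in row])
    return filled

def masked_rmse(truth, masked_positions, imputer):
    """Hide known entries, impute them, and score against the true values."""
    masked = [list(row) for row in truth]  # copy so `truth` is untouched
    for i, j in masked_positions:
        masked[i][j] = None
    imputed = imputer(masked)
    sq = [(imputed[i][j] - truth[i][j]) ** 2 for i, j in masked_positions]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical toy matrix: rows are genes, columns are replicates.
truth = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
print(masked_rmse(truth, [(0, 2), (1, 0)], row_mean_impute))
```

Any other imputer with the same signature (for example, one that borrows information from a reference collection of arrays) can be passed in place of `row_mean_impute`, which makes it straightforward to compare strategies on the same masked positions.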
2010-01-01
Background The development of DNA microarrays has facilitated the generation of hundreds of thousands of transcriptomic datasets. The use of a common reference microarray design allows existing transcriptomic data to be readily compared and re-analysed in the light of new data, and the combination of this design with large datasets is ideal for 'systems'-level analyses. One issue is that these datasets are typically collected over many years and may be heterogeneous in nature, containing different microarray file formats and gene array layouts, dye-swaps, and showing varying scales of log2-ratios of expression between microarrays. Excellent software exists for the normalisation and analysis of microarray data but many data have yet to be analysed as existing methods struggle with heterogeneous datasets; options include normalising microarrays on an individual or experimental group basis. Our solution was to develop the Batch Anti-Banana Algorithm in R (BABAR) algorithm and software package which uses cyclic loess to normalise across the complete dataset. We have already used BABAR to analyse the function of Salmonella genes involved in the process of infection of mammalian cells. Results The only input required by BABAR is unprocessed GenePix or BlueFuse microarray data files. BABAR provides a combination of 'within' and 'between' microarray normalisation steps and diagnostic boxplots. When applied to a real heterogeneous dataset, BABAR normalised the dataset to produce a comparable scaling between the microarrays, with the microarray data in excellent agreement with RT-PCR analysis. When applied to a real non-heterogeneous dataset and a simulated dataset, BABAR's performance in identifying differentially expressed genes showed some benefits over standard techniques. Conclusions BABAR is an easy-to-use software tool, simplifying the simultaneous normalisation of heterogeneous two-colour common reference design cDNA microarray-based transcriptomic datasets. We show BABAR transforms real and simulated datasets to allow for the correct interpretation of these data, and is the ideal tool to facilitate the identification of differentially expressed genes or network inference analysis from transcriptomic datasets. PMID:20128918
USDA-ARS's Scientific Manuscript database
Stemphylium leaf spot, caused by Stemphylium botryosum f. sp. spinacia is an important disease in spinach. Use of genetic resistance is an efficient, economic and environment-friendly method to control this disease. The objective of this research was to conduct association analysis and identify SNP ...
Yang, Zhe; Zhou, Lin; Wu, Li-Ming; Xie, Hai-Yang; Zhang, Feng; Zheng, Shu-Sen
2010-12-01
Histone deacetylases (HDACs) have been reported to be poor prognostic indicators in patients with cancer. However, no data are available for the role of single nucleotide polymorphism (SNP) of class I HDAC in hepato-cellular carcinoma (HCC). Therefore, we investigated the association of class I HDAC isoforms genomic polymorphisms with risk of HCC and tumor recurrence following liver transplantation (LT). One hundred and ninety-six Chinese subjects consisting of 97 HCC patients and 99 controls were enrolled in this study. Nine polymorphisms of the HDAC1, HDAC2, and HDAC3 gene (rs2530223, rs1741981, rs2547547, rs13204445, rs6568819, rs10499080, rs11741808, rs2475631, rs11391) were examined using Applied Biosystems SNaP-Shot and TaqMan technology. We found no significant difference in genotype frequencies between the HCC cases and controls. In terms of tumor recurrence following LT, patients carrying the T allele of HDAC1 SNP rs1741981 showed a favorable outcome for recurrence free survival when compared with patients homozygous for CC. In addition, the same significant trend was observed in HDAC3 SNP rs2547547. Kaplan-Meier analysis showed that the combination of the T variant allele (CT+TT) of HDAC1 SNP rs1741981 and the homozygous TT variant allele of HDAC3 SNP rs2547547 was the most favorable prognostic factor. The risk for postoperative tumor recurrence was about 2.2-fold lower for patients with this genotype combination compared with carriers of the HDAC1 SNP rs1741981 CC and HDAC3 SNP rs2547547 CT genotype combination (hazard ratio: 2.235, p=0.003). Our data suggest that combined analysis of HDAC1 SNP rs1741981 and HDAC3 SNP rs2547547 may be a potential genetic marker for HCC recurrence in LT patients.
Bu, Huajie; Narisu, Narisu; Schlick, Bettina; Rainer, Johannes; Manke, Thomas; Schäfer, Georg; Pasqualini, Lorenza; Chines, Peter; Schweiger, Michal R.; Fuchsberger, Christian
2015-01-01
ABSTRACT Genome‐wide association studies have identified genomic loci, whose single‐nucleotide polymorphisms (SNPs) predispose to prostate cancer (PCa). However, the mechanisms of most of these variants are largely unknown. We integrated chromatin‐immunoprecipitation‐coupled sequencing and microarray expression profiling in TMPRSS2‐ERG gene rearrangement positive DUCaP cells with the GWAS PCa risk SNPs catalog to identify disease susceptibility SNPs localized within functional androgen receptor‐binding sites (ARBSs). Among the 48 GWAS index risk SNPs and 3,917 linked SNPs, 80 were found located in ARBSs. Of these, rs11891426:T>G in an intron of the melanophilin gene (MLPH) was within a novel putative auxiliary AR‐binding motif, which is enriched in the neighborhood of canonical androgen‐responsive elements. T→G exchange attenuated the transcriptional activity of the ARBS in an AR reporter gene assay. The expression of MLPH in primary prostate tumors was significantly lower in those with the G compared with the T allele and correlated significantly with AR protein. Higher melanophilin level in prostate tissue of patients with a favorable PCa risk profile points out a tumor‐suppressive effect. These results unravel a hidden link between AR and a functional putative PCa risk SNP, whose allele alteration affects androgen regulation of its host gene MLPH. PMID:26411452
Genome-wide Target Enrichment-aided Chip Design: a 66 K SNP Chip for Cashmere Goat.
Qiao, Xian; Su, Rui; Wang, Yang; Wang, Ruijun; Yang, Ting; Li, Xiaokai; Chen, Wei; He, Shiyang; Jiang, Yu; Xu, Qiwu; Wan, Wenting; Zhang, Yaolei; Zhang, Wenguang; Chen, Jiang; Liu, Bin; Liu, Xin; Fan, Yixing; Chen, Duoyuan; Jiang, Huaizhi; Fang, Dongming; Liu, Zhihong; Wang, Xiaowen; Zhang, Yanjun; Mao, Danqing; Wang, Zhiying; Di, Ran; Zhao, Qianjun; Zhong, Tao; Yang, Huanming; Wang, Jian; Wang, Wen; Dong, Yang; Chen, Xiaoli; Xu, Xun; Li, Jinquan
2017-08-17
Compared with the commercially available single nucleotide polymorphism (SNP) chip based on the Bead Chip technology, the solution hybrid selection (SHS)-based target enrichment SNP chip is not only design-flexible, but also cost-effective for genotype sequencing. In this study, we propose to design an animal SNP chip using the SHS-based target enrichment strategy for the first time. As an update to the international collaboration on goat research, a 66 K SNP chip for cashmere goat was created from the whole-genome sequencing data of 73 individuals. Verification of this 66 K SNP chip with the whole-genome sequencing data of 436 cashmere goats showed that the SNP call rate was between 95.3% and 99.8%. The average sequencing depth for target SNPs was 40X. The capture regions were shown to be the 200 bp that flank target SNPs. This chip was further tested in a genome-wide association analysis of cashmere fineness (fiber diameter). Several top hit loci were found marginally associated with signaling pathways involved in hair growth. These results demonstrate that the 66 K SNP chip is a useful tool in the genomic analyses of cashmere goats. The successful chip design shows that the SHS-based target enrichment strategy could be applied to SNP chip design in other species.
van Huet, Ramon A. C.; Pierrache, Laurence H.M.; Meester-Smoor, Magda A.; Klaver, Caroline C.W.; van den Born, L. Ingeborgh; Hoyng, Carel B.; de Wijs, Ilse J.; Collin, Rob W. J.; Hoefsloot, Lies H.
2015-01-01
Purpose To determine the efficacy of multiple versions of a commercially available arrayed primer extension (APEX) microarray chip for autosomal recessive retinitis pigmentosa (arRP). Methods We included 250 probands suspected of arRP who were genetically analyzed with the APEX microarray between January 2008 and November 2013. The mode of inheritance had to be autosomal recessive according to the pedigree (including isolated cases). If the microarray identified a heterozygous mutation, we performed Sanger sequencing of exons and exon–intron boundaries of that specific gene. The efficacy of this microarray chip with the additional Sanger sequencing approach was determined by the percentage of patients that received a molecular diagnosis. We also collected data from genetic tests other than the APEX analysis for arRP to provide a detailed description of the molecular diagnoses in our study cohort. Results The APEX microarray chip for arRP identified the molecular diagnosis in 21 (8.5%) of the patients in our cohort. Additional Sanger sequencing yielded a second mutation in 17 patients (6.8%), thereby establishing the molecular diagnosis. In total, 38 patients (15.2%) received a molecular diagnosis after analysis using the microarray and additional Sanger sequencing approach. Further genetic analyses after a negative result of the arRP microarray (n = 107) resulted in a molecular diagnosis of arRP (n = 23), autosomal dominant RP (n = 5), X-linked RP (n = 2), and choroideremia (n = 1). Conclusions The efficacy of the commercially available APEX microarray chips for arRP appears to be low, most likely caused by the limitations of this technique and the genetic and allelic heterogeneity of RP. Diagnostic yields up to 40% have been reported for next-generation sequencing (NGS) techniques that, as expected, thereby outperform targeted APEX analysis. PMID:25999674
Mitchell, Sarah G; Bunting, Silvia T; Saxe, Debra; Olson, Thomas; Keller, Frank G
2017-04-01
An activating point mutation of the c-KIT tyrosine kinase receptor gene, D816H, has been described in germ cell tumors (GCTs). We report an adolescent diagnosed with an ovarian mixed GCT and systemic mastocytosis with chronic myelomonocytic leukemia (SM-CMML). The teratoma and dysgerminoma differed by copy number aberrations via single nucleotide polymorphism (SNP) microarray, but were inclusive of the same c-KIT D816H point mutation (c.2446G>C) also identified in blood and bone marrow mast cells. These findings indicate not only a clonal origin of the GCT and hematologic malignancy, but also suggest a rare KIT mutation may be playing a fundamental role in malignancy development. © 2016 Wiley Periodicals, Inc.
A global reference for human genetic variation
2016-01-01
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. PMID:26432245
Dapia, Irene; Tong, Hoi Y.; Arias, Pedro; Muñoz, Mario; Tenorio, Jair; Hernández, Rafael; García García, Irene; Gordo, Gema; Ramírez, Elena; Frías, Jesús; Lapunzina, Pablo; Carcas, Antonio J.
2017-01-01
Abstract In 2014, we established a pharmacogenetics unit with the intention of facilitating the integration of pharmacogenetic testing into clinical practice. This unit was centered around two main ideas: i) individualization of clinical recommendations, and ii) preemptive genotyping in risk populations. Our unit is based on the design and validation of a single nucleotide polymorphism (SNP) microarray, which has allowed testing of 180 SNPs associated with drug response (PharmArray), and clinical consultation regarding the results. Herein, we report our experience in integrating pharmacogenetic testing into our hospital and we present the results of the 2,539 pharmacogenetic consultation requests received over the past 3 years in our unit. The results demonstrate the feasibility of implementing pharmacogenetic testing in clinical practice within a national health system. PMID:29193749
Principles of gene microarray data analysis.
Mocellin, Simone; Rossi, Carlo Riccardo
2007-01-01
The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.
Schönmann, Susan; Loy, Alexander; Wimmersberger, Céline; Sobek, Jens; Aquino, Catharine; Vandamme, Peter; Frey, Beat; Rehrauer, Hubert; Eberl, Leo
2009-04-01
For cultivation-independent and highly parallel analysis of members of the genus Burkholderia, an oligonucleotide microarray (phylochip) consisting of 131 hierarchically nested 16S rRNA gene-targeted oligonucleotide probes was developed. A novel primer pair was designed for selective amplification of a 1.3 kb 16S rRNA gene fragment of Burkholderia species prior to microarray analysis. The diagnostic performance of the microarray for identification and differentiation of Burkholderia species was tested with 44 reference strains of the genera Burkholderia, Pandoraea, Ralstonia and Limnobacter. Hybridization patterns based on presence/absence of probe signals were interpreted semi-automatically using the novel likelihood-based strategy of the web-tool Phylo-Detect. Eighty-eight per cent of the reference strains were correctly identified at the species level. The evaluated microarray was applied to investigate shifts in the Burkholderia community structure in acidic forest soil upon addition of cadmium, a condition that selected for Burkholderia species. The microarray results were in agreement with those obtained from phylogenetic analysis of Burkholderia 16S rRNA gene sequences recovered from the same cadmium-contaminated soil, demonstrating the value of the Burkholderia phylochip for determinative and environmental studies.
Support vector machine and principal component analysis for microarray data classification
NASA Astrophysics Data System (ADS)
Astuti, Widi; Adiwijaya
2018-03-01
Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, microarray technology has taken on an important role in the diagnosis of cancer. By using data mining techniques, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. Microarray data are characterized by a small number of samples but a very high dimension. This poses a challenge for researchers to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposed the use of Principal Component Analysis (PCA) as a dimension reduction method along with a Support Vector Machine (SVM) optimized by kernel functions as a classifier for microarray data classification. The proposed scheme was applied to seven data sets using 5-fold cross validation, and evaluation and analysis were then conducted in terms of both accuracy and running time. The results showed that the scheme obtained 100% accuracy for the Ovarian and Lung Cancer data when Linear and Cubic kernel functions were used. In terms of running time, PCA greatly reduced the running time for every data set.
Welderufael, B G; Løvendahl, Peter; de Koning, Dirk-Jan; Janss, Lucas L G; Fikse, W F
2018-01-01
Because mastitis is very frequent and unavoidable, adding recovery information to the analysis for genetic evaluation of mastitis is of great interest from both an economic and an animal welfare point of view. Here we have performed genome-wide association studies (GWAS) to identify associated single nucleotide polymorphisms (SNPs) and investigate the genetic background not only for susceptibility to, but also for recoverability from, mastitis. Somatic cell count records from 993 Danish Holstein cows genotyped for a total of 39378 autosomal SNP markers were used for the association analysis. Single SNP regression analysis was performed using the statistical software package DMU. The substitution effect of each SNP was tested with a t-test, and a genome-wide significance level of P-value < 10⁻⁴ was used to declare significant SNP-trait association. A number of significant SNP variants were identified for both traits. Many of the SNP variants associated with either susceptibility to or recoverability from mastitis were located in or very near to genes that have been reported for their role in the immune system. Genes involved in lymphocyte development (e.g., MAST3 and STAB2) and genes involved in macrophage recruitment and regulation of inflammation (PDGFD and PTX3) were suggested as possible causal genes for susceptibility to and recoverability from mastitis, respectively. However, this is the first GWAS study for recoverability from mastitis and our results need to be validated. The findings in the current study are, therefore, a starting point for further investigations in identifying causal genetic variants or chromosomal regions for both susceptibility to and recoverability from mastitis.
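The single-SNP regression and t-test described in this abstract can be sketched roughly as follows. This is not the DMU implementation: it is a minimal ordinary least-squares illustration that ignores the fixed and random effects a real genetic evaluation model would include. The function name `snp_t_stat` and the toy data are assumptions made only for this example.

```python
import math

def snp_t_stat(dosages, phenotypes):
    """Regress phenotype on allele dosage (0, 1 or 2 copies) by least squares.

    Returns (slope, t_statistic, degrees_of_freedom). The slope plays the
    role of the allele substitution effect; the t statistic tests slope == 0.
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(dosages, phenotypes))
    df = n - 2
    std_err = math.sqrt(rss / df / sxx)
    if std_err == 0:
        raise ValueError("perfect fit: t statistic is undefined")
    return slope, slope / std_err, df

# Hypothetical toy data: six animals, dosage of the minor allele and a trait value.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_trait = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, t_value, df = snp_t_stat(toy_dosages, toy_trait)
print(slope, t_value, df)
```

The P-value would then be read from a t distribution with n − 2 degrees of freedom and compared with the 10⁻⁴ threshold; the Python standard library has no t distribution, so that step is left out here.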
Experimental Approaches to Microarray Analysis of Tumor Samples
ERIC Educational Resources Information Center
Furge, Laura Lowe; Winter, Michael B.; Meyers, Jacob I.; Furge, Kyle A.
2008-01-01
Comprehensive measurement of gene expression using high-density nucleic acid arrays (i.e. microarrays) has become an important tool for investigating the molecular differences in clinical and research samples. Consequently, inclusion of discussion in biochemistry, molecular biology, or other appropriate courses of microarray technologies has…
Multiplex cDNA quantification method that facilitates the standardization of gene expression data
Gotoh, Osamu; Murakami, Yasufumi; Suyama, Akira
2011-01-01
Microarray-based gene expression measurement is one of the major methods for transcriptome analysis. However, current microarray data are substantially affected by microarray platforms and RNA references because the microarray method can provide merely the relative amounts of gene expression levels. Therefore, valid comparisons of the microarray data require standardized platforms, internal and/or external controls and complicated normalizations. These requirements impose limitations on the extensive comparison of gene expression data. Here, we report an effective approach to removing the unfavorable limitations by measuring the absolute amounts of gene expression levels on common DNA microarrays. We have developed a multiplex cDNA quantification method called GEP-DEAN (Gene expression profiling by DCN-encoding-based analysis). The method was validated by using chemically synthesized DNA strands of known quantities and cDNA samples prepared from mouse liver, demonstrating that the absolute amounts of cDNA strands were successfully measured with a sensitivity of 18 zmol in a highly multiplexed manner in 7 h. PMID:21415008
Spot detection and image segmentation in DNA microarray data.
Qin, Li; Rueda, Luis; Ali, Adnan; Ngom, Alioune
2005-01-01
Following the invention of microarrays in 1994, the development and applications of this technology have grown exponentially. The numerous applications of microarray technology include clinical diagnosis and treatment, drug design and discovery, tumour detection, and environmental health research. One of the key issues in the experimental approaches utilising microarrays is to extract quantitative information from the spots, which represent genes in a given experiment. For this process, the initial stages are important and they influence future steps in the analysis. Identifying the spots and separating the background from the foreground is a fundamental problem in DNA microarray data analysis. In this review, we present an overview of state-of-the-art methods for microarray image segmentation. We discuss the foundations of the circle-shaped approach, adaptive shape segmentation, histogram-based methods and the recently introduced clustering-based techniques. We analytically show that clustering-based techniques are equivalent to the one-dimensional, standard k-means clustering algorithm that utilises the Euclidean distance.
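To make the link between clustering-based segmentation and one-dimensional k-means concrete, the sketch below runs Lloyd's algorithm with k = 2 on raw pixel intensities and labels the brighter cluster as spot foreground. It is only an illustration under simple assumptions (a flat list of intensities, centroids initialised at the extreme values); the function names and pixel values are hypothetical and do not come from the review.

```python
def kmeans_1d(values, k=2, max_iter=100):
    """Lloyd's k-means on scalar values using absolute (1-D Euclidean) distance."""
    distinct = sorted(set(values))
    if k < 2 or len(distinct) < k:
        raise ValueError("need k >= 2 and at least k distinct values")
    # Spread the initial centroids evenly over the sorted distinct values.
    centroids = [float(distinct[i * (len(distinct) - 1) // (k - 1)])
                 for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: abs(v - centroids[j]))
                      for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels

def segment_spot(pixels):
    """Return a boolean mask: True where a pixel falls in the brighter cluster."""
    centroids, labels = kmeans_1d(pixels, k=2)
    foreground = centroids.index(max(centroids))
    return [lab == foreground for lab in labels]

# Hypothetical intensities from a small patch around one spot.
patch = [10, 12, 11, 200, 210, 205, 9, 198]
print(segment_spot(patch))
```

A real pipeline would work on two-dimensional image patches and usually on both channels; the flat list keeps the equivalence with scalar k-means easy to see.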
Split-plot microarray experiments: issues of design, power and sample size.
Tsai, Pi-Wen; Lee, Mei-Ling Ting
2005-01-01
This article focuses on microarray experiments with two or more factors in which treatment combinations of the factors corresponding to the samples paired together onto arrays are not completely random. A main effect of one (or more) factor(s) is confounded with arrays (the experimental blocks). This is called a split-plot microarray experiment. We utilise an analysis of variance (ANOVA) model to assess differentially expressed genes for between-array and within-array comparisons that are generic under a split-plot microarray experiment. Instead of standard t- or F-test statistics that rely on mean square errors of the ANOVA model, we use a robust method, referred to as 'a pooled percentile estimator', to identify genes that are differentially expressed across different treatment conditions. We illustrate the design and analysis of split-plot microarray experiments based on a case application described by Jin et al. A brief discussion of power and sample size for split-plot microarray experiments is also presented.
Analysis of SNP rs16754 of WT1 gene in a series of de novo acute myeloid leukemia patients.
Luna, Irene; Such, Esperanza; Cervera, Jose; Barragán, Eva; Jiménez-Velasco, Antonio; Dolz, Sandra; Ibáñez, Mariam; Gómez-Seguí, Inés; López-Pavía, María; Llop, Marta; Fuster, Óscar; Oltra, Silvestre; Moscardó, Federico; Martínez-Cuadrón, David; Senent, M Leonor; Gascón, Adriana; Montesinos, Pau; Martín, Guillermo; Bolufer, Pascual; Sanz, Miguel A
2012-12-01
The single nucleotide polymorphism (SNP) rs16754 of the WT1 gene has been previously described as a possible prognostic marker in normal karyotype acute myeloid leukemia (AML) patients. Nevertheless, the findings in this field are not always reproducible in different series. One hundred and seventy-five adult de novo AML patients were screened with two different methods for the detection of SNP rs16754: high-resolution melting (HRM) and FRET hybridization probes. Direct sequencing was used to validate both techniques. The SNP was detected in 52 out of 175 patients (30%), both by HRM and hybridization probes. Direct sequencing confirmed that every positive sample in the screening methods had a variation in the DNA sequence. Patients with the wild-type genotype (WT1(AA)) for the SNP rs16754 were significantly younger than those with the heterozygous WT1(AG) genotype. No other difference was observed in baseline characteristics or outcome between patients with or without the SNP. Both techniques are equally reliable and reproducible as screening methods for the detection of the SNP rs16754, allowing for the selection of those samples that will need to be sequenced. We were unable to confirm the suggested favorable outcome of SNP rs16754 in de novo AML.
Activity study of biogenic spherical silver nanoparticles towards microbes and oxidants
NASA Astrophysics Data System (ADS)
Hoskote Anand, Kiran Kumar; Mandal, Badal Kumar
2015-01-01
The eco-friendly approach for the green synthesis of silver nanoparticles (SNP) using Terminalia bellirica (T. bellirica) fruit extract is reported herein. Initially formation of SNP was noticed through visual color change from yellow to reddish brown and further analyzed by surface plasmonic resonance (SPR) band at 429 nm using UV-Vis spectroscopy. Identification of different polyphenols present in T. bellirica extract was done using High Pressure Liquid Chromatography (HPLC). Aqueous T. bellirica extract contains high amount of gallic acid which is major secondary metabolite responsible for the reduction and stabilization process. It was established by analyses of extracts before and after reduction using HPLC. Formation of spherical SNP was characterized by Transmission Electron Microscopy (TEM) analysis. X-ray Diffraction (XRD) study revealed crystalline nature of SNP. Presence of different functional groups on the surface of SNP was evidenced by Fourier Transform Infrared Spectroscopy (FTIR) study. A plausible mechanism of reduction and stabilization processes involved in the synthesis of stable SNP was also explained based on HPLC and FTIR data. In addition, the synthesized SNP was tested for antibacterial and antioxidant activities. SNP showed good antimicrobial activity against both gram positive (S. aureus) and gram negative (E. coli) bacteria. It also showed good antioxidant activity compared to ascorbic acid as standard antioxidant by using standard DPPH method.
Scurrah, Katrina J; Lamantia, Angela; Ellis, Justine A; Harrap, Stephen B
2017-06-01
Renin-angiotensin-aldosterone system genes have been inconsistently associated with blood pressure, possibly because of unrecognized influences of sex-dependent genetic effects or gene-gene interactions (epistasis). We tested association of systolic blood pressure with single-nucleotide polymorphisms (SNPs) at renin (REN), angiotensinogen (AGT), angiotensin-converting enzyme (ACE), angiotensin II type 1 receptor (AGTR1), and aldosterone synthase (CYP11B2), including sex-SNP or SNP-SNP interactions. Eighty-eight tagSNPs were tested in 2872 white individuals in 809 pedigrees from the Victorian Family Heart Study using variance components models. Three SNPs (rs8075924 and rs4277404 at ACE and rs12721297 at AGTR1) were individually associated with lower systolic blood pressure with significant (P<0.00076) effect sizes ≈1.7 to 2.5 mm Hg. Sex-specific associations were seen for 3 SNPs in men (rs2468523 and rs2478544 at AGT and rs11658531 at ACE) and 1 SNP in women (rs12451328 at ACE). SNP-SNP interaction was suggested (P<0.005) for 14 SNP pairs, none of which had shown individual association with systolic blood pressure. Four SNP pairs were at the same gene (2 for REN, 1 for AGT, and 1 for AGTR1). The SNP rs3097 at CYP11B2 was represented in 5 separate pairs. SNPs at key renin-angiotensin-aldosterone system genes associate with systolic blood pressure individually in both sexes, individually in one sex only and only when combined with another SNP. Analyses that incorporate sex-dependent and epistatic effects could reconcile past inconsistencies and account for some of the missing heritability of blood pressure and are generally relevant to SNP association studies for any phenotype. © 2017 American Heart Association, Inc.
PExFInS: An Integrative Post-GWAS Explorer for Functional Indels and SNPs
Cheng, Zhongshan; Chu, Hin; Fan, Yanhui; Li, Cun; Song, You-Qiang; Zhou, Jie; Yuen, Kwok-Yung
2015-01-01
Expression quantitative trait loci (eQTLs) mapping and linkage disequilibrium (LD) analysis have been widely employed to interpret findings of genome-wide association studies (GWAS). With the availability of deep sequencing data of 423 lymphoblastoid cell lines (LCLs) from six global populations and the microarray expression data, we performed eQTL analysis, identified more than 228 K SNP cis-eQTLs and 21 K indel cis-eQTLs and generated a LCL cis-eQTL database. We demonstrate that the percentages of population-shared and population-specific cis-eQTLs are comparable; while indel cis-eQTLs in the population-specific subsection make more contribution to gene expression variations than those in the population-shared subsection. We found cis-eQTLs, especially the population-shared cis-eQTLs are significantly enriched toward transcription start site. Moreover, the National Human Genome Research Institute cataloged GWAS SNPs are enriched for LCL cis-eQTLs. Specifically, 32.8% GWAS SNPs are LCL cis-eQTLs, among which 12.5% can be tagged by indel cis-eQTLs, suggesting the fundamental contribution of indel cis-eQTLs to GWAS association signals. To search for functional indels and SNPs tagging GWAS SNPs, a pipeline Post-GWAS Explorer for Functional Indels and SNPs (PExFInS) has been developed, integrating LD analysis, functional annotation from public databases, cis-eQTL mapping with our LCL cis-eQTL database and other published cis-eQTL datasets. PMID:26612672
Cohort analysis of a single nucleotide polymorphism on DNA chips.
Schwonbeck, Susanne; Krause-Griep, Andrea; Gajovic-Eichelmann, Nenad; Ehrentreich-Förster, Eva; Meinl, Walter; Glatt, Hansrüdi; Bier, Frank F
2004-11-15
A method has been developed to determine SNPs on DNA chips by applying a flow-through bioscanner. As a practical application we demonstrated the fast and simple SNP analysis of 24 genotypes in an array of 96 spots with a single hybridisation and dissociation experiment. The main advantage of this methodical concept is the parallel and fast analysis without any need of enzymatic digestion. Additionally, the DNA chip format used is appropriate for parallel analysis up to 400 spots. The polymorphism in the gene of the human phenol sulfotransferase SULT1A1 was studied as a model SNP. Biotinylated PCR products containing the SNP (The SNP summary web site: ) (mutant) and those containing no mutation (wild-type) were brought onto the chips coated with NeutrAvidin using non-contact spotting. This was followed by an analysis which was carried out in a flow-through biochip scanner while constantly rinsing with buffer. After removing the non-biotinylated strand a fluorescent probe was hybridised, which is complementary to the wild-type sequence. If this probe binds to a mutant sequence, then one single base is not fully matching. Thereby, the mismatched hybrid (mutant) is less stable than the full-matched hybrid (wild-type). The final step after hybridisation on the chip involves rinsing with a buffer to start dissociation of the fluorescent probe from the immobilised DNA strand. The online measurement of the fluorescence intensity by the biochip scanner provides the possibility to follow the kinetics of the hybridisation and dissociation processes. According to the different stability of the full-match and the mismatch, either visual discrimination or kinetic analysis is possible to distinguish SNP-containing sequence from the wild-type sequence.
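One way the kinetic discrimination described above could be carried out is to estimate a dissociation rate for each spot from its fluorescence trace during rinsing and compare the rates. The sketch below assumes, purely for illustration, that the signal decays as a single exponential F(t) = F0·exp(−k·t); the abstract does not state the kinetic model used, and the function name, rate constants, and traces are hypothetical.

```python
import math

def dissociation_rate(times, intensities):
    """Estimate k in F(t) = F0 * exp(-k * t) by a least-squares line
    through (t, ln F). Non-positive intensities are skipped."""
    points = [(t, math.log(f)) for t, f in zip(times, intensities) if f > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive intensity readings")
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(l for _, l in points) / n
    stt = sum((t - mean_t) ** 2 for t, _ in points)
    if stt == 0:
        raise ValueError("all readings share the same time point")
    slope = sum((t - mean_t) * (l - mean_l) for t, l in points) / stt
    return -slope  # decay rate is the negated slope of ln F versus t

# Hypothetical traces: the mismatched (mutant) hybrid is assumed to decay faster.
times = [0, 1, 2, 3, 4]
wild_type_trace = [1000 * math.exp(-0.05 * t) for t in times]
mutant_trace = [1000 * math.exp(-0.40 * t) for t in times]

k_wild = dissociation_rate(times, wild_type_trace)
k_mutant = dissociation_rate(times, mutant_trace)
print("faster dissociation in mutant spot:", k_mutant > k_wild)
```

Measured traces contain noise and background, so in practice a threshold on the rate (or on the ratio to a known wild-type control spot) would have to be calibrated experimentally.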
DoGSD: the dog and wolf genome SNP database.
Bai, Bing; Zhao, Wen-Ming; Tang, Bi-Xia; Wang, Yan-Qing; Wang, Lu; Zhang, Zhang; Yang, He-Chuan; Liu, Yan-Hu; Zhu, Jun-Wei; Irwin, David M; Wang, Guo-Dong; Zhang, Ya-Ping
2015-01-01
The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, grey wolves, which have simultaneously broadened our understanding of domestication and diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data more easily usable and available, we propose DoGSD (http://dogsd.big.ac.cn), the first Canidae-specific database, which focuses on whole genome SNP data from domesticated dogs and grey wolves. The DoGSD is a web-based, open-access resource comprising ~19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and collected SNPs from the three latest dog/wolf genetic studies (7 wolves and 68 dogs), which were taken together for analysis with the population genetic statistic Fst. In addition, DoGSD integrates some closely related information including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog-/wolf-related studies. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
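For readers unfamiliar with Fst, the sketch below computes Wright's fixation index for a single biallelic SNP from the allele frequencies in each population, as Fst = (H_T − H_S) / H_T with equal population weights. The abstract does not say which Fst estimator DoGSD uses, so this is only a textbook form; the function name and example frequencies are hypothetical.

```python
def wright_fst(allele_freqs):
    """Wright's Fst for one biallelic SNP.

    allele_freqs: frequency of the same allele in each population (0..1).
    Populations are weighted equally, which is a simplifying assumption.
    """
    if len(allele_freqs) < 2:
        raise ValueError("need at least two populations")
    if any(p < 0 or p > 1 for p in allele_freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    n_pops = len(allele_freqs)
    p_bar = sum(allele_freqs) / n_pops
    # Expected heterozygosity in the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / n_pops
    if h_total == 0:
        return 0.0  # SNP is monomorphic overall; reported as 0 in this sketch
    return (h_total - h_within) / h_total

# Hypothetical frequencies of one allele in dogs and in wolves.
print(wright_fst([0.2, 0.8]))
```

Estimators used in practice typically correct for unequal and finite sample sizes, which this simplified form does not.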
Kato, Hideaki; Ohata, Aya; Samukawa, Sei; Ueda, Atsuhisa; Ishigatsubo, Yoshiaki
2016-04-01
To investigate the association between single nucleotide polymorphisms (SNPs) in the adiponectin-encoding gene ADIPOQ and changes in serum lipid levels in HIV-1-infected patients after antiretroviral therapy (ART). ART-naïve HIV-1-infected patients were recruited to this prospective analysis. SNP +45 and SNP +276 genotypes were determined by direct sequencing. Multivariate linear regression analysis was performed to analyse the effects of genotype and predisposing conditions on serum total cholesterol and triglyceride in the 4 months before and after ART initiation. The study enrolled 78 patients with HIV-1 infection (73 male, five female; age range 22-67 years). HIV-1 viral load ≥5 log10 copies/ml, baseline total cholesterol ≥160 mg/dl, and CD4+ lymphocyte count <200/µl were associated with increased serum total cholesterol levels after ART initiation. Protease inhibitor treatment and body mass index ≥25 kg/m² were associated with increased triglyceride levels after ART initiation. There were no significant associations between SNP +45 or SNP +276 genotype and serum total cholesterol or triglyceride levels. SNP +45 and SNP +276 genotypes are not associated with changes in serum total cholesterol or triglyceride levels after ART initiation. © The Author(s) 2016.
Zhang, Han; Wheeler, William; Song, Lei; Yu, Kai
2017-07-07
As meta-analysis results published by consortia of genome-wide association studies (GWASs) become increasingly available, many association summary statistics-based multi-locus tests have been developed to jointly evaluate multiple single-nucleotide polymorphisms (SNPs) to reveal novel genetic architectures of various complex traits. The validity of these approaches relies on the accurate estimate of z-score correlations at considered SNPs, which in turn requires knowledge on the set of SNPs assessed by each study participating in the meta-analysis. However, this exact SNP coverage information is usually unavailable from the meta-analysis results published by GWAS consortia. In the absence of the coverage information, researchers typically estimate the z-score correlations by making oversimplified coverage assumptions. We show through real studies that such a practice can generate highly inflated type I errors, and we demonstrate the proper way to incorporate correct coverage information into multi-locus analyses. We advocate that consortia should make SNP coverage information available when posting their meta-analysis results, and that investigators who develop analytic tools for joint analyses based on summary data should pay attention to the variation in SNP coverage and adjust for it appropriately. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.
Streit, M; Reinhardt, F; Thaller, G; Bennewitz, J
2013-01-01
Genotype by environment interaction (G × E) has been widely reported in dairy cattle. If the environment can be measured on a continuous scale, reaction norms can be applied to study G × E. The average herd milk production level has frequently been used as an environmental descriptor because it is influenced by the level of feeding or the feeding regimen. Another important environmental factor is the level of udder health and hygiene, for which the average herd somatic cell count might be a descriptor. In the present study, we conducted a genome-wide association analysis to identify single nucleotide polymorphisms (SNP) that affect intercept and slope of milk protein yield reaction norms when using the average herd test-day solution for somatic cell score as an environmental descriptor. Sire estimates for intercept and slope of the reaction norms were calculated from around 12 million daughter records, using linear reaction norm models. Sires were genotyped for ~54,000 SNP. The sire estimates were used as observations in the association analysis, using 1,797 sires. Significant SNP were confirmed in an independent validation set consisting of 500 sires. A known major gene affecting protein yield was included as a covariable in the statistical model. Sixty (21) SNP were confirmed for intercept with P ≤ 0.01 (P ≤ 0.001) in the validation set, and 28 and 11 SNP, respectively, were confirmed for slope. Most but not all SNP affecting slope also affected intercept. Comparison with an earlier study revealed that SNP affecting slope were, in general, also significant for slope when the environment was modeled by the average herd milk production level, although the two environmental descriptors were poorly correlated. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Rode, Tone Mari; Berget, Ingunn; Langsrud, Solveig; Møretrø, Trond; Holck, Askild
2009-07-01
Microorganisms are constantly exposed to new and altered growth conditions, and respond by changing gene expression patterns. Several methods for studying gene expression exist. During the last decade, the analysis of microarrays has been one of the most common approaches applied for large scale gene expression studies. A relatively new method for gene expression analysis is MassARRAY, which combines real competitive-PCR and MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass spectrometry. In contrast to microarray methods, MassARRAY technology is suitable for analysing a larger number of samples, though for a smaller set of genes. In this study we compare the results from MassARRAY with microarrays on gene expression responses of Staphylococcus aureus exposed to acid stress at pH 4.5. RNA isolated from the same stress experiments was analysed using both the MassARRAY and the microarray methods. The MassARRAY and microarray methods showed good correlation. Both MassARRAY and microarray estimated somewhat lower fold changes compared with quantitative real-time PCR (qRT-PCR). The results confirmed the up-regulation of the urease genes in acidic environments, and also indicated the importance of metal ion regulation. This study shows that the MassARRAY technology is suitable for gene expression analysis in prokaryotes, and has advantages when a set of genes is being analysed for an organism exposed to many different environmental conditions.
Microarray analysis of potential genes in the pathogenesis of recurrent oral ulcer.
Han, Jingying; He, Zhiwei; Li, Kun; Hou, Lu
2015-01-01
Recurrent oral ulcer seriously threatens patients' daily life and health. This study investigated potential genes and pathways that participate in the pathogenesis of recurrent oral ulcer by high throughput bioinformatic analysis. RT-PCR and Western blot were applied to further verify the effect of the screened interleukins. Recurrent oral ulcer-related genes were collected from websites and papers, and further identified from Human Genome 280 6.0 microarray data. The pathways of the recurrent oral ulcer-related genes were obtained through chip hybridization. RT-PCR was applied to test four recurrent oral ulcer-related genes to verify the microarray data. Data transformation, scatter plot, clustering analysis, and expression pattern analysis were used to analyze recurrent oral ulcer-related gene expression changes. The recurrent oral ulcer gene microarray was successfully established. The microarray showed that 551 genes were involved in recurrent oral ulcer activity and 196 genes were recurrent oral ulcer-related genes. Of these, 76 genes were up-regulated, 62 genes were down-regulated, and 58 genes were up-/down-regulated. In total, expression was up-regulated 752 times (60%) and down-regulated 485 times (40%). IL-2 plays an important role in the occurrence, development and recurrence of recurrent oral ulcer at the mRNA and protein levels. Gene microarray can be used to analyze potential genes and pathways in recurrent oral ulcer. IL-2 may be involved in the pathogenesis of recurrent oral ulcer.
2010-01-01
Background: Analysis of gene expression and gene mutation may add information beyond that of ordinary pathological tissue diagnosis. Since samples obtained endoscopically are very small, a more sensitive technology for gene analysis is desirable. We investigated whether gene expression and gene mutation analysis by a newly developed ultra-sensitive three-dimensional (3D) microarray is possible using small samples from endoscopic ultrasound-guided fine-needle aspiration (EUS-FNA) specimens and pancreatic juices. Methods: Small samples from 17 EUS-FNA specimens and 16 pancreatic juices were obtained. After nucleic acid extraction, the samples were amplified with labeling and analyzed by the 3D microarray. Results: The analyzable rate with the microarray was 46% (6/13) in EUS-FNA specimens stored in RNAlater®, and RNA degradation was observed in all the samples kept in frozen storage. In pancreatic juices, the analyzable rate was 67% (4/6) in frozen storage samples and 20% (2/10) in RNAlater® storage. EUS-FNA specimens were classified into cancer and non-cancer by gene expression analysis, and K-ras codon 12 mutations were also detected using the 3D microarray. Conclusions: Gene analysis from small samples obtained endoscopically was possible with the newly developed 3D microarray technology. High-quality RNA from EUS-FNA samples was obtained and remained in good condition only when an RNA stabilizer was used. In contrast, high-quality RNA from pancreatic juice samples was obtained only in frozen storage without an RNA stabilizer. PMID:20416107
MAAMD: a workflow to standardize meta-analyses and comparison of affymetrix microarray data
2014-01-01
Background: Mandatory deposit of raw microarray data files for public access, prior to study publication, provides significant opportunities to conduct new bioinformatics analyses within and across multiple datasets. Analysis of raw microarray data files (e.g. Affymetrix CEL files) can be time consuming and complex, and requires fundamental computational and bioinformatics skills. The development of analytical workflows to automate these tasks simplifies the processing of, improves the efficiency of, and serves to standardize multiple and sequential analyses. Once installed, workflows facilitate the tedious steps required to run rapid intra- and inter-dataset comparisons. Results: We developed a workflow to facilitate and standardize Meta-Analysis of Affymetrix Microarray Data analysis (MAAMD) in Kepler. Two freely available stand-alone software tools, R and AltAnalyze, were embedded in MAAMD. The inputs of MAAMD are user-editable csv files, which contain sample information and parameters describing the locations of input files and required tools. MAAMD was tested by analyzing 4 different GEO datasets from mice and drosophila. MAAMD automates data downloading, data organization, data quality control assessment, differential gene expression analysis, clustering analysis, pathway visualization, gene-set enrichment analysis, and cross-species orthologous-gene comparisons. MAAMD was utilized to identify gene orthologues responding to hypoxia or hyperoxia in both mice and drosophila. The entire set of analyses for 4 datasets (34 total microarrays) finished in about one hour. Conclusions: MAAMD saves time, minimizes the required computer skills, and offers a standardized procedure for users to analyze microarray datasets and make new intra- and inter-dataset comparisons. PMID:24621103
Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David
2018-04-11
Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions. 
It also allows the definition of sequence length and sequence variability of the target region as well as the less variable flanking regions for tailoring to MPS platforms. As shown in this study, TIA can be used to discover identity-linked SNP islands within the human genome, useful for differentiating individuals by targeted resequencing on MPS technologies.
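The general idea of a tunable search for SNP-dense target regions with quiet flanks can be sketched as below. This is not the TIA algorithm, whose filters are not specified in the abstract; it is a hypothetical illustration in which every parameter (`window`, `flank`, `min_snps`, `min_maf`) and the example data are assumptions. A window qualifies when it starts at an informative SNP, holds at least `min_snps` informative SNPs, and has no known SNP at all in the flanking bases on either side.

```python
from bisect import bisect_left, bisect_right

def find_snp_islands(snps, window=100, flank=20, min_snps=3, min_maf=0.3):
    """snps: iterable of (position, minor_allele_frequency) on one chromosome.

    Returns a list of (start, end) coordinates, both inclusive. Windows may
    overlap when SNPs are dense; merging them is left to the caller.
    """
    snps = list(snps)
    all_positions = sorted(pos for pos, _ in snps)
    informative = sorted(pos for pos, maf in snps if maf >= min_maf)
    islands = []
    for start in informative:
        end = start + window - 1
        # Informative SNPs falling inside [start, end].
        n_inside = bisect_right(informative, end) - bisect_left(informative, start)
        if n_inside < min_snps:
            continue
        # Any SNP in [start - flank, start - 1] spoils the left flank.
        left_hits = (bisect_left(all_positions, start)
                     - bisect_left(all_positions, start - flank))
        # Any SNP in [end + 1, end + flank] spoils the right flank.
        right_hits = (bisect_right(all_positions, end + flank)
                      - bisect_right(all_positions, end))
        if left_hits == 0 and right_hits == 0:
            islands.append((start, end))
    return islands

# Hypothetical SNP catalogue: (position, minor allele frequency).
catalogue = [(100, 0.40), (130, 0.45), (160, 0.35),
             (500, 0.40), (510, 0.05), (900, 0.50)]
print(find_snp_islands(catalogue))
```

Requiring completely SNP-free flanks is the strictest choice; a softer filter could tolerate rare variants there, which is the kind of tuning the abstract attributes to the user.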
Nakatochi, Masahiro; Ushida, Yasunori; Yasuda, Yoshinari; Yoshida, Yasuko; Kawai, Shun; Kato, Ryuji; Nakashima, Toru; Iwata, Masamitsu; Kuwatsuka, Yachiyo; Ando, Masahiko; Hamajima, Nobuyuki; Kondo, Takaaki; Oda, Hiroaki; Hayashi, Mutsuharu; Kato, Sawako; Yamaguchi, Makoto; Maruyama, Shoichi; Matsuo, Seiichi; Honda, Hiroyuki
2015-01-01
Although many single nucleotide polymorphisms (SNPs) have been identified to be associated with metabolic syndrome (MetS), there was only a slight improvement in the ability to predict future MetS by the simple addition of SNPs to clinical risk markers. To improve the ability to predict future MetS, combinational effects, such as SNP-SNP interaction, SNP-environment interaction, and SNP-clinical parameter (SNP × CP) interaction should also be considered. We performed a case-control study to explore novel SNP × CP interactions as risk markers for MetS based on health check-up data of Japanese male employees. We selected 99 SNPs that were previously reported to be associated with MetS and components of MetS; subsequently, we genotyped these SNPs from 360 cases and 1983 control subjects. First, we performed logistic regression analyses to assess the association of each SNP with MetS. Of these SNPs, five SNPs were significantly associated with MetS (P < 0.05): LRP2 rs2544390, rs1800592 between UCP1 and TBC1D9, APOA5 rs662799, VWF rs7965413, and rs1411766 between MYO16 and IRS2. Furthermore, we performed multiple logistic regression analyses, including an SNP term, a CP term, and an SNP × CP interaction term for each CP and SNP that was significantly associated with MetS. We identified a novel SNP × CP interaction between rs7965413 and platelet count that was significantly associated with MetS [SNP term: odds ratio (OR) = 0.78, P = 0.004; SNP × CP interaction term: OR = 1.33, P = 0.001]. This association of the SNP × CP interaction with MetS remained nominally significant in multiple logistic regression analysis after adjustment for either the number of MetS components or MetS components excluding obesity. Our results reveal new insight into platelet count as a risk marker for MetS.
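As a reading aid, the multiple logistic regression described in this abstract can be written out explicitly. The notation is assumed here for illustration and is not taken from the paper: G is the SNP genotype coded as an allele count, C is the clinical parameter (platelet count, on whatever scale the study used), and the β terms are the fitted coefficients. The abstract states neither the genotype coding nor the scaling of C.

```latex
% Model with an SNP term, a CP term, and an SNP x CP interaction term
\[
\operatorname{logit} P(\mathrm{MetS}=1 \mid G, C)
  = \beta_0 + \beta_G\, G + \beta_C\, C + \beta_{G \times C}\, G\, C
\]
% Odds ratio per unit of G at a fixed value of C, using the reported
% OR = 0.78 (SNP term) and OR = 1.33 (interaction term)
\[
\mathrm{OR}_G(C) = \exp\!\left(\beta_G + \beta_{G \times C}\, C\right)
  = e^{\beta_G}\,\bigl(e^{\beta_{G \times C}}\bigr)^{C}
  = 0.78 \times 1.33^{\,C}
\]
```

Under this reading, the SNP's odds ratio is 0.78 at C = 0 and about 1.04 at C = 1, so the apparent effect of rs7965413 depends on platelet count; the numerical interpretation holds only in the units the authors used for C.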
USDA-ARS's Scientific Manuscript database
The amount of microarray gene expression data in public repositories has been increasing exponentially for the last couple of decades. High-throughput microarray data integration and analysis has become a critical step in exploring the large amount of expression data for biological discovery. Howeve...
Gong, Bin-Sheng; Zhang, Qing-Pu; Zhang, Guang-Mei; Zhang, Shao-Jun; Zhang, Wei; Lv, Hong-Chao; Zhang, Fan; Lv, Sa-Li; Li, Chuan-Xing; Rao, Shao-Qi; Li, Xia
2007-01-01
Gene expression profiles and single-nucleotide polymorphism (SNP) profiles are modern data for genetic analysis. It is possible to use the two types of information to analyze the relationships among genes by some genetical genomics approaches. In this study, gene expression profiles were used as expression traits. And relationships among the genes, which were co-linked to a common SNP(s), were identified by integrating the two types of information. Further research on the co-expressions among the co-linked genes was carried out after the gene-SNP relationships were established using the Haseman-Elston sib-pair regression. The results showed that the co-expressions among the co-linked genes were significantly higher if the number of connections between the genes and a SNP(s) was more than six. Then, the genes were interconnected via one or more SNP co-linkers to construct a gene-SNP intermixed network. The genes sharing more SNPs tended to have a stronger correlation. Finally, a gene-gene network was constructed with their intensities of relationships (the number of SNP co-linkers shared) as the weights for the edges. PMID:18466544
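The final step described above, a gene-gene network whose edge weights are the number of shared SNP co-linkers, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `gene_gene_network`, the input layout (a mapping from gene to the set of SNPs it is linked to) and the toy gene and SNP labels are all assumptions, and the linkage step itself (Haseman-Elston sib-pair regression) is taken as already done.

```python
from itertools import combinations


def gene_gene_network(gene_to_snps):
    """Build weighted gene-gene edges from gene-SNP linkage results.

    gene_to_snps: dict mapping a gene name to the set of SNPs it is linked to
    (hypothetical input layout). Returns {(gene_a, gene_b): weight}, where
    weight is the number of SNP co-linkers the two genes share. Pairs that
    share no SNP get no edge.
    """
    edges = {}
    for gene_a, gene_b in combinations(sorted(gene_to_snps), 2):
        shared = len(gene_to_snps[gene_a] & gene_to_snps[gene_b])
        if shared > 0:
            edges[(gene_a, gene_b)] = shared
    return edges


# Toy linkage results (made-up labels, for illustration only)
links = {
    "geneA": {"snp1", "snp2", "snp3"},
    "geneB": {"snp2", "snp3"},
    "geneC": {"snp3", "snp4"},
    "geneD": {"snp5"},
}
network = gene_gene_network(links)
for (a, b), weight in sorted(network.items()):
    print(a, b, weight)
```

In this toy input, geneD shares no SNP with any other gene and so stays disconnected, while geneA and geneB get the heaviest edge.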
Kinoshita, Kenji; Fujimoto, Kentaro; Yakabe, Toru; Saito, Shin; Hamaguchi, Yuzo; Kikuchi, Takayuki; Nonaka, Ken; Murata, Shigenori; Masuda, Daisuke; Takada, Wataru; Funaoka, Sohei; Arai, Susumu; Nakanishi, Hisao; Yokoyama, Kanehisa; Fujiwara, Kazuhiko; Matsubara, Kenichi
2007-01-01
DNA microarrays are routinely used to monitor gene expression profiling and single nucleotide polymorphisms (SNPs). However, for practically useful high performance, the detection sensitivity is still not adequate, leaving low expression genes undetected. To resolve this issue, we have developed a new plastic S-BIO® PrimeSurface® with a biocompatible polymer; its surface chemistry offers an extraordinarily stable thermal property for a lack of pre-activated glass slide surface. The oligonucleotides immobilized on this substrate are robust in boiling water and show no significant loss of hybridization activity during dissociation treatment. This allowed us to hybridize the templates, extend the 3′ end of the immobilized DNA primers on the S-Bio® by DNA polymerase using deoxynucleotidyl triphosphates (dNTP) as extender units, release the templates by denaturalization and use the same templates for a second round of reactions similar to that of the PCR method. By repeating this cycle, the picomolar concentration range of the template oligonucleotide can be detected as stable signals via the incorporation of labeled dUTP into primers. This method of Multiple Primer EXtension (MPEX) could be further extended as an alternative route for producing DNA microarrays for SNP analyses via simple template preparation such as reverse transcript cDNA or restriction enzyme treatment of genome DNA. PMID:17135189
NASA Astrophysics Data System (ADS)
Bogdanov, Valery L.; Boyce-Jacino, Michael
1999-05-01
Confined arrays of biochemical probes deposited on a solid support surface (analytical microarray or 'chip') provide an opportunity to analyze multiple reactions simultaneously. Microarrays are increasingly used in genetics, medicine and environment scanning as research and analytical instruments. The power of microarray technology comes from its parallelism, which grows with array miniaturization, minimization of reagent volume per reaction site and reaction multiplexing. An optical detector of microarray signals should combine high sensitivity with spatial and spectral resolution. Additionally, low cost and a high processing rate are needed to transfer microarray technology into biomedical practice. We designed an imager that provides confocal and complete spectrum detection of an entire fluorescently labeled microarray in parallel. The imager uses a microlens array, a non-slit spectral decomposer, and a high-sensitivity detector (cooled CCD). Two imaging channels provide simultaneous detection of localization, integrated and spectral intensities for each reaction site in the microarray. Dimensional matching between the microarray and the imager's optics eliminates all moving parts in the instrumentation, enabling highly informative, fast and low-cost microarray detection. We report the theory of confocal hyperspectral imaging with a microlens array and experimental data for implementation of the developed imager to detect fluorescently labeled microarrays with a density of approximately 10^3 sites per cm^2.
Savio, Andrea J.; Lemire, Mathieu; Mrkonjic, Miralem; Gallinger, Steven; Zanke, Brent W.; Hudson, Thomas J.; Bapat, Bharati
2012-01-01
Single nucleotide polymorphisms (SNPs) are the most common form of genetic variation. We previously demonstrated that SNPs (rs1800734, rs749072, and rs13098279) in the MLH1 gene region are associated with MLH1 promoter island methylation, loss of MLH1 protein expression, and microsatellite instability (MSI) in colorectal cancer (CRC) patients. Recent studies have identified less CpG-dense “shore” regions flanking many CpG islands. These shores often exhibit distinct methylation profiles between different tissues and matched normal versus tumor cells of patients. To date, most epigenetic studies have focused on somatic methylation events occurring within solid tumors; less is known of the contributions of peripheral blood cell (PBC) methylation to processes such as aging and tumorigenesis. To address whether MLH1 methylation in PBCs is correlated with tumorigenesis we utilized the Illumina 450 K microarrays to measure methylation in PBC DNA of 846 healthy controls and 252 CRC patients from Ontario, Canada. Analysis of a region of chromosome 3p21 spanning the MLH1 locus in healthy controls revealed that a CpG island shore 1 kb upstream of the MLH1 gene exhibits different methylation profiles when stratified by SNP genotypes (rs1800734, rs749072, and rs13098279). Individuals with wild-type genotypes incur significantly higher PBC shore methylation than heterozygous or homozygous variant carriers (p<1.1×10−6; ANOVA). This trend is also seen in CRC cases (p<0.096; ANOVA). Shore methylation also decreases significantly with increasing age in cases and controls. This is the first study of its kind to integrate PBC methylation at a CpG island shore with SNP genotype status in CRC cases and controls. These results indicate that CpG island shore methylation in PBCs may be influenced by genotype as well as the normal aging process. PMID:23240038
Khrustaleva, A M; Volkov, A A; Stoklitskaia, D S; Miuge, N S; Zelenina, D A
2010-11-01
Sockeye salmon samples from five largest lacustrine-riverine systems of Kamchatka Peninsula were tested for polymorphism at six microsatellite (STR) and five single nucleotide polymorphism (SNP) loci. Statistically significant genetic differentiation among local populations from this part of the species range examined was demonstrated. The data presented point to pronounced genetic divergence of the populations from two geographical regions, Eastern and Western Kamchatka. For sockeye salmon, the individual identification test accuracy was higher for microsatellites compared to similar number of SNP markers. Pooling of the STR and SNP allele frequency data sets provided the highest accuracy of the individual fish population assignment.
NASA Astrophysics Data System (ADS)
Brazhnik, Kristina; Sokolova, Zinaida; Baryshnikova, Maria; Bilan, Regina; Nabiev, Igor; Sukhanova, Alyona
Multiplexed analysis of cancer markers is crucial for early tumor diagnosis and screening. We have designed lab-on-a-bead microarray for quantitative detection of three breast cancer markers in human serum. Quantum dots were used as bead-bound fluorescent tags for identifying each marker by means of flow cytometry. Antigen-specific beads reliably detected CA 15-3, CEA, and CA 125 in serum samples, providing clear discrimination between the samples with respect to the antigen levels. The novel microarray is advantageous over the routine single-analyte ones due to the simultaneous detection of various markers. Therefore the developed microarray is a promising tool for serum tumor marker profiling.
Discovery of 100K SNP array and its utilization in sugarcane
USDA-ARS's Scientific Manuscript database
Next generation sequencing (NGS) enables us to identify thousands of single nucleotide polymorphism (SNP) markers for genotyping and fingerprinting. However, the process requires very precise bioinformatics analysis and filtering. High throughput SNP array with predefined genomic location co...
Construction of a cDNA microarray derived from the ascidian Ciona intestinalis.
Azumi, Kaoru; Takahashi, Hiroki; Miki, Yasufumi; Fujie, Manabu; Usami, Takeshi; Ishikawa, Hisayoshi; Kitayama, Atsusi; Satou, Yutaka; Ueno, Naoto; Satoh, Nori
2003-10-01
A cDNA microarray was constructed from a basal chordate, the ascidian Ciona intestinalis. The draft genome of Ciona has been read and inferred to contain approximately 16,000 protein-coding genes, and cDNAs for transcripts of 13,464 genes have been characterized and compiled as the "Ciona intestinalis Gene Collection Release I". In the present study, we constructed a cDNA microarray of these 13,464 Ciona genes. A preliminary experiment with Cy3- and Cy5-labeled probes showed extensive differential gene expression between fertilized eggs and larvae. In addition, there was a good correlation between results obtained by the present microarray analysis and those from previous EST analyses. This first microarray of a large collection of Ciona intestinalis cDNA clones should facilitate the analysis of global gene expression and gene networks during the embryogenesis of basal chordates.
Welderufael, B. G.; Løvendahl, Peter; de Koning, Dirk-Jan; Janss, Lucas L. G.; Fikse, W. F.
2018-01-01
Because mastitis is very frequent and unavoidable, adding recovery information into the analysis for genetic evaluation of mastitis is of great interest from both economic and animal welfare points of view. Here we have performed genome-wide association studies (GWAS) to identify associated single nucleotide polymorphisms (SNPs) and investigate the genetic background not only for susceptibility to – but also for recoverability from mastitis. Somatic cell count records from 993 Danish Holstein cows genotyped for a total of 39378 autosomal SNP markers were used for the association analysis. Single SNP regression analysis was performed using the statistical software package DMU. The substitution effect of each SNP was tested with a t-test, and a genome-wide significance level of P-value < 10^-4 was used to declare significant SNP-trait association. A number of significant SNP variants were identified for both traits. Many of the SNP variants associated either with susceptibility to – or recoverability from mastitis were located in or very near to genes that have been reported for their role in the immune system. Genes involved in lymphocyte development (e.g., MAST3 and STAB2) and genes involved in macrophage recruitment and regulation of inflammation (PDGFD and PTX3) were suggested as possible causal genes for susceptibility to – and recoverability from mastitis, respectively. However, this is the first GWAS study for recoverability from mastitis and our results need to be validated. The findings in the current study are, therefore, a starting point for further investigations in identifying causal genetic variants or chromosomal regions for both susceptibility to – and recoverability from mastitis. PMID:29755506
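The single-SNP test described above (regress the trait on the genotype, test the allele substitution effect, compare against a fixed threshold) can be sketched in a few lines. This is a simplified illustration and not the DMU analysis: it fits an ordinary least-squares line with no fixed or random effects, the names `single_snp_test` and `THRESHOLD` are hypothetical, the data are made up, and the p-value uses a normal approximation to the t distribution, which is reasonable for about a thousand animals but crude for the tiny toy sample here.

```python
import math

THRESHOLD = 1e-4  # significance level quoted in the abstract


def single_snp_test(genotypes, phenotypes):
    """Regress phenotype on allele count (0/1/2) for one SNP.

    Returns (beta, t, p): the estimated allele substitution effect, its
    t statistic, and a two-sided p-value from a normal approximation.
    """
    n = len(genotypes)
    g_mean = sum(genotypes) / n
    y_mean = sum(phenotypes) / n
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    sxy = sum((g - g_mean) * (y - y_mean) for g, y in zip(genotypes, phenotypes))
    beta = sxy / sxx
    alpha = y_mean - beta * g_mean
    rss = sum((y - alpha - beta * g) ** 2 for g, y in zip(genotypes, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    t = beta / se
    p = math.erfc(abs(t) / math.sqrt(2))  # equals 2 * (1 - Phi(|t|))
    return beta, t, p


# Made-up genotypes and trait values for one SNP
genotypes = [0, 1, 2, 0, 1, 2]
phenotypes = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1]
beta, t, p = single_snp_test(genotypes, phenotypes)
print(f"beta={beta:.3f} t={t:.2f} significant={p < THRESHOLD}")
```

In a genome scan this function would be applied once per marker, so a threshold such as 10^-4 trades off false positives across tens of thousands of tests against power for each one.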
Implementation of GenePattern within the Stanford Microarray Database.
Hubble, Jeremy; Demeter, Janos; Jin, Heng; Mao, Maria; Nitzberg, Michael; Reddy, T B K; Wymore, Farrell; Zachariah, Zachariah K; Sherlock, Gavin; Ball, Catherine A
2009-01-01
Hundreds of researchers across the world use the Stanford Microarray Database (SMD; http://smd.stanford.edu/) to store, annotate, view, analyze and share microarray data. In addition to providing registered users at Stanford access to their own data, SMD also provides access to public data, and tools with which to analyze those data, to any public user anywhere in the world. Previously, the addition of new microarray data analysis tools to SMD has been limited by available engineering resources, and in addition, the existing suite of tools did not provide a simple way to design, execute and share analysis pipelines, or to document such pipelines for the purposes of publication. To address this, we have incorporated the GenePattern software package directly into SMD, providing access to many new analysis tools, as well as a plug-in architecture that allows users to directly integrate and share additional tools through SMD. In this article, we describe our implementation of the GenePattern microarray analysis software package into the SMD code base. This extension is available with the SMD source code that is fully and freely available to others under an Open Source license, enabling other groups to create a local installation of SMD with an enriched data analysis capability.
First example of an FY*01 allele associated with weakened expression of Fya on red blood cells.
Arndt, Patricia A; Horn, Trina; Keller, Jessica A; Heri, Suzanne M; Keller, Margaret A
2015-01-01
Duffy antigens are important in immunohematology. The reference allele for the Duffy gene (FY) is FY*02, which encodes Fy(b). An A>G single nucleotide polymorphism (SNP) at coding nucleotide (c.) 125 in exon 2 defines the FY*01 allele, which encodes the antithetical Fy(a). A C>T SNP at c.265 in the FY*02 allele is associated with weakening of Fy(b) expression on red blood cells (RBCs) (called Fy(x)). Until recently, this latter change had not been described on a FY*01 background allele. Phenotype-matched units were desired for a multi-transfused Vietnamese fetus with α-thalassemia. Genotyping of the fetus using a microarray assay that interrogates three SNPs (c.1-67, c.125, and c.265) in FY yielded indeterminate results for the predicted Duffy phenotype. Genomic sequencing of FY exon 2 showed that the fetal sample had one wild-type FY*01 allele and one new FY*01 allele with the c.265C>T SNP, which until recently had only been found on the FY*02 allele. Genotyping performed on samples from the proband's parents indicated that the father had the same FY genotype as the fetus. Flow cytometry, which has been previously demonstrated as a useful method to study antigen strength on cells, was used to determine if this new FY*01 allele was associated with reduced Fy(a) expression on the father's RBCs. Median fluorescence intensity of the father's RBCs (after incubation with anti-Fy(a) and fluorescein-labeled anti-IgG) was similar to known FY*01 heterozygotes and significantly weaker than known FY*01 homozygotes. In conclusion, the fetus and father both had one normal FY*01 allele and one new FY*01 allele, FY*01W.01, which is associated with weakened expression of Fy(a) on RBCs.
Chono, Makiko; Matsunaka, Hitoshi; Seki, Masako; Fujita, Masaya; Kiribuchi-Otobe, Chikako; Oda, Shunsuke; Kojima, Hisayo; Nakamura, Shingo
2015-03-01
In the wheat (Triticum aestivum L.) cultivar 'Zenkoujikomugi', a single nucleotide polymorphism (SNP) in the promoter of MOTHER OF FT AND TFL1 on chromosome 3A (MFT-3A) causes an increase in the level of gene expression, resulting in strong grain dormancy. We used a DNA marker to detect the 'Zenkoujikomugi'-type (Zen-type) SNP and examined the genotype of MFT-3A in Japanese wheat varieties, and we found that 169 of 324 varieties carry the Zen-type SNP. In Japanese commercial varieties, the frequency of the Zen-type SNP was remarkably high in the southern part of Japan, but low in the northern part. To examine the relationship between MFT-3A genotype and grain dormancy, we performed a germination assay in three wheat-growing seasons. On average, the varieties carrying the Zen-type SNP showed stronger grain dormancy than the varieties carrying the non-Zen-type SNP. Among commercial cultivars, 'Iwainodaichi' (Kyushu), 'Junreikomugi' (Kinki-Chugoku-Shikoku), 'Kinuhime' (Kanto-Tokai), 'Nebarigoshi' (Tohoku-Hokuriku), and 'Kitamoe' (Hokkaido) showed the strongest grain dormancy in each geographical group, and all these varieties, except for 'Kitamoe', were found to carry the Zen-type SNP. In recent years, the number of varieties carrying the Zen-type SNP has increased in the Tohoku-Hokuriku region, but not in the Hokkaido region.
Isolation, characterization, and radiation protection of Sipunculus nudus L. polysaccharide.
Li, Na; Shen, Xianrong; Liu, Yuming; Zhang, Junling; He, Ying; Liu, Qiong; Jiang, Dingwen; Zong, Jie; Li, Jiamei; Hou, Dengyong; Chen, Wei; Wang, Qingrong; Luo, Qun; Li, Kexian
2016-02-01
Sipunculus nudus Linnaeus polysaccharide (SNP) was purified from S. nudus L. via NaOH extraction, trichloroacetic acid deproteination, and DEAE-cellulose 52 and Sephacryl S-300 chromatography. The monosaccharide composition and molecular weight were determined by HPLC. FT-IR, 1H NMR and 13C NMR spectra were recorded to characterize the chemical structure. The antioxidant activity was assayed in vitro. The radiation protection effects were assessed in mice. The results showed that SNP was composed of mannose, rhamnose, galacturonic acid, glucose, arabinose and fucose, and the average molecular weight was 680 kDa. Above a concentration of 10 mg/mL, SNP showed powerful scavenging activity on hydroxyl radicals. In the animals irradiated with 7.5 Gy γ-rays, the 90 mg/kg and the 270 mg/kg SNP groups survived significantly longer than the radiation control group. In the animals irradiated with 4.0 Gy γ-rays, SNP showed a significant protective effect. The DNA content of bone marrow cells was significantly increased by SNP treatment, and the micronucleus rates of the 30 mg/kg and 270 mg/kg SNP groups decreased significantly compared to the radiation control group. These findings suggest that SNP possesses marked antioxidant and bone marrow damage protection capacity, which play important roles in the prevention of radiation damage. Copyright © 2015 Elsevier B.V. All rights reserved.
Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.
Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J
2008-06-18
Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. 
This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodology.
Fabrication of Carbohydrate Microarrays by Boronate Formation.
Adak, Avijit K; Lin, Ting-Wei; Li, Ben-Yuan; Lin, Chun-Cheng
2017-01-01
The interactions between soluble carbohydrates and/or surface displayed glycans and protein receptors are essential to many biological processes and cellular recognition events. Carbohydrate microarrays provide opportunities for high-throughput quantitative analysis of carbohydrate-protein interactions. Over the past decade, various techniques have been implemented for immobilizing glycans on solid surfaces in a microarray format. Herein, we describe a detailed protocol for fabricating carbohydrate microarrays that capitalizes on the intrinsic reactivity of boronic acid toward carbohydrates to form stable boronate diesters. A large variety of unprotected carbohydrates ranging in structure from simple disaccharides and trisaccharides to considerably more complex human milk and blood group (oligo)saccharides have been covalently immobilized in a single step on glass slides, which were derivatized with high-affinity boronic acid ligands. The immobilized ligands in these microarrays maintain the receptor-binding activities including those of lectins and antibodies according to the structures of their pendant carbohydrates for rapid analysis of a number of carbohydrate-recognition events within 30 h. This method facilitates the direct construction of otherwise difficult to obtain carbohydrate microarrays from underivatized glycans.
Mullen, M P; Berry, D P; Howard, D J; Diskin, M G; Lynch, C O; Berkowicz, E W; Magee, D A; MacHugh, D E; Waters, S M
2010-12-01
Growth hormone, produced in the anterior pituitary gland, stimulates the release of insulin-like growth factor-I from the liver and is of critical importance in the control of nutrient utilization and partitioning for lactogenesis, fertility, growth, and development in cattle. The aim of this study was to discover novel polymorphisms in the bovine growth hormone gene (GH1) and to quantify their association with performance using estimates of genetic merit on 848 Holstein-Friesian AI (artificial insemination) dairy sires. Associations with previously reported polymorphisms in the bovine GH1 gene were also undertaken. A total of 38 novel single nucleotide polymorphisms (SNP) were identified across a panel of 22 beef and dairy cattle by sequence analysis of the 5' promoter, intronic, exonic, and 3' regulatory regions, encompassing approximately 7 kb of the GH1 gene. Following multiple regression analysis on all SNP, associations were identified between 11 SNP (2 novel and 9 previously identified) and milk fat and protein yield, milk composition, somatic cell score, survival, body condition score, and body size. The G allele of a previously identified SNP in exon 5 at position 2141 of the GH1 sequence, resulting in a nonsynonymous substitution, was associated with decreased milk protein yield. The C allele of a novel SNP, GH32, was associated with inferior carcass conformation. In addition, the T allele of a previously characterized SNP, GH35, was associated with decreased survival. Both GH24 (novel) and GH35 were independently associated with somatic cell count, and 3 SNP, GH21, 2291, and GH35, were independently associated with body depth. Furthermore, 2 SNP, GH24 and GH63, were independently associated with carcass fat. Results of this study further demonstrate the multifaceted influences of GH1 on milk production, fertility, and growth-related traits in cattle. Copyright © 2010 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
The Glycan Microarray Story from Construction to Applications.
Hyun, Ji Young; Pai, Jaeyoung; Shin, Injae
2017-04-18
Not only are glycan-mediated binding processes in cells and organisms essential for a wide range of physiological processes, but they are also implicated in various pathological processes. As a result, elucidation of glycan-associated biomolecular interactions and their consequences is of great importance in basic biological research and biomedical applications. In 2002, we and others were the first to utilize glycan microarrays in efforts aimed at the rapid analysis of glycan-associated recognition events. Because they contain a number of glycans immobilized in a dense and orderly manner on a solid surface, glycan microarrays enable multiple parallel analyses of glycan-protein binding events while utilizing only small amounts of glycan samples. Therefore, this microarray technology has become a leading edge tool in studies aimed at elucidating roles played by glycans and glycan binding proteins in biological systems. In this Account, we summarize our efforts on the construction of glycan microarrays and their applications in studies of glycan-associated interactions. Immobilization strategies of functionalized and unmodified glycans on derivatized glass surfaces are described. Although others have developed immobilization techniques, our efforts have focused on improving the efficiencies and operational simplicity of microarray construction. The microarray-based technology has been most extensively used for rapid analysis of the glycan binding properties of proteins. In addition, glycan microarrays have been employed to determine glycan-protein interactions quantitatively, detect pathogens, and rapidly assess substrate specificities of carbohydrate-processing enzymes. More recently, the microarrays have been employed to identify functional glycans that elicit cell surface lectin-mediated cellular responses. 
Owing to these efforts, it is now possible to use glycan microarrays to expand the understanding of roles played by glycans and glycan binding proteins in biological systems.
[Prenatal genetic diagnosis for a fetus with atypical neurofibromatosis type 1 microdeletion].
Lin, Shaobin; Wu, Jianzhu; Zhang, Zhiqiang; Ji, Yuanjun; Fang, Qun; Chen, Baojiang; Luo, Yanmin
2016-04-01
To analyze the correlation between atypical neurofibromatosis type 1 (NF1) microdeletion and fetal phenotype. Fetal blood sampling was carried out for a woman bearing a fetus with talipes equinovarus. G-banded karyotyping and single nucleotide polymorphism array (SNP array) analysis were performed on the fetal blood sample. Fluorescence in situ hybridization (FISH) was used to confirm the result of SNP array analysis. FISH assay was also carried out on peripheral blood specimens from the parents to ascertain the origin of the mutation. The karyotype of the fetus was found to be 46,XY by G-banding analysis. However, a 3.132 Mb microdeletion was detected in chromosome region 17q11.2 by SNP array, which overlapped with the region of NF1 microdeletion syndrome. FISH analysis of the specimens from the fetus and its parents confirmed it to be a de novo deletion. Talipes equinovarus may be an abnormal sonographic feature of fetuses with atypical NF1 microdeletion, which can be accurately diagnosed with SNP array.
EDGE3: A web-based solution for management and analysis of Agilent two color microarray experiments
Vollrath, Aaron L; Smith, Adam A; Craven, Mark; Bradfield, Christopher A
2009-01-01
Background The ability to generate transcriptional data on the scale of entire genomes has been a boon both in the improvement of biological understanding and in the amount of data generated. The latter, the amount of data generated, has implications when it comes to effective storage, analysis and sharing of these data. A number of software tools have been developed to store, analyze, and share microarray data. However, a majority of these tools do not offer all of these features nor do they specifically target the commonly used two color Agilent DNA microarray platform. Thus, the motivating factor for the development of EDGE3 was to incorporate the storage, analysis and sharing of microarray data in a manner that would provide a means for research groups to collaborate on Agilent-based microarray experiments without a large investment in software-related expenditures or extensive training of end-users. Results EDGE3 has been developed with two major functions in mind. The first function is to provide a workflow process for the generation of microarray data by a research laboratory or a microarray facility. The second is to store, analyze, and share microarray data in a manner that doesn't require complicated software. To satisfy the first function, EDGE3 has been developed as a means to establish a well defined experimental workflow and information system for microarray generation. To satisfy the second function, the software application utilized as the user interface of EDGE3 is a web browser. Within the web browser, a user is able to access the entire functionality, including, but not limited to, the ability to perform a number of bioinformatics based analyses, collaborate between research groups through a user-based security model, and access to the raw data files and quality control files generated by the software used to extract the signals from an array image. 
Conclusion Here, we present EDGE3, an open-source, web-based application that allows for the storage, analysis, and controlled sharing of transcription-based microarray data generated on the Agilent DNA platform. In addition, EDGE3 provides a means for managing RNA samples and arrays during the hybridization process. EDGE3 is freely available for download at http://edge.oncology.wisc.edu/. PMID:19732451
Vollrath, Aaron L; Smith, Adam A; Craven, Mark; Bradfield, Christopher A
2009-09-04
The ability to generate transcriptional data on the scale of entire genomes has been a boon both in the improvement of biological understanding and in the amount of data generated. The latter, the amount of data generated, has implications when it comes to effective storage, analysis and sharing of these data. A number of software tools have been developed to store, analyze, and share microarray data. However, a majority of these tools do not offer all of these features nor do they specifically target the commonly used two color Agilent DNA microarray platform. Thus, the motivating factor for the development of EDGE(3) was to incorporate the storage, analysis and sharing of microarray data in a manner that would provide a means for research groups to collaborate on Agilent-based microarray experiments without a large investment in software-related expenditures or extensive training of end-users. EDGE(3) has been developed with two major functions in mind. The first function is to provide a workflow process for the generation of microarray data by a research laboratory or a microarray facility. The second is to store, analyze, and share microarray data in a manner that doesn't require complicated software. To satisfy the first function, EDGE3 has been developed as a means to establish a well defined experimental workflow and information system for microarray generation. To satisfy the second function, the software application utilized as the user interface of EDGE(3) is a web browser. Within the web browser, a user is able to access the entire functionality, including, but not limited to, the ability to perform a number of bioinformatics based analyses, collaborate between research groups through a user-based security model, and access to the raw data files and quality control files generated by the software used to extract the signals from an array image. 
Here, we present EDGE(3), an open-source, web-based application that allows for the storage, analysis, and controlled sharing of transcription-based microarray data generated on the Agilent DNA platform. In addition, EDGE(3) provides a means for managing RNA samples and arrays during the hybridization process. EDGE(3) is freely available for download at http://edge.oncology.wisc.edu/.
Viability of in-house datamarting approaches for population genetics analysis of SNP genotypes
Amigo, Jorge; Phillips, Christopher; Salas, Antonio; Carracedo, Ángel
2009-01-01
Background Databases containing very large amounts of SNP (Single Nucleotide Polymorphism) data are now freely available for researchers interested in medical and/or population genetics applications. While many of these SNP repositories have implemented data retrieval tools for general-purpose mining, these alone cannot cover the broad spectrum of needs of most medical and population genetics studies. Results To address this limitation, we have built in-house customized data marts from the raw data provided by the largest public databases. In particular, for population genetics analysis based on genotypes we have built a set of data processing scripts that deal with raw data coming from the major SNP variation databases (e.g. HapMap, Perlegen), stripping them into single genotypes and then grouping them into populations, then merged with additional complementary descriptive information extracted from dbSNP. This allows not only in-house standardization and normalization of the genotyping data retrieved from different repositories, but also the calculation of statistical indices from simple allele frequency estimates to more elaborate genetic differentiation tests within populations, together with the ability to combine population samples from different databases. Conclusion The present study demonstrates the viability of implementing scripts for handling extensive datasets of SNP genotypes with low computational costs, dealing with certain complex issues that arise from the divergent nature and configuration of the most popular SNP repositories. The information contained in these databases can also be enriched with additional information obtained from other complementary databases, in order to build a dedicated data mart. Updating the data structure is straightforward, as well as permitting easy implementation of new external data and the computation of supplementary statistical indices of interest. PMID:19344481
Viability of in-house datamarting approaches for population genetics analysis of SNP genotypes.
Amigo, Jorge; Phillips, Christopher; Salas, Antonio; Carracedo, Angel
2009-03-19
Databases containing very large amounts of SNP (Single Nucleotide Polymorphism) data are now freely available for researchers interested in medical and/or population genetics applications. While many of these SNP repositories have implemented data retrieval tools for general-purpose mining, these alone cannot cover the broad spectrum of needs of most medical and population genetics studies. To address this limitation, we have built in-house customized data marts from the raw data provided by the largest public databases. In particular, for population genetics analysis based on genotypes we have built a set of data processing scripts that deal with raw data coming from the major SNP variation databases (e.g. HapMap, Perlegen), stripping them into single genotypes and then grouping them into populations, then merged with additional complementary descriptive information extracted from dbSNP. This allows not only in-house standardization and normalization of the genotyping data retrieved from different repositories, but also the calculation of statistical indices from simple allele frequency estimates to more elaborate genetic differentiation tests within populations, together with the ability to combine population samples from different databases. The present study demonstrates the viability of implementing scripts for handling extensive datasets of SNP genotypes with low computational costs, dealing with certain complex issues that arise from the divergent nature and configuration of the most popular SNP repositories. The information contained in these databases can also be enriched with additional information obtained from other complementary databases, in order to build a dedicated data mart. Updating the data structure is straightforward, as well as permitting easy implementation of new external data and the computation of supplementary statistical indices of interest.
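The statistical indices mentioned in the abstract above, from simple allele frequency estimates to differentiation measures, can be sketched in a few lines. This is a minimal illustration only, not the authors' scripts: the function names, the two-character genotype strings such as "AG", the use of "N" for a missing call, and the equal-weight two-population Fst formula are all assumptions made for the example.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from genotype strings such as 'AG'.

    Assumption: each genotype is a two-character string and any call
    containing 'N' (or of the wrong length) is treated as missing.
    """
    counts = Counter()
    for g in genotypes:
        if len(g) != 2 or "N" in g:
            continue  # skip missing or malformed calls
        counts.update(g)  # count both alleles of the genotype
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()} if total else {}

def fst_two_populations(p1, p2):
    """Wright's Fst for one biallelic SNP, given the frequency of the same
    allele in two populations (equal population weights assumed)."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # total expected heterozygosity
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t

freqs = allele_frequencies(["AA", "AG", "GG", "NN"])
```

Keeping the frequency step separate from the differentiation step mirrors the idea of a data mart: genotypes are normalized once, and further indices are computed from the stored frequencies.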
Unravelling the Genetic Diversity among Cassava Bemisia tabaci Whiteflies Using NextRAD Sequencing.
Wosula, Everlyne N; Chen, Wenbo; Fei, Zhangjun; Legg, James P
2017-11-01
Bemisia tabaci threatens production of cassava in Africa through vectoring viruses that cause cassava mosaic disease (CMD) and cassava brown streak disease (CBSD). B. tabaci sampled from cassava in eight countries in Africa were genotyped using NextRAD sequencing, and their phylogeny and population genetics were investigated using the resultant single nucleotide polymorphism (SNP) markers. SNP marker data and short sequences of mitochondrial DNA cytochrome oxidase I (mtCOI) obtained from the same insect were compared. Eight genetically distinct groups were identified based on mtCOI, whereas phylogenetic analysis using SNPs identified six major groups, which were further confirmed by PCA and multidimensional analyses. STRUCTURE analysis identified four ancestral B. tabaci populations that have contributed alleles to the six SNP-based groups. Significant gene flows were detected between several of the six SNP-based groups. Evidence of gene flow was strongest for SNP-based groups occurring in central Africa. Comparison of the mtCOI and SNP identities of sampled insects provided a strong indication that hybrid populations are emerging in parts of Africa recently affected by the severe CMD pandemic. This study reveals that mtCOI is not an effective marker at distinguishing cassava-colonizing B. tabaci haplogroups, and that more robust SNP-based multilocus markers should be developed. Significant gene flows between populations could lead to the emergence of haplogroups that might alter the dynamics of cassava virus spread and disease severity in Africa. Continuous monitoring of genetic compositions of whitefly populations should be an essential component in efforts to combat cassava viruses in Africa. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Xiao, Shijun; Wang, Panpan; Dong, Linsong; Zhang, Yaguang; Han, Zhaofang; Wang, Qiurong
2016-01-01
Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-by-sequencing (GBS) provides an efficient reduced-representation method that lowers the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. We detected 69,845 high-quality SNP markers, evenly distributed along the genome, in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. Association studies with muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, which contributed up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in a teleost genome. The developed SNP markers and the EPA- and DHA-associated SNP loci provide invaluable resources for studies of population structure, conservation genetics and genomic selection in large yellow croaker and other fish. PMID:28028455
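The criterion that a marker be detected in at least 80% of individuals is a call-rate filter. The sketch below shows one way such a filter might look. It is not the authors' pipeline: the function name, the dictionary layout (SNP identifier mapped to a list of per-individual calls) and the use of None for a missing call are assumptions for illustration.

```python
def filter_by_call_rate(genotype_table, min_call_rate=0.8, missing=None):
    """Keep SNPs genotyped in at least `min_call_rate` of individuals.

    genotype_table: {snp_id: [call, call, ...]} with `missing` marking
    individuals in which the SNP was not detected (hypothetical layout).
    Returns {snp_id: call_rate} for the SNPs that pass.
    """
    kept = {}
    for snp, calls in genotype_table.items():
        if not calls:
            continue  # no individuals recorded for this SNP
        rate = sum(1 for c in calls if c != missing) / len(calls)
        if rate >= min_call_rate:
            kept[snp] = rate
    return kept

# Hypothetical toy data: five individuals, two SNPs.
table = {
    "snp1": ["AA", "AG", None, "GG", "AA"],   # called in 4 of 5
    "snp2": ["AA", None, None, "GG", "AA"],   # called in 3 of 5
}
passed = filter_by_call_rate(table)
```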
Wu, Xiaoping; Guldbrandtsen, Bernt; Lund, Mogens Sandø; Sahana, Goutam
2016-09-01
Identification of genetic variants associated with feet and legs disorders (FLD) will aid in the genetic improvement of these traits by providing knowledge on genes that influence trait variations. In Denmark, FLD in cattle has been recorded since the 1990s. In this report, we used deregressed breeding values as response variables for a genome-wide association study. Bulls (5,334 Danish Holstein, 4,237 Nordic Red Dairy Cattle, and 1,180 Danish Jersey) with deregressed estimated breeding values were genotyped with the Illumina Bovine 54k single nucleotide polymorphism (SNP) genotyping array. Genotypes were imputed to whole-genome sequence variants, and then 22,751,039 SNP on 29 autosomes were used for an association analysis. A modified linear mixed-model approach (efficient mixed-model association eXpedited, EMMAX) and a linear mixed model were used for association analysis. We identified 5 (3,854 SNP), 3 (13,642 SNP), and 0 quantitative trait locus (QTL) regions associated with the FLD index in Danish Holstein, Nordic Red Dairy Cattle, and Danish Jersey populations, respectively. We did not identify any QTL that were common among the 3 breeds. In a meta-analysis of the 3 breeds, 4 QTL regions were significant, but no additional QTL region was identified compared with within-breed analyses. Comparison between top SNP locations within these QTL regions and known genes suggested that RASGRP1, LCORL, MOS, and MITF may be candidate genes for FLD in dairy cattle. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Gupta, Surya; De Puysseleyr, Veronic; Van der Heyden, José; Maddelein, Davy; Lemmens, Irma; Lievens, Sam; Degroeve, Sven; Tavernier, Jan; Martens, Lennart
2017-05-01
Protein-protein interaction (PPI) studies have dramatically expanded our knowledge about cellular behaviour and development in different conditions. A multitude of high-throughput PPI techniques have been developed to achieve proteome-scale coverage for PPI studies, including the microarray based Mammalian Protein-Protein Interaction Trap (MAPPIT) system. Because such high-throughput techniques typically report thousands of interactions, managing and analysing the large amounts of acquired data is a challenge. We have therefore built the MAPPIT cell microArray Protein Protein Interaction-Data management & Analysis Tool (MAPPI-DAT) as an automated data management and analysis tool for MAPPIT cell microarray experiments. MAPPI-DAT stores the experimental data and metadata in a systematic and structured way, automates data analysis and interpretation, and enables the meta-analysis of MAPPIT cell microarray data across all stored experiments. MAPPI-DAT is developed in Python, using R for data analysis and MySQL as data management system. MAPPI-DAT is cross-platform and can be run on Microsoft Windows, Linux and OS X/macOS. The source code and a Microsoft Windows executable are freely available under the permissive Apache2 open source license at https://github.com/compomics/MAPPI-DAT. Contact: jan.tavernier@vib-ugent.be or lennart.martens@vib-ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
What is the study? This study is the first to use microarray analysis in the Ames strains of Salmonella. The microarray chips were custom-designed for this study and are not commercially available, and we evaluated the well-studied drinking water mutagen, MX. Because much inform...
MICROARRAY ANALYSIS OF DICHLOROACETIC ACID-INDUCED CHANGES IN GENE EXPRESSION
Dichloroacetic acid (DCA) is a major by-product of water disinfection by chlorination. Several studies have demonstrated the hepatocarcinogenicity of DCA in rodents when administered in dri...
Robust gene selection methods using weighting schemes for microarray data analysis.
Kang, Suyeon; Song, Jongwoo
2017-09-02
A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis.
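For orientation, the baseline statistic that the abstract above modifies can be written down directly. The sketch below computes the standard SAM relative difference for one gene, d = (mean1 - mean2) / (s + s0), where s is the pooled standard error and s0 is a small positive constant that stabilizes genes with low variance. It does not reproduce the weighting schemes proposed by the authors, which the abstract does not specify; the function name and the sign convention (first group minus second) are assumptions.

```python
import math
from statistics import mean

def sam_statistic(x, y, s0=0.0):
    """SAM-style relative difference for one gene.

    x, y: expression values in the two states (each needs >= 1 value and
          together at least 3 so that the pooled variance is defined).
    s0:   fudge constant added to the denominator; in practice it is
          chosen from the data, here it is simply passed in.
    """
    n1, n2 = len(x), len(y)
    m1, m2 = mean(x), mean(y)
    # Pooled sum of squared deviations across both groups.
    ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (m1 - m2) / (s + s0)

d_plain = sam_statistic([2, 4], [1, 3])                 # s0 = 0: ordinary t-like score
d_shrunk = sam_statistic([2, 4], [1, 3], s0=math.sqrt(2))
```

With s0 = 0 the score reduces to a two-sample t-statistic; increasing s0 shrinks the score, which is why SAM is less dominated by genes whose variance happens to be tiny.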
Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang
2008-01-01
Background Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large scale GWAS. Results The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large scale GWAS and achieved excellent scalability for large scale analysis and portability for various parallel computing platforms. EPISNP is the serial computing program based on the EPISNPmpi code for epistasis testing in small scale GWAS using commonly available operating systems and computer hardware. Three serial computing utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large scale GWAS, and the epiSNP serial computing programs are convenient tools for epistasis analysis in small scale GWAS using commonly available computer hardware. PMID:18644146
The application of DNA microarrays in gene expression analysis.
van Hal, N L; Vorst, O; van Houwelingen, A M; Kok, E J; Peijnenburg, A; Aharoni, A; van Tunen, A J; Keijer, J
2000-03-31
DNA microarray technology is a new and powerful technology that will substantially increase the speed of molecular biological research. This paper gives a survey of DNA microarray technology and its use in gene expression studies. The technical aspects and their potential improvements are discussed. These comprise array manufacturing and design, array hybridisation, scanning, and data handling. Furthermore, it is discussed how DNA microarrays can be applied in the working fields of: safety, functionality and health of food and gene discovery and pathway engineering in plants.
Implementation of mutual information and bayes theorem for classification microarray data
NASA Astrophysics Data System (ADS)
Dwifebri Purbolaksono, Mahendra; Widiastuti, Kurnia C.; Syahrul Mubarok, Mohamad; Adiwijaya; Aminy Ma’ruf, Firda
2018-03-01
Microarray technology is a technology that can read the structure of genes, and analysis of its output is essential for deciding which attributes are more important than others. Microarray data on a person's genes can provide information for cancer diagnosis. Preparing microarray data is a large and time-consuming problem because the data contain a high number of insignificant and irrelevant attributes. A method is therefore needed to reduce the dimension of microarray data without eliminating the important information in the attributes. This research uses Mutual Information to reduce the dimension. The system is built with a machine learning approach based on Bayes' theorem, which takes a statistical and probabilistic approach. Combining the two methods gives a powerful approach to microarray data classification. The experimental results show that the system classifies microarray data well, with the highest F1-scores being 91.06% using a Bayesian Network and 88.85% using Naïve Bayes.
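The dimension-reduction step described above ranks attributes by their mutual information with the class label and keeps the highest-scoring ones. The sketch below illustrates that idea for discrete values. It is not the authors' system: the function names are hypothetical, and it assumes expression values have already been discretized (for example into low/high), which the abstract does not state.

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """Mutual information I(X; Y) in bits between a discrete feature
    and discrete class labels of the same length."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    fx, fy = Counter(feature), Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (fx[x] * fy[y]))
    return mi

def top_k_features(columns, labels, k):
    """Return the names of the k features with the highest mutual information.

    columns: {feature_name: [value per sample]} (hypothetical layout).
    """
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

labels = [0, 0, 1, 1]
columns = {"g1": [0, 0, 1, 1],   # perfectly informative about the label
           "g2": [0, 1, 0, 1]}   # independent of the label
selected = top_k_features(columns, labels, k=1)
```

The selected columns would then be passed to a classifier such as Naïve Bayes; that step is omitted here to keep the example short.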
Gene ARMADA: an integrated multi-analysis platform for microarray data implemented in MATLAB.
Chatziioannou, Aristotelis; Moulos, Panagiotis; Kolisis, Fragiskos N
2009-10-27
The microarray data analysis realm is ever growing through the development of various tools, open source and commercial. However there is absence of predefined rational algorithmic analysis workflows or batch standardized processing to incorporate all steps, from raw data import up to the derivation of significantly differentially expressed gene lists. This absence obfuscates the analytical procedure and obstructs the massive comparative processing of genomic microarray datasets. Moreover, the solutions provided, heavily depend on the programming skills of the user, whereas in the case of GUI embedded solutions, they do not provide direct support of various raw image analysis formats or a versatile and simultaneously flexible combination of signal processing methods. We describe here Gene ARMADA (Automated Robust MicroArray Data Analysis), a MATLAB implemented platform with a Graphical User Interface. This suite integrates all steps of microarray data analysis including automated data import, noise correction and filtering, normalization, statistical selection of differentially expressed genes, clustering, classification and annotation. In its current version, Gene ARMADA fully supports 2 coloured cDNA and Affymetrix oligonucleotide arrays, plus custom arrays for which experimental details are given in tabular form (Excel spreadsheet, comma separated values, tab-delimited text formats). It also supports the analysis of already processed results through its versatile import editor. Besides being fully automated, Gene ARMADA incorporates numerous functionalities of the Statistics and Bioinformatics Toolboxes of MATLAB. In addition, it provides numerous visualization and exploration tools plus customizable export data formats for seamless integration by other analysis tools or MATLAB, for further processing. Gene ARMADA requires MATLAB 7.4 (R2007a) or higher and is also distributed as a stand-alone application with MATLAB Component Runtime. 
Gene ARMADA provides a highly adaptable, integrative, yet flexible tool which can be used for automated quality control, analysis, annotation and visualization of microarray data, constituting a starting point for further data interpretation and integration with numerous other tools.
Microarray data mining using Bioconductor packages.
Nie, Haisheng; Neerincx, Pieter B T; van der Poel, Jan; Ferrari, Francesco; Bicciato, Silvio; Leunissen, Jack A M; Groenen, Martien A M
2009-07-16
This paper describes the results of a Gene Ontology (GO) term enrichment analysis of chicken microarray data using the Bioconductor packages. By checking the enriched GO terms in three contrasts, MM8-PM8, MM8-MA8, and MM8-MM24, of the microarray data provided during this workshop, this analysis aimed to investigate the host reactions in chickens occurring shortly after a secondary challenge with either a homologous or heterologous species of Eimeria. The results of GO enrichment analysis using GO terms annotated to chicken genes and GO terms annotated to chicken-human orthologous genes were also compared. Furthermore, a locally adaptive statistical procedure (LAP) was performed to test differentially expressed chromosomal regions, rather than individual genes, in the chicken genome after Eimeria challenge. GO enrichment analysis identified significant (raw p-value < 0.05) GO terms for all three contrasts included in the analysis. Some of the GO terms were linked, in general, to primary or secondary immune responses, indicating that GO enrichment analysis is a useful approach to analyzing microarray data. The comparison of GO enrichment results using chicken gene information and chicken-human orthologous gene information showed more refined GO terms related to immune responses when using chicken-human orthologous gene information. This suggests that using chicken-human orthologous gene information gives higher power to detect significant GO terms with more refined functionality. Furthermore, three chromosome regions were identified as significantly up-regulated in contrast MM8-PM8 (q-value < 0.01). Overall, this paper describes a practical approach to analyzing microarray data in farm animals where the genome information is still incomplete.
For farm animals, such as chicken, with currently limited gene annotation, borrowing gene annotation information from orthologous genes in well-annotated species, such as human, will help improve the pathway analysis results substantially. Furthermore, the LAP analysis approach is a relatively new and very useful method for microarray analysis.
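GO term enrichment is commonly framed as an over-representation test: given N annotated genes of which K carry a GO term, how surprising is it to see k carriers among n differentially expressed genes? The sketch below computes the corresponding one-sided hypergeometric p-value. It is a generic illustration and an assumption on our part; the abstract does not say which test the Bioconductor packages applied in this analysis, and the function name is hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """One-sided enrichment p-value P(X >= k).

    N: number of genes in the annotated background
    K: background genes annotated with the GO term
    n: number of genes in the differentially expressed list
    k: genes in that list annotated with the GO term
    """
    upper = min(K, n)
    # Sum the hypergeometric probabilities for k, k+1, ..., upper.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical toy numbers: all 5 selected genes carry a term held by 5 of 10 genes.
p_raw = hypergeom_enrichment_p(N=10, K=5, n=5, k=5)
```

Raw p-values of this kind are what a threshold such as "raw p-value < 0.05" refers to; when many terms are tested, a multiple-testing correction is usually applied afterwards.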
Polyadenylation state microarray (PASTA) analysis.
Beilharz, Traude H; Preiss, Thomas
2011-01-01
Nearly all eukaryotic mRNAs terminate in a poly(A) tail that serves important roles in mRNA utilization. In the cytoplasm, the poly(A) tail promotes both mRNA stability and translation, and these functions are frequently regulated through changes in tail length. To identify the scope of poly(A) tail length control in a transcriptome, we developed the polyadenylation state microarray (PASTA) method. It involves the purification of mRNA based on poly(A) tail length using thermal elution from poly(U) sepharose, followed by microarray analysis of the resulting fractions. In this chapter we detail our PASTA approach and describe some methods for bulk and mRNA-specific poly(A) tail length measurements of use to monitor the procedure and independently verify the microarray data.
SNP discovery through de novo deep sequencing using the next generation of DNA sequencers
USDA-ARS's Scientific Manuscript database
The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....
Analysis of genetic diversity using SNP markers in oat
USDA-ARS's Scientific Manuscript database
A large-scale single nucleotide polymorphism (SNP) discovery was carried out in cultivated oat using Roche 454 sequencing methods. DNA sequences were generated from cDNAs originating from a panel of 20 diverse oat cultivars, and from Diversity Array Technology (DArT) genomic complexity reductions fr...
Zhao, Xu; Qin, Shengying; Shi, Yongyong; Zhang, Aiping; Zhang, Jing; Bian, Li; Wan, Chunling; Feng, Guoyin; Gu, Niufan; Zhang, Guangqi; He, Guang; He, Lin
2007-07-01
Several studies have suggested the dysfunction of the GABAergic system as a risk factor in the pathogenesis of schizophrenia. In the present study, case-control association analysis was conducted in four GABAergic genes: two glutamic acid decarboxylase genes (GAD1 and GAD2), a GABA(A) receptor subunit beta2 gene (GABRB2) and a GABA(B) receptor 1 gene (GABBR1). Using a universal DNA microarray procedure we genotyped a total of 20 SNPs on the above four genes in a study involving 292 patients and 286 controls of Chinese descent. Statistically significant differences were observed in the allelic frequencies of the rs187269C/T polymorphism in the GABRB2 gene (P=0.0450, chi(2)=12.40, OR=1.65) and the -292A/C polymorphism in the GAD1 gene (P=0.0450, chi(2)=14.64, OR=1.77). In addition, using an electrophoretic mobility shift assay (EMSA), we discovered differences in the U251 nuclear protein binding to oligonucleotides representing the -292 SNP on the GAD1 gene, which suggests that the -292C allele has reduced transcription factor binding efficiency compared with the -292A allele. Using the multifactor-dimensionality reduction method (MDR), we found that the interactions among the rs187269C/T polymorphism in the GABRB2 gene, the -243A/G polymorphism in the GAD2 gene and the 27379C/T and 661C/T polymorphisms in the GAD1 gene revealed a significant association with schizophrenia (P<0.001). These findings suggest that the GABRB2 and GAD1 genes alone and the combined effects of the polymorphisms in the four GABAergic system genes may confer susceptibility to the development of schizophrenia in the Chinese population.
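An allelic case-control comparison of the kind reported above reduces to a 2x2 table of allele counts, from which a Pearson chi-square statistic and an odds ratio follow. The sketch below shows that calculation. The function name and the allele counts are hypothetical, since the abstract does not report the underlying counts, and the p-value is the uncorrected one for 1 degree of freedom, which may differ from the corrected values that studies typically report.

```python
import math

def allelic_association(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df, no continuity correction), odds ratio and
    uncorrected p-value for a 2x2 table of allele counts.

    case_a / case_b: counts of allele A / allele B among cases
    ctrl_a / ctrl_b: counts of allele A / allele B among controls
    All four margins are assumed to be non-zero.
    """
    a, b, c, d = case_a, case_b, ctrl_a, ctrl_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, odds_ratio, p_value

# Hypothetical allele counts, chosen only to make the arithmetic easy to follow.
chi2, odds_ratio, p_value = allelic_association(30, 10, 20, 20)
```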
Samusik, Nikolay; Krukovskaya, Larisa; Meln, Irina; Shilov, Evgeny; Kozlov, Andrey P.
2013-01-01
PBOV1 is a known human protein-coding gene with an uncharacterized function. We have previously found that PBOV1 lacks orthologs in non-primate genomes and is expressed in a wide range of tumor types. Here we report that PBOV1 protein-coding sequence is human-specific and has originated de novo in the primate evolution through a series of frame-shift and stop codon mutations. We profiled PBOV1 expression in multiple cancer and normal tissue samples and found that it was expressed in 19 out of 34 tumors of various origins but completely lacked expression in any of the normal adult or fetal human tissues. We found that, unlike the cancer/testis antigens that are typically controlled by CpG island-containing promoters, PBOV1 was expressed from a GC-poor TATA-containing promoter which was not influenced by CpG demethylation and was inactive in testis. Our analysis of public microarray data suggests that PBOV1 activation in tumors could be dependent on the Hedgehog signaling pathway. Despite the recent de novo origin and the lack of identifiable functional signatures, a missense SNP in the PBOV1 coding sequence has been previously associated with an increased risk of breast cancer. Using publicly available microarray datasets, we found that high levels of PBOV1 expression in breast cancer and glioma samples were significantly associated with a positive outcome of the cancer disease. We also found that PBOV1 was highly expressed in primary but not in recurrent high-grade gliomas, suggesting the presence of a negative selection against PBOV1-expressing cancer cells. Our findings could contribute to the understanding of the mechanisms behind de novo gene origin and the possible role of tumors in this process. PMID:23418531
Microelectronic DNA assay for the detection of BRCA1 gene mutations
NASA Technical Reports Server (NTRS)
Chen, Hua; Han, Jie; Li, Jun; Meyyappan, Meyya
2004-01-01
Mutations in BRCA1 predispose carriers to breast, ovarian, prostate and colon cancer. The prognosis for survival depends on the stage at which the cancer is diagnosed. Reliable and rapid mutation detection is crucial for early diagnosis and treatment. We developed an electronic assay for the detection of a representative single nucleotide polymorphism (SNP), deletion and insertion in the BRCA1 gene using microelectronics microarray instrumentation. The assay is rapid: it takes 30 minutes for the immobilization of target DNA samples, hybridization, washing and readout. The assay can be multiplexed because each step is carried out at the same temperature and buffer conditions. The assay is also highly specific: the signal-to-noise ratio is much larger than the recommended value for homozygote genotyping (72.86 to 321.05 vs. 5), and the signal ratio is close to the ideal value of 1 for heterozygote genotyping (1.04).
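The genotyping logic implied by these figures can be sketched as a ratio test between the signals of the two allele-specific probes: a ratio of at least 5 in either direction indicates a homozygote, and a ratio near 1 indicates a heterozygote. The sketch is illustrative only. The function name, the interpretation of the reported ratio as wild-type signal over mutant signal, and the heterozygote band of 0.5 to 2.0 are assumptions; only the threshold of 5 and the ideal heterozygote value of 1 come from the abstract.

```python
def call_genotype(wt_signal, mut_signal, homo_ratio=5.0, het_band=(0.5, 2.0)):
    """Call a genotype from two allele-specific probe signals.

    wt_signal / mut_signal: background-corrected, strictly positive signals.
    homo_ratio: minimum signal ratio accepted for a homozygous call.
    het_band:   assumed range of ratios accepted as heterozygous.
    """
    ratio = wt_signal / mut_signal
    if ratio >= homo_ratio:
        return "homozygous wild-type"
    if ratio <= 1 / homo_ratio:
        return "homozygous mutant"
    if het_band[0] <= ratio <= het_band[1]:
        return "heterozygous"
    return "no call"  # ambiguous ratio between the two regimes

calls = [
    call_genotype(72.86, 1.0),   # ratio far above 5
    call_genotype(1.04, 1.0),    # ratio close to 1
    call_genotype(1.0, 321.05),  # ratio far below 1/5
    call_genotype(3.0, 1.0),     # neither clearly homozygous nor heterozygous
]
```

Leaving a "no call" zone between the two regimes is a design choice: it trades a few unclassified samples for fewer wrong calls.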
The Use of Atomic Force Microscopy for 3D Analysis of Nucleic Acid Hybridization on Microarrays.
Dubrovin, E V; Presnova, G V; Rubtsova, M Yu; Egorov, A M; Grigorenko, V G; Yaminsky, I V
2015-01-01
Oligonucleotide microarrays are considered today to be one of the most efficient methods of gene diagnostics. The capability of atomic force microscopy (AFM) to characterize the three-dimensional morphology of single molecules on a surface allows one to use it as an effective tool for the 3D analysis of a microarray for the detection of nucleic acids. The high resolution of AFM offers ways to decrease the detection threshold of target DNA and increase the signal-to-noise ratio. In this work, we suggest an approach to the evaluation of the results of hybridization of gold nanoparticle-labeled nucleic acids on silicon microarrays based on an AFM analysis of the surface, both in air and in liquid, which takes into account their three-dimensional structure. We suggest a quantitative measure of the hybridization results which is based on the fraction of the surface area occupied by the nanoparticles.
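A measure based on the fraction of surface area occupied by nanoparticles can be approximated from an AFM height map by counting the pixels that rise above a height threshold. The sketch below shows that idea in its simplest form. It is not the authors' procedure: the function name, the nested-list height map, and the choice of a single global threshold are assumptions, and real AFM images would normally need flattening and background subtraction first.

```python
def covered_fraction(height_map, threshold):
    """Fraction of pixels whose height exceeds `threshold`.

    height_map: 2D list of heights (same unit as `threshold`, e.g. nm).
    threshold:  assumed cut-off separating nanoparticles from the substrate.
    """
    total = sum(len(row) for row in height_map)
    if total == 0:
        raise ValueError("height_map contains no pixels")
    covered = sum(1 for row in height_map for h in row if h > threshold)
    return covered / total

# Hypothetical 2x2 scan: two pixels rise above a 5 nm threshold.
fraction = covered_fraction([[0.0, 10.0], [12.0, 1.0]], threshold=5.0)
```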
The Utility of Chromosomal Microarray Analysis in Developmental and Behavioral Pediatrics
ERIC Educational Resources Information Center
Beaudet, Arthur L.
2013-01-01
Chromosomal microarray analysis (CMA) has emerged as a powerful new tool to identify genomic abnormalities associated with a wide range of developmental disabilities including congenital malformations, cognitive impairment, and behavioral abnormalities. CMA includes array comparative genomic hybridization (CGH) and single nucleotide polymorphism…
2011-01-01
Background Cytogenetic evaluation is a key component of the diagnosis and prognosis of chronic lymphocytic leukemia (CLL). We performed oligonucleotide-based comparative genomic hybridization microarray analysis on 34 samples with CLL and known abnormal karyotypes previously determined by cytogenetics and/or fluorescence in situ hybridization (FISH). Results Using a custom designed microarray that targets >1800 genes involved in hematologic disease and other malignancies, we identified additional cryptic aberrations and novel findings in 59% of cases. These included gains and losses of genes associated with cell cycle regulation, apoptosis and susceptibility loci on 3p21.31, 5q35.2q35.3, 10q23.31q23.33, 11q22.3, and 22q11.23. Conclusions Our results show that microarray analysis will detect known aberrations, including microscopic and cryptic alterations. In addition, novel genomic changes will be uncovered that may become important prognostic predictors or treatment targets for CLL in the future. PMID:22087757
Nandi, Shyam Sundar; Sharma, Deepa Kailash; Deshpande, Jagadish M
2016-07-01
It is important to understand the role of cell surface receptors in susceptibility to infectious diseases. CD155, a member of the immunoglobulin superfamily, serves as the poliovirus receptor (PVR). A heterozygous (Ala67Thr) polymorphism in CD155 has been suggested as a risk factor for paralytic outcome of poliovirus infection. The present study pertains to the development of a screening test to detect the single nucleotide polymorphism (SNP) in the CD155 gene. New primers were designed for PCR, sequencing and SNP analysis of exon 2 of the CD155 gene. DNAs extracted from either whole blood (n=75) or cells from the oral cavity (n=75) were used for standardization and validation of the SNP assay. DNA sequencing was used as the gold standard method. A new SNP assay for detection of the heterozygous Ala67Thr genotype was developed and validated by testing 150 DNA samples. Heterozygous CD155 was detected in 27.33 per cent (41/150) of DNA samples tested by both the SNP detection assay and sequencing. The SNP detection assay was successfully developed for identification of the Ala67Thr polymorphism in the human PVR/CD155 gene. The SNP assay will be useful for large scale screening of DNA samples.
Andrews, Kimberly R; Adams, Jennifer R; Cassirer, E Frances; Plowright, Raina K; Gardner, Colby; Dwire, Maggie; Hohenlohe, Paul A; Waits, Lisette P
2018-06-05
The development of high-throughput sequencing technologies is dramatically increasing the use of single nucleotide polymorphisms (SNPs) across the field of genetics, but most parentage studies of wild populations still rely on microsatellites. We developed a bioinformatic pipeline for identifying SNP panels that are informative for parentage analysis from restriction site-associated DNA sequencing (RADseq) data. This pipeline includes options for analysis with or without a reference genome, and provides methods to maximize genotyping accuracy and select sets of unlinked loci that have high statistical power. We test this pipeline on small populations of Mexican gray wolf and bighorn sheep, for which parentage analyses are expected to be challenging due to low genetic diversity and the presence of many closely related individuals. We compare the results of parentage analysis across SNP panels generated with or without the use of a reference genome, and between SNPs and microsatellites. For Mexican gray wolf, we conducted parentage analyses for 30 pups from a single cohort where samples were available from 64% of possible mothers and 53% of possible fathers, and the accuracy of parentage assignments could be estimated because true identities of parents were known a priori based on field data. For bighorn sheep, we conducted maternity analyses for 39 lambs from five cohorts where 77% of possible mothers were sampled, but true identities of parents were unknown. Analyses with and without a reference genome produced SNP panels with >95% parentage assignment accuracy for Mexican gray wolf, outperforming microsatellites at 78% accuracy. Maternity assignments were completely consistent across all SNP panels for the bighorn sheep, and were 74.4% consistent with assignments from microsatellites. 
Accuracy and consistency of parentage analysis were not reduced when using as few as 284 SNPs for Mexican gray wolf and 142 SNPs for bighorn sheep, indicating our pipeline can be used to develop SNP genotyping assays for parentage analysis with relatively small numbers of loci. This article is protected by copyright. All rights reserved.
A novel approach to analyzing fMRI and SNP data via parallel independent component analysis
NASA Astrophysics Data System (ADS)
Liu, Jingyu; Pearlson, Godfrey; Calhoun, Vince; Windemuth, Andreas
2007-03-01
There is current interest in understanding genetic influences on brain function in both the healthy and the disordered brain. Parallel independent component analysis, a new method for analyzing multimodal data, is proposed in this paper and applied to functional magnetic resonance imaging (fMRI) and a single nucleotide polymorphism (SNP) array. The method aims to identify the independent components of each modality and the relationship between the two modalities. We analyzed 92 participants, including 29 schizophrenia (SZ) patients, 13 unaffected SZ relatives, and 50 healthy controls. We found a correlation of 0.79 between one fMRI component and one SNP component. The fMRI component consists of activations in cingulate gyrus, multiple frontal gyri, and superior temporal gyrus. The related SNP component is contributed to significantly by 9 SNPs located in sets of genes, including those coding for apolipoprotein A-I and C-III, malate dehydrogenase 1 and the gamma-aminobutyric acid alpha-2 receptor. A significant difference in the presence of this SNP component is found between the SZ group (SZ patients and their relatives) and the control group. In summary, we constructed a framework to identify the interactions between brain functional and genetic information; our findings provide new insight into understanding genetic influences on brain function in a common mental disorder.
Sallman, David A.; Basiorka, Ashley A.; Irvine, Brittany A.; Zhang, Ling; Epling-Burnette, P.K.; Rollison, Dana E.; Mallo, Mar; Sokol, Lubomir; Solé, Francesc; Maciejewski, Jaroslaw; List, Alan F.
2015-01-01
P53 is a key regulator of many cellular processes and is negatively regulated by the human homolog of murine double minute-2 (MDM2) E3 ubiquitin ligase. Single nucleotide polymorphisms (SNPs) of either gene alone, and in combination, are linked to cancer susceptibility, disease progression, and therapy response. We analyzed the interaction of TP53 R72P and MDM2 SNP309 SNPs in relationship to outcome in patients with myelodysplastic syndromes (MDS). Sanger sequencing was performed on DNA isolated from 208 MDS cases. Utilizing a novel functional SNP scoring system ranging from +2 to −2 based on predicted p53 activity, we found statistically significant differences in overall survival (OS) (p = 0.02) and progression-free survival (PFS) (p = 0.02) in non-del(5q) MDS patients with low functional scores. In univariate analysis, only IPSS and the functional SNP score predicted OS and PFS in non-del(5q) patients. In multivariate analysis, the functional SNP score was independent of IPSS for OS and PFS. These data underscore the importance of TP53 R72P and MDM2 SNP309 SNPs in MDS, and provide a novel scoring system independent of IPSS that is predictive for disease outcome. PMID:26416416
Zhang, RuiJie; Li, Xia; Jiang, YongShuai; Liu, GuiYou; Li, ChuanXing; Zhang, Fan; Xiao, Yun; Gong, BinSheng
2009-02-01
High-throughput single nucleotide polymorphism detection technology and existing knowledge provide strong support for mining disease-related haplotypes and genes. In this study, first, we apply four haplotype block identification methods (Confidence Intervals, Four Gamete Test, Solid Spine of LD and a fusing method of haplotype blocks) to high-throughput SNP genotype data to identify blocks, then use cluster analysis to verify the effectiveness of the four methods, and select the alcoholism-related SNP haplotypes through risk analysis. Second, we establish a mapping from haplotypes to alcoholism-related genes. Third, we query the NCBI SNP and gene databases to locate the blocks and identify the candidate genes. Finally, we annotate gene function using the KEGG, Biocarta, and GO databases. We find 159 haplotype blocks on chromosomes 1-22 that are most likely related to alcoholism, including 227 haplotypes, of which 102 SNP haplotypes may increase the risk of alcoholism. We obtain 121 alcoholism-related genes and verify their reliability by functional annotation. In short, combining our strategies for mining alcoholism-related haplotypes and genes with the existing knowledge framework makes it possible both to handle SNP data easily and to locate disease-related genes precisely.
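Of the block-finding methods named in this abstract, the four-gamete rule is the simplest to show in code. The sketch below is a simplified greedy version that assumes phased, biallelic haplotypes coded as strings of "0" and "1"; the function names are hypothetical, and production tools apply frequency cutoffs and other refinements that are omitted here.

```python
def four_gametes_present(haplotypes, i, j):
    """True if all four allele combinations occur at sites i and j.

    Seeing all four gametes at a pair of biallelic sites implies a
    historical recombination (or recurrent mutation) between them.
    """
    return len({(h[i], h[j]) for h in haplotypes}) == 4


def four_gamete_blocks(haplotypes):
    """Greedily split sites into blocks free of four-gamete pairs.

    Returns a list of (first_site, last_site) index pairs.
    """
    n_sites = len(haplotypes[0])
    blocks, start = [], 0
    for site in range(1, n_sites):
        # Close the current block if the new site conflicts with any site in it.
        if any(four_gametes_present(haplotypes, k, site) for k in range(start, site)):
            blocks.append((start, site - 1))
            start = site
    blocks.append((start, n_sites - 1))
    return blocks


# Toy phased haplotypes over three sites.
haps = ["000", "011", "110", "111"]
blocks = four_gamete_blocks(haps)  # sites 0 and 2 show all four gametes
```

In the toy data, sites 0 and 1 stay in one block while site 2 starts a new one, because the pair (0, 2) carries the gametes 00, 01, 10 and 11.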
Generalized Correlation Coefficient for Non-Parametric Analysis of Microarray Time-Course Data.
Tan, Qihua; Thomassen, Mads; Burton, Mark; Mose, Kristian Fredløv; Andersen, Klaus Ejner; Hjelmborg, Jacob; Kruse, Torben
2017-06-06
Modeling complex time-course patterns is a challenging issue in microarray studies due to complex gene expression patterns in response to the time-course experiment. We introduce the generalized correlation coefficient and propose a combinatory approach for detecting, testing and clustering heterogeneous time-course gene expression patterns. Application of the method identified nonlinear time-course patterns in high agreement with parametric analysis. We conclude that the non-parametric nature of the generalized correlation analysis could make it a useful and efficient tool for analyzing microarray time-course data and for exploring complex relationships in omics data to study their association with disease and health.
Baghbaderani, Behnam Ahmadian; Syama, Adhikarla; Sivapatham, Renuka; Pei, Ying; Mukherjee, Odity; Fellner, Thomas; Zeng, Xianmin; Rao, Mahendra S
2016-08-01
We have recently described manufacturing of human induced pluripotent stem cells (iPSC) master cell banks (MCB) generated by a clinically compliant process using cord blood as a starting material (Baghbaderani et al. in Stem Cell Reports, 5(4), 647-659, 2015). In this manuscript, we describe the detailed characterization of the two iPSC clones generated using this process, including whole genome sequencing (WGS), microarray, and comparative genomic hybridization (aCGH) single nucleotide polymorphism (SNP) analysis. We compare their profiles with a proposed calibration material and with a reporter subclone and lines made by a similar process from different donors. We believe that iPSCs are likely to be used to make multiple clinical products. We further believe that the lines used as input material will be used at different sites and, given their immortal status, will be used for many years or even decades. Therefore, it will be important to develop assays to monitor the state of the cells and their drift in culture. We suggest that a detailed characterization of the initial status of the cells, a comparison with some calibration material and the development of reporter subclones will help determine which set of tests will be most useful in monitoring the cells and establishing criteria for discarding a line.
Klein, Karl Martin; Pendziwiat, Manuela; Eilam, Anda; Gilad, Ronit; Blatt, Ilan; Rosenow, Felix; Kanaan, Moien; Helbig, Ingo; Afawi, Zaid
2017-07-01
Mutations or structural genomic alterations of the X-chromosomal gene ARHGEF9 have been described in male and female patients with intellectual disability. Hyperekplexia and epilepsy were observed to a variable degree, but incompletely described. Here, we expand the phenotypic spectrum of ARHGEF9 by describing a large Ethiopian-Jewish family with epilepsy and intellectual disability. The four affected male siblings, their unaffected parents and two unaffected female siblings were recruited and phenotyped. Parametric linkage analysis was performed using SNP microarrays. Variants from exome sequencing in two affected individuals were confirmed by Sanger sequencing. All affected male siblings had febrile seizures from age 2-3 years and intellectual disability. Three developed afebrile seizures between age 7-17 years. Three showed focal seizure semiology. None had hyperekplexia. A novel ARHGEF9 variant (c.967G>A, p.G323R, NM_015185.2) was hemizygous in all affected male siblings and heterozygous in the mother. This family reveals that the phenotypic spectrum of ARHGEF9 is broader than commonly assumed and includes febrile seizures and focal epilepsy with intellectual disability in the absence of hyperekplexia or other clinically distinguishing features. Our findings suggest that pathogenic variants in ARHGEF9 may be more common than previously assumed in patients with intellectual disability and mild epilepsy.
Huerta, Mario; Munyi, Marc; Expósito, David; Querol, Enric; Cedano, Juan
2014-06-15
The number of microarray experiments performed by scientific teams is growing exponentially. These microarray data could be useful for researchers around the world, but unfortunately they are underused. To fully exploit these data, it is necessary (i) to extract these data from a repository of high-throughput gene expression data like Gene Expression Omnibus (GEO) and (ii) to make the data from different microarrays comparable with tools that are easy for scientists to use. We have developed these two solutions in our server, implementing a database of microarray marker genes (Marker Genes Data Base). This database contains the marker genes of all GEO microarray datasets and it is updated monthly with the new microarrays from GEO. Thus, researchers can see whether the marker genes of their microarray are marker genes in other microarrays in the database, expanding the analysis of their microarray to the rest of the public microarrays. This solution helps not only to corroborate the conclusions regarding a researcher's microarray but also to identify the phenotype of different subsets of individuals under investigation, to frame the results with microarray experiments from other species, pathologies or tissues, to search for drugs that promote the transition between the studied phenotypes, to detect undesirable side effects of the treatment applied, etc. Thus, the researcher can quickly add relevant information to his/her studies from all of the previous analyses performed in other studies as long as they have been deposited in public repositories. Marker-gene database tool: http://ibb.uab.es/mgdb © The Author 2014. Published by Oxford University Press.
SNP-based genotyping in lentil: linking sequence information with phenotypes
USDA-ARS's Scientific Manuscript database
Lentil (Lens culinaris) has been late to enter the world of high throughput molecular analysis due to a general lack of genomic resources. Using a 454 sequencing-based approach, SNPs have been identified in genes across the lentil genome. Several hundred have been turned into single SNP KASP assay...
Kawaura, Kanako; Mochida, Keiichi; Yamazaki, Yukiko; Ogihara, Yasunari
2006-04-01
In this study, we constructed a 22k wheat oligo-DNA microarray. A total of 148,676 expressed sequence tags of common wheat were collected from the database of the Wheat Genomics Consortium of Japan. These were grouped into 34,064 contigs, which were then used to design an oligonucleotide DNA microarray. Following a multistep selection of the sense strand, 21,939 60-mer oligo-DNA probes were selected for attachment on the microarray slide. This 22k oligo-DNA microarray was used to examine the transcriptional response of wheat to salt stress. More than 95% of the probes gave reproducible hybridization signals when targeted with RNAs extracted from salt-treated wheat shoots and roots. With the microarray, we identified 1,811 genes whose expressions changed more than 2-fold in response to salt. These included genes known to mediate response to salt, as well as unknown genes, and they were classified into 12 major groups by hierarchical clustering. These gene expression patterns were also confirmed by real-time reverse transcription-PCR. Many of the genes with unknown function were clustered together with genes known to be involved in response to salt stress. Thus, analysis of gene expression patterns combined with gene ontology should help identify the function of the unknown genes. Also, functional analysis of these wheat genes should provide new insight into the response to salt stress. Finally, these results indicate that the 22k oligo-DNA microarray is a reliable method for monitoring global gene expression patterns in wheat.
Bălăcescu, Loredana; Bălăcescu, O; Crişan, N; Fetica, B; Petruţ, B; Bungărdean, Cătălina; Rus, Meda; Tudoran, Oana; Meurice, G; Irimie, Al; Dragoş, N; Berindan-Neagoe, Ioana
2011-01-01
Prostate cancer is the most common cancer in the western male population, with clinical behavior ranging from indolent to metastatic disease. Although many molecules and deregulated pathways are known, the molecular mechanisms involved in the development of prostate cancer are not fully understood. The aim of this study was to explore the molecular variation underlying prostate cancer, based on microarray analysis and bioinformatics approaches. Normal and prostate cancer tissues were collected by macrodissection from prostatectomy specimens. All prostate cancer specimens used in our study were Gleason score 7. A gene expression microarray (Agilent Technologies) was used for Whole Human Genome evaluation. The bioinformatics and functional analysis were based on Limma and Ingenuity software. The microarray analysis identified 1119 genes differentially expressed between prostate cancer and normal prostate, which were up- or down-regulated at least 2-fold. P-values were adjusted for multiple testing using the Benjamini-Hochberg method with a false discovery rate of 0.01. These genes were analyzed with Ingenuity Pathway Analysis software, which established 23 genetic networks. Our microarray results provide new information regarding the molecular networks in prostate cancer stratified as Gleason 7. These data highlight gene expression profiles for a better understanding of prostate cancer progression.
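The selection rule stated in this abstract (at least 2-fold change, Benjamini-Hochberg adjustment, false discovery rate of 0.01) can be sketched in plain Python. The study itself used Limma; the function names, the input layout and the toy values below are hypothetical and serve only to show how the two filters combine.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genes(genes, fdr=0.01, min_abs_log2fc=1.0):
    """Keep genes passing both the FDR and the fold-change filter.

    genes: list of (name, log2_fold_change, raw_p_value) tuples.
    min_abs_log2fc=1.0 corresponds to a 2-fold change in either direction.
    """
    adjusted = benjamini_hochberg([p for _, _, p in genes])
    return [name for (name, lfc, _), q in zip(genes, adjusted)
            if q <= fdr and abs(lfc) >= min_abs_log2fc]


# Toy input; values are made up for illustration.
toy = [("A", 2.0, 0.0001), ("B", 0.3, 0.0002), ("C", -1.5, 0.002), ("D", 1.2, 0.5)]
selected = select_genes(toy)
```

Gene B is dropped despite a small adjusted p-value because its change is below 2-fold, and gene D is dropped because it fails the FDR threshold.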
Tojo, Axel; Malm, Johan; Marko-Varga, György; Lilja, Hans; Laurell, Thomas
2014-01-01
Antibody microarrays have become widespread, but their use for quantitative analyses in clinical samples has not yet been established. We investigated an immunoassay based on nanoporous silicon antibody microarrays for quantification of total prostate-specific-antigen (PSA) in 80 clinical plasma samples, and provide quantitative data from a duplex microarray assay that simultaneously quantifies free and total PSA in plasma. To further develop the assay, the porous silicon chips were placed into a standard 96-well microtiter plate for higher throughput analysis. The samples analyzed by this quantitative microarray were 80 plasma samples obtained from men undergoing clinical PSA testing (dynamic range: 0.14-44ng/ml, LOD: 0.14ng/ml). The second dataset, measuring free PSA (dynamic range: 0.40-74.9ng/ml, LOD: 0.47ng/ml) and total PSA (dynamic range: 0.87-295ng/ml, LOD: 0.76ng/ml), was also obtained from the clinical routine. The reference for the quantification was a commercially available assay, the ProStatus PSA Free/Total DELFIA. In an analysis of 80 plasma samples the microarray platform performs well across the range of total PSA levels. This assay might have the potential to substitute for the large-scale microtiter plate format in diagnostic applications. The duplex assay paves the way for a future quantitative multiplex assay, which analyses several prostate cancer biomarkers simultaneously. PMID:22921878
Wimmer, Isabella; Tröscher, Anna R; Brunner, Florian; Rubino, Stephen J; Bien, Christian G; Weiner, Howard L; Lassmann, Hans; Bauer, Jan
2018-04-20
Formalin-fixed paraffin-embedded (FFPE) tissues are valuable resources commonly used in pathology. However, formalin fixation modifies nucleic acids, challenging the isolation of high-quality RNA for genetic profiling. Here, we assessed the feasibility and reliability of microarray studies analysing transcriptome data from fresh, fresh-frozen (FF) and FFPE tissues. We show that reproducible microarray data can be generated from only 2 ng FFPE-derived RNA. For RNA quality assessment, fragment size distribution (DV200) and qPCR proved most suitable. During RNA isolation, extending tissue lysis time to 10 hours reduced high-molecular-weight species, while additional incubation at 70 °C markedly increased RNA yields. Since FF- and FFPE-derived microarrays constitute different data entities, we used indirect measures to investigate gene signal variation and relative gene expression. Whole-genome analyses revealed high concordance rates, while review on a single-gene basis showed higher data variation in FFPE than FF arrays. Using an experimental model, gene set enrichment analysis (GSEA) of FFPE-derived microarrays and fresh tissue-derived RNA-Seq datasets yielded similarly affected pathways, confirming the applicability of FFPE tissue in global gene expression analysis. Our study provides a workflow comprising RNA isolation, quality assessment and microarray profiling using minimal RNA input, thus enabling hypothesis-generating pathway analyses from limited amounts of precious, pathologically significant FFPE tissues.
Chono, Makiko; Matsunaka, Hitoshi; Seki, Masako; Fujita, Masaya; Kiribuchi-Otobe, Chikako; Oda, Shunsuke; Kojima, Hisayo; Nakamura, Shingo
2015-01-01
In the wheat (Triticum aestivum L.) cultivar ‘Zenkoujikomugi’, a single nucleotide polymorphism (SNP) in the promoter of MOTHER OF FT AND TFL1 on chromosome 3A (MFT-3A) causes an increase in the level of gene expression, resulting in strong grain dormancy. We used a DNA marker to detect the ‘Zenkoujikomugi’-type (Zen-type) SNP and examined the genotype of MFT-3A in Japanese wheat varieties, and we found that 169 of 324 varieties carry the Zen-type SNP. In Japanese commercial varieties, the frequency of the Zen-type SNP was remarkably high in the southern part of Japan, but low in the northern part. To examine the relationship between MFT-3A genotype and grain dormancy, we performed a germination assay in three wheat-growing seasons. On average, the varieties carrying the Zen-type SNP showed stronger grain dormancy than the varieties carrying the non-Zen-type SNP. Among commercial cultivars, ‘Iwainodaichi’ (Kyushu), ‘Junreikomugi’ (Kinki-Chugoku-Shikoku), ‘Kinuhime’ (Kanto-Tokai), ‘Nebarigoshi’ (Tohoku-Hokuriku), and ‘Kitamoe’ (Hokkaido) showed the strongest grain dormancy in each geographical group, and all these varieties, except for ‘Kitamoe’, were found to carry the Zen-type SNP. In recent years, the number of varieties carrying the Zen-type SNP has increased in the Tohoku-Hokuriku region, but not in the Hokkaido region. PMID:25931984
Polymorphism in ovine ANXA9 gene and physic-chemical properties and the fraction of protein in milk.
Pecka-Kiełb, Ewa; Czerniawska-Piątkowska, Ewa; Kowalewska-Łuczak, Inga; Vasil, Milan
2018-04-16
Annexin A9 (ANXA9) is a specific fatty acid transport protein. The ANXA9 gene is expressed in various tissues, including secretory tissue and mammary glands. The association between three SNPs of the ANXA9 gene and sheep's milk composition was assessed. Genotype analysis was performed with the use of the PCR-RFLP method. The studied ANXA9 polymorphisms had the following MAF (Major Allele Frequency): SNP1: allele G 0.66; SNP2: allele G 0.54; SNP3: allele C 0.57. The study found the most desired profile of protein fractions, namely an increased kappa-casein fraction and a decreased level of whey protein in sheep's milk, for the SNP1 and SNP3 polymorphisms. Sheep with the SNP1 GA genotype had the highest (P <0.05) content of fat and dry matter in milk. ANXA9 gene polymorphism did not influence the levels of protein, lactose or urea in sheep's milk. The information contained in this study may be useful for determining the impact of the ANXA9 gene on sheep's milk. The ANXA9 SNP1 and SNP3 polymorphism results could be included in breeding programs to select sheep with the genotypes ensuring the highest kappa-casein levels in milk. However, it is worth conducting further research on ANXA9 and milk composition in larger herds of animals and various breeds of sheep. This article is protected by copyright. All rights reserved.
2016-01-01
Abstract Microarray gene expression data sets are jointly analyzed to increase statistical power. They could either be merged together or analyzed by meta-analysis. For a given ensemble of data sets, it cannot be foreseen which of these paradigms, merging or meta-analysis, works better. In this article, three joint analysis methods, Z-score normalization, ComBat and the inverse normal method (meta-analysis) were selected for survival prognosis and risk assessment of breast cancer patients. The methods were applied to eight microarray gene expression data sets, totaling 1324 patients with two clinical endpoints, overall survival and relapse-free survival. The performance derived from the joint analysis methods was evaluated using Cox regression for survival analysis and independent validation used as bias estimation. Overall, Z-score normalization had a better performance than ComBat and meta-analysis. Higher Area Under the Receiver Operating Characteristic curve and hazard ratio were also obtained when independent validation was used as bias estimation. With a lower time and memory complexity, Z-score normalization is a simple method for joint analysis of microarray gene expression data sets. The derived findings suggest further assessment of this method in future survival prediction and cancer classification applications. PMID:26504096
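Two of the joint analysis methods compared in this abstract are short enough to sketch. The code below assumes that Z-score normalization is applied to one gene's values within one data set before merging, and that the inverse normal method combines one-sided p-values from separate data sets with optional weights; the function names and the weighting choice are assumptions, and the article's exact variants may differ.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def zscore(values):
    """Standardize one gene's expression values within a single data set."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant vector")
    return [(v - mu) / sd for v in values]


def inverse_normal_combine(pvalues, weights=None):
    """Combine one-sided p-values (each strictly between 0 and 1).

    Each p-value is mapped to a normal score, the scores are summed with
    weights, and the sum is rescaled to unit variance.
    """
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(pvalues)
    z = sum(w * nd.inv_cdf(1.0 - p) for w, p in zip(weights, pvalues))
    z /= sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(z)


# Two data sets each giving p = 0.05 for the same gene.
combined = inverse_normal_combine([0.05, 0.05])
```

Merging after `zscore` pools the patients into one larger data set, whereas `inverse_normal_combine` keeps the data sets separate and pools only their test results, which is the distinction between merging and meta-analysis drawn in the abstract.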
Oligonucleotide microarrays are a powerful tool for unsupervised analysis of chemical impacts on biological systems. However, the lack of well annotated biological pathways for many aquatic organisms, including fish, and the poor power of microarray-based analyses to detect diffe...
Lutkowska, Anna; Roszak, Andrzej; Lianeri, Margarita; Sowińska, Anna; Sotiri, Emianka; Jagodziński, Pawel P
2017-04-01
We studied the role of the NC_000017.10:g.38051348A>G (rs8067378) single nucleotide polymorphism (SNP) located 9.5 kb downstream of gasdermin B (GSDMB), in the development and progression of cervical squamous cell carcinomas (SCC). Using high-resolution melting curve analysis, we genotyped this SNP in patients with cervical SCC (n = 486) and controls (n = 511) from the Polish Caucasian population. Logistic regression analysis was used to adjust for the effect of confounders such as age, parity, oral contraceptive use, tobacco smoking, and menopausal status. The effect of this SNP on the expression of GSDMB was studied by reverse transcription and quantitative real-time polymerase chain reaction analysis of GSDMB transcript levels in SCC tissues. For all patients with SCC, the p trend value calculated for rs8067378 was statistically significant (p trend = 0.0019). The adjusted odds ratio for the G/G vs. A/A genotype was 1.304 (95% confidence interval 1.080-1.574, p = 0.0057) and the adjusted odds ratio for the G/A + G/G vs. A/A genotype was 1.444 (95% confidence interval 1.064-1.959, p = 0.0181). We also found a significant association of the rs8067378 SNP with tumor stages III, IV, and grade of differentiation G3, and with parity, oral contraceptive use, smoking, and women of postmenopausal age. We found increased GSDMB1 isoform transcripts in the cancerous and non-cancerous tissues from carriers of the G allele vs. carriers of the A/A genotype. The rs8067378 SNP variants may increase the expression of GSDMB and the risk of the development and progression of cervical SCC.
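For readers unfamiliar with the odds ratios and confidence intervals reported in this abstract, the sketch below computes an unadjusted odds ratio with a Woolf (log-scale) 95% interval from a 2 x 2 table. The study reported odds ratios adjusted by logistic regression, which this does not reproduce; the function name and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf confidence interval.

    All four cell counts must be positive. z=1.96 gives a 95% interval.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Made-up counts: carriers vs. non-carriers among cases and controls.
or_value, lower, upper = odds_ratio_ci(20, 10, 10, 20)
```

An interval that excludes 1, as in the intervals quoted above, indicates an association at the chosen confidence level.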
Ogunnaike, Babatunde A; Gelmi, Claudio A; Edwards, Jeremy S
2010-05-21
Gene expression studies generate large quantities of data with the defining characteristic that the number of genes (whose expression profiles are to be determined) exceeds the number of available replicates by several orders of magnitude. Standard spot-by-spot analysis still seeks to extract useful information for each gene on the basis of the number of available replicates, and thus plays to the weakness of microarrays. On the other hand, because of the data volume, treating the entire data set as an ensemble, and developing theoretical distributions for these ensembles provides a framework that plays instead to the strength of microarrays. We present theoretical results that under reasonable assumptions, the distribution of microarray intensities follows the Gamma model, with the biological interpretations of the model parameters emerging naturally. We subsequently establish that for each microarray data set, the fractional intensities can be represented as a mixture of Beta densities, and develop a procedure for using these results to draw statistical inference regarding differential gene expression. We illustrate the results with experimental data from gene expression studies on Deinococcus radiodurans following DNA damage using cDNA microarrays. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
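The step from Gamma-distributed intensities to Beta-distributed fractional intensities rests on a standard probability result, shown here for orientation. The notation is an assumption: X and Y stand for the two channel intensities of one spot, and the fractional intensity is taken to be X/(X+Y); the abstract does not give the authors' exact definitions.

```latex
% Assumed setting: independent channel intensities with a common scale \theta.
X \sim \mathrm{Gamma}(\alpha_1, \theta), \qquad
Y \sim \mathrm{Gamma}(\alpha_2, \theta), \qquad X \perp Y
\;\Longrightarrow\;
F = \frac{X}{X+Y} \sim \mathrm{Beta}(\alpha_1, \alpha_2),
\qquad
f_F(u) = \frac{\Gamma(\alpha_1+\alpha_2)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}\,
         u^{\alpha_1-1}(1-u)^{\alpha_2-1}, \quad 0 < u < 1 .
```

A mixture of such Beta densities would then arise if genes fall into several groups with different shape parameters, for example up-regulated, down-regulated and unchanged genes; that grouping is an illustrative reading, not a statement of the authors' model.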
Microarray-based screening of heat shock protein inhibitors.
Schax, Emilia; Walter, Johanna-Gabriela; Märzhäuser, Helene; Stahl, Frank; Scheper, Thomas; Agard, David A; Eichner, Simone; Kirschning, Andreas; Zeilinger, Carsten
2014-06-20
Based on the importance of heat shock proteins (HSPs) in diseases such as cancer, Alzheimer's disease or malaria, inhibitors of these chaperones are needed. Today's state-of-the-art techniques to identify HSP inhibitors are performed in microplate format, requiring large amounts of proteins and potential inhibitors. In contrast, we have developed a miniaturized protein microarray-based assay to identify novel inhibitors, allowing analysis with 300 pmol of protein. The assay is based on competitive binding of fluorescence-labeled ATP and potential inhibitors to the ATP-binding site of HSP. Therefore, the developed microarray enables the parallel analysis of different ATP-binding proteins on a single microarray. We have demonstrated the possibility of multiplexing by immobilizing full-length human HSP90α and HtpG of Helicobacter pylori on microarrays. Fluorescence-labeled ATP was competed by novel geldanamycin/reblastatin derivatives with IC50 values in the range of 0.5 nM to 4 μM and Z(*)-factors between 0.60 and 0.96. Our results demonstrate the potential of a target-oriented multiplexed protein microarray to identify novel inhibitors for different members of the HSP90 family. Copyright © 2014 Elsevier B.V. All rights reserved.
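The assay quality values quoted in this abstract appear to be Z'-factors, the usual screening statistic computed from positive and negative control wells; that reading of "Z(*)" is an assumption. A minimal sketch of the standard formula, with hypothetical control readings:

```python
from statistics import mean, stdev


def z_prime_factor(positive_controls, negative_controls):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|.

    Uses sample standard deviations; each control group needs at least
    two readings and the two means must differ.
    """
    separation = abs(mean(positive_controls) - mean(negative_controls))
    if separation == 0:
        raise ValueError("control means are identical")
    spread = stdev(positive_controls) + stdev(negative_controls)
    return 1.0 - 3.0 * spread / separation


# Made-up fluorescence readings for uninhibited and fully inhibited spots.
quality = z_prime_factor([100.0, 102.0, 98.0], [10.0, 12.0, 8.0])
```

Values above roughly 0.5 are conventionally read as a well-separated assay window, which is consistent with the range of 0.60 to 0.96 reported above.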
Automatic Identification and Quantification of Extra-Well Fluorescence in Microarray Images.
Rivera, Robert; Wang, Jie; Yu, Xiaobo; Demirkan, Gokhan; Hopper, Marika; Bian, Xiaofang; Tahsin, Tasnia; Magee, D Mitchell; Qiu, Ji; LaBaer, Joshua; Wallstrom, Garrick
2017-11-03
In recent studies involving NAPPA microarrays, extra-well fluorescence is used as a key measure for identifying disease biomarkers because there is evidence to support that it is better correlated with strong antibody responses than statistical analysis involving intraspot intensity. Because this feature is not well quantified by traditional image analysis software, identification and quantification of extra-well fluorescence is performed manually, which is both time-consuming and highly susceptible to variation between raters. A system that could automate this task efficiently and effectively would greatly improve the process of data acquisition in microarray studies, thereby accelerating the discovery of disease biomarkers. In this study, we experimented with different machine learning methods, as well as novel heuristics, for identifying spots exhibiting extra-well fluorescence (rings) in microarray images and assigning each ring a grade of 1-5 based on its intensity and morphology. The sensitivity of our final system for identifying rings was found to be 72% at 99% specificity and 98% at 92% specificity. Our system performs this task significantly faster than a human, while maintaining high performance, and therefore represents a valuable tool for microarray image analysis.
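The paired sensitivity and specificity figures in this abstract are operating points of a binary detector. The sketch below shows how such a pair is computed from manual labels and system output; the function name and the toy labels are hypothetical and unrelated to the study's data.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for two equal-length boolean lists.

    truth[i] is True if spot i really shows a ring; predicted[i] is the
    system's call for the same spot.
    """
    if len(truth) != len(predicted):
        raise ValueError("label lists must have equal length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative spot")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels: 3 true rings and 5 ordinary spots.
truth = [True, True, True, False, False, False, False, False]
calls = [True, True, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(truth, calls)
```

Moving the classifier's decision threshold trades one quantity against the other, which is why the abstract reports two pairs rather than a single figure.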
Thermodynamically optimal whole-genome tiling microarray design and validation.
Cho, Hyejin; Chou, Hui-Hsien
2016-06-13
Microarrays are an efficient tool for interrogating the whole transcriptome of a species. Microarrays can be designed according to annotated gene sets, but the resulting microarrays cannot be used to identify novel transcripts, and this design method is not applicable to unannotated species. Alternatively, a whole-genome tiling microarray can be designed using only genomic sequences without gene annotations, and it can be used to detect novel RNA transcripts as well as known genes. The difficulty with tiling microarray design lies in the tradeoff between probe specificity and coverage of the genome. Sequence comparison methods based on BLAST or similar software are commonly employed in microarray design, but they cannot precisely determine the subtle thermodynamic competition between probe targets and partially matched probe nontargets during hybridization. Using the whole-genome thermodynamic analysis software PICKY to design tiling microarrays, we can achieve the maximum whole-genome coverage allowable under the thermodynamic constraints of each target genome. The resulting tiling microarrays are thermodynamically optimal in the sense that all selected probes share the same melting temperature separation range between their targets and closest nontargets, and no additional probes can be added without violating the specificity of the microarray to the target genome. This new design method was used to create two whole-genome tiling microarrays for Escherichia coli MG1655 and Agrobacterium tumefaciens C58, and the experimental results validated the design.
Espin-Garcia, Osvaldo; Craiu, Radu V; Bull, Shelley B
2018-02-01
We evaluate two-phase designs to follow-up findings from genome-wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation-maximization-based inference under a semiparametric maximum likelihood formulation tailored for post-GWAS inference. A GWAS-SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT-SNP-dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme-QT strata yields significant power improvements compared to marginal QT- or SNP-based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure. © 2017 The Authors. Genetic Epidemiology Published by Wiley Periodicals, Inc.
Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon
2015-01-01
A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the coupled use of analyses of coding region SNP markers and control region mutation motifs. In this study, we tested whether the same triple multiplex analysis of coding region SNPs is also applicable to ancient samples from East Asia as a complement to sequence analysis of the mtDNA control region. In Joseon skeleton samples, the mtDNA haplogroup determined by coding region SNP markers fell within the same haplogroup assigned by sequence analysis of the control region. Considering that control region mtDNA sequencing of ancient samples in previous studies produced a considerable number of errors, coding region SNP analysis can serve as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods. PMID:26345190
Howard, Nicholas P; van de Weg, Eric; Bedford, David S; Peace, Cameron P; Vanderzande, Stijn; Clark, Matthew D; Teh, Soon Li; Cai, Lichun; Luby, James J
2017-01-01
The apple (Malus×domestica) cultivar Honeycrisp has become important economically and as a breeding parent. An earlier study with SSR markers indicated the original recorded pedigree of ‘Honeycrisp’ was incorrect and ‘Keepsake’ was identified as one putative parent, the other being unknown. The objective of this study was to verify ‘Keepsake’ as a parent and identify and genetically describe the unknown parent and its grandparents. A multi-family based dense and high-quality integrated SNP map was created using the apple 8 K Illumina Infinium SNP array. This map was used alongside a large pedigree-connected data set from the RosBREED project to build extended SNP haplotypes and to identify pedigree relationships. ‘Keepsake’ was verified as one parent of ‘Honeycrisp’ and ‘Duchess of Oldenburg’ and ‘Golden Delicious’ were identified as grandparents through the unknown parent. Following this finding, siblings of ‘Honeycrisp’ were identified using the SNP data. Breeding records from several of these siblings suggested that the previously unreported parent is a University of Minnesota selection, MN1627. This selection is no longer available, but now is genetically described through imputed SNP haplotypes. We also present the mosaic grandparental composition of ‘Honeycrisp’ for each of its 17 chromosome pairs. This new pedigree and genetic information will be useful in future pedigree-based genetic studies to connect ‘Honeycrisp’ with other cultivars used widely in apple breeding programs. The created SNP linkage map will benefit future research using the data from the Illumina apple 8 and 20 K and Affymetrix 480 K SNP arrays. PMID:28243452
Galehdari, Hamid; Saki, Najmaldin; Mohammadi-Asl, Javad; Rahim, Fakher
2013-01-01
Crigler-Najjar syndrome (CNS) type I and type II are usually inherited as autosomal recessive conditions that result from mutations in the UGT1A1 gene. The main objective of the present review is to summarize all available evidence on the accuracy of SNP-based pathogenicity detection tools, compared with published clinical results, for predicting disease-causing nsSNPs, using prediction performance measures. A comprehensive search was performed to find all mutations related to CNS. Database searches included dbSNP, SNPdbe, HGMD, Swissvar, Ensembl, and OMIM. All mutations related to CNS were extracted. Pathogenicity was predicted using SNP-based pathogenicity detection tools including SIFT, PHD-SNP, PolyPhen2, fathmm, Provean, and Mutpred. Overall, 59 different SNPs related to missense mutations in the UGT1A1 gene were reviewed. Comparing diagnostic odds ratios (OR), PolyPhen2 and Mutpred had the highest value, 4.983 (95% CI: 1.24 - 20.02) for both, followed by SIFT (diagnostic OR: 3.25, 95% CI: 1.07 - 9.83). The highest Matthews correlation coefficient (MCC) among the tools belonged to SIFT (34.19%), followed by Provean, PolyPhen2, and Mutpred (29.99%, 29.89%, and 29.89%, respectively). Likewise, the highest accuracy (ACC) belonged to SIFT (62.71%), followed by PolyPhen2 and Mutpred (61.02% for both). Our results suggest that some of the well-established SNP-based pathogenicity detection tools can appropriately reflect the role of a disease-associated SNP in both local and global structures.
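The three performance measures compared above (diagnostic OR, MCC and ACC) can all be derived from a 2x2 table of tool predictions against clinical classification. The following is a minimal sketch under stated assumptions: the function name and the counts are hypothetical, and the confidence interval uses the common log-scale (Woolf) approximation, which may differ from the method used in the review.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Return accuracy, MCC, diagnostic odds ratio and its 95% CI."""
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive for the odds ratio")
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    dor = (tp * tn) / (fp * fn)
    # Standard error of log(DOR); 1.96 gives an approximate 95% interval.
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    ci = (math.exp(math.log(dor) - 1.96 * se),
          math.exp(math.log(dor) + 1.96 * se))
    return {"accuracy": accuracy, "mcc": mcc, "dor": dor, "dor_ci95": ci}


# Hypothetical counts, not the counts behind the figures in the abstract.
result = diagnostic_metrics(tp=40, fp=10, fn=10, tn=40)
print(result)
```

Reporting MCC alongside accuracy is useful here because MCC stays informative when pathogenic and benign variants are unevenly represented.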
SNP-VISTA: An interactive SNP visualization tool
Shah, Nameeta; Teplitsky, Michael V; Minovitsky, Simon; Pennacchio, Len A; Hugenholtz, Philip; Hamann, Bernd; Dubchak, Inna L
2005-01-01
Background Recent advances in sequencing technologies promise to provide a better understanding of the genetics of human disease as well as the evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it has become possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease in an attempt to identify causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples enables more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at [1]. Results We have developed and present two modifications of an interactive visualization tool, SNP-VISTA, to aid in the analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein evolutionary conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. Conclusion The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNP data by the user. PMID:16336665
Usadel, Björn; Nagel, Axel; Steinhauser, Dirk; Gibon, Yves; Bläsing, Oliver E; Redestig, Henning; Sreenivasulu, Nese; Krall, Leonard; Hannah, Matthew A; Poree, Fabien; Fernie, Alisdair R; Stitt, Mark
2006-12-18
Microarray technology has become a widely accepted and standardized tool in biology. The first microarray data analysis programs were developed to support pair-wise comparison. However, as microarray experiments have become more routine, large scale experiments have become more common, which investigate multiple time points or sets of mutants or transgenics. To extract biological information from such high-throughput expression data, it is necessary to develop efficient analytical platforms, which combine manually curated gene ontologies with efficient visualization and navigation tools. Currently, most tools focus on a few limited biological aspects, rather than offering a holistic, integrated analysis. Here we introduce PageMan, a multiplatform, user-friendly, and stand-alone software tool that annotates, investigates, and condenses high-throughput microarray data in the context of functional ontologies. It includes a GUI tool to transform different ontologies into a suitable format, enabling the user to compare and choose between different ontologies. It is equipped with several statistical modules for data analysis, including over-representation analysis and Wilcoxon statistical testing. Results are exported in a graphical format for direct use, or for further editing in graphics programs. PageMan provides a fast overview of single treatments, allows genome-level responses to be compared across several microarray experiments covering, for example, stress responses at multiple time points. This aids in searching for trait-specific changes in pathways using mutants or transgenics, analyzing development time-courses, and comparison between species. In a case study, we analyze the results of publicly available microarrays of multiple cold stress experiments using PageMan, and compare the results to a previously published meta-analysis. PageMan offers a complete user's guide, a web-based over-representation analysis as well as a tutorial, and is freely available at http://mapman.mpimp-golm.mpg.de/pageman/. PageMan allows multiple microarray experiments to be efficiently condensed into a single page graphical display. The flexible interface allows data to be quickly and easily visualized, facilitating comparisons within experiments and to published experiments, thus enabling researchers to gain a rapid overview of the biological responses in the experiments.
Genomic resources for Myzus persicae: EST sequencing, SNP identification, and microarray design
Ramsey, John S; Wilson, Alex CC; de Vos, Martin; Sun, Qi; Tamborindeguy, Cecilia; Winfield, Agnese; Malloch, Gaynor; Smith, Dawn M; Fenton, Brian; Gray, Stewart M; Jander, Georg
2007-01-01
Background The green peach aphid, Myzus persicae (Sulzer), is a world-wide insect pest capable of infesting more than 40 plant families, including many crop species. However, despite the significant damage inflicted by M. persicae in agricultural systems through direct feeding damage and by its ability to transmit plant viruses, limited genomic information is available for this species. Results Sequencing of 16 M. persicae cDNA libraries generated 26,669 expressed sequence tags (ESTs). Aphids for library construction were raised on Arabidopsis thaliana, Nicotiana benthamiana, Brassica oleracea, B. napus, and Physalis floridana (with and without Potato leafroll virus infection). The M. persicae cDNA libraries include ones made from sexual and asexual whole aphids, guts, heads, and salivary glands. In silico comparison of cDNA libraries identified aphid genes with tissue-specific expression patterns, and gene expression that is induced by feeding on Nicotiana benthamiana. Furthermore, 2423 genes that are novel to science and potentially aphid-specific were identified. Comparison of cDNA data from three aphid lineages identified single nucleotide polymorphisms that can be used as genetic markers and, in some cases, may represent functional differences in the protein products. In particular, non-conservative amino acid substitutions in a highly expressed gut protease may be of adaptive significance for M. persicae feeding on different host plants. The Agilent eArray platform was used to design an M. persicae oligonucleotide microarray representing over 10,000 unique genes. Conclusion New genomic resources have been developed for M. persicae, an agriculturally important insect pest. These include previously unknown sequence data, a collection of expressed genes, molecular markers, and a DNA microarray that can be used to study aphid gene expression. These resources will help elucidate the adaptations that allow M. persicae to develop compatible interactions with its host plants, complementing ongoing work illuminating plant molecular responses to phloem-feeding insects. PMID:18021414
Identification of candidate genes in osteoporosis by integrated microarray analysis.
Li, J J; Wang, B Q; Fei, Q; Yang, Y; Li, D
2016-12-01
In order to screen the altered gene expression profile in peripheral blood mononuclear cells of patients with osteoporosis, we performed an integrated analysis of the online microarray studies of osteoporosis. We searched the Gene Expression Omnibus (GEO) database for microarray studies of peripheral blood mononuclear cells in patients with osteoporosis. Subsequently, we integrated gene expression data sets from multiple microarray studies to obtain differentially expressed genes (DEGs) between patients with osteoporosis and normal controls. Gene function analysis was performed to uncover the functions of identified DEGs. A total of three microarray studies were selected for integrated analysis. In all, 1125 genes were found to be significantly differentially expressed between osteoporosis patients and normal controls, with 373 upregulated and 752 downregulated genes. Positive regulation of the cellular amino metabolic process (gene ontology (GO): 0033240, false discovery rate (FDR) = 1.00E + 00) was significantly enriched under the GO category for biological processes, while for molecular functions, flavin adenine dinucleotide binding (GO: 0050660, FDR = 3.66E-01) and androgen receptor binding (GO: 0050681, FDR = 6.35E-01) were significantly enriched. DEGs were enriched in many osteoporosis-related signalling pathways, including those of mitogen-activated protein kinase (MAPK) and calcium. Protein-protein interaction (PPI) network analysis showed that the significant hub proteins contained ubiquitin specific peptidase 9, X-linked (Degree = 99), ubiquitin specific peptidase 19 (Degree = 57) and ubiquitin conjugating enzyme E2 B (Degree = 57). Analysis of gene function of identified differentially expressed genes may expand our understanding of fundamental mechanisms leading to osteoporosis. Moreover, significantly enriched pathways, such as MAPK and calcium, may be involved in osteoporosis through osteoblastic differentiation and bone formation. Cite this article: J. J. Li, B. Q. Wang, Q. Fei, Y. Yang, D. Li. Identification of candidate genes in osteoporosis by integrated microarray analysis. Bone Joint Res 2016;5:594-601. DOI: 10.1302/2046-3758.512.BJR-2016-0073.R1. © 2016 Fei et al.
Explaining the disease phenotype of intergenic SNP through predicted long range regulation
Chen, Jingqi; Tian, Weidong
2016-01-01
Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or approximate to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it's likely that IGR daSNPs may contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. PMID:27280978
USDA-ARS's Scientific Manuscript database
Evidence for the impact of mislabeling and/or pollen contamination on consistency of field performance has been lacking to reinforce the need for strict adherence to quality control protocols in cacao seed garden and germplasm plot management. The present study used SNP fingerprinting at 64 loci to ...
The Impact of a Common MDM2 SNP on the Sensitivity of Breast Cancer to Treatment
2011-06-01
Kirchhoff T, Alexe G, Bond EE, Robins H, Bartel F, Taubert H, Wuerl P, Hait W, Toppmeyer D, Offit K, and Levine A. MDM2 SNP309 accelerates tumor...the Western blot analysis corresponding to the quantification in the upper graphs . 29 Figure 5. Effect of
2013-01-02
intensity data from the SNP array were normalized using the Affymetrix GeneChip Targeted Genotyping Analysis Software ( GTGS ). To assess robustness of SNP...calls, genotypes were called using three algorithms: (i) GTGS , (ii) illuminus (27), and (iii) a heuristic algorithm based on discrete cutoffs of
Workflows for microarray data processing in the Kepler environment.
Stropp, Thomas; McPhillips, Timothy; Ludäscher, Bertram; Bieda, Mark
2012-05-17
Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. 
These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R/BioConductor scripting approaches to pipeline design. Finally, we suggest that microarray data processing task workflows may provide a basis for future example-based comparison of different workflow systems. We provide a set of tools and complete workflows for microarray data analysis in the Kepler environment, which has the advantages of offering graphical, clear display of conceptual steps and parameters and the ability to easily integrate other resources such as remote data and web services.
Lomonaco, Sara; Furumoto, Emily J; Loquasto, Joseph R; Morra, Patrizia; Grassi, Ausilia; Roberts, Robert F
2015-02-01
Identification at the genus, species, and strain levels is desirable when a probiotic microorganism is added to foods. Strains of Bifidobacterium animalis ssp. lactis (BAL) are commonly used worldwide in dairy products supplemented with probiotic strains. However, strain discrimination is difficult because of the high degree of genome identity (99.975%) between different genomes of this subspecies. Typing of monomorphic species can be carried out efficiently by targeting informative single nucleotide polymorphisms (SNP). Findings from a previous study analyzing both reference and commercial strains of BAL identified SNP that could be used to discriminate common strains into 8 groups. This paper describes development of a minisequencing assay based on the primer extension reaction (PER) targeting multiple SNP that can allow strain differentiation of BAL. Based on previous data, 6 informative SNP were selected for further testing, and a multiplex preliminary PCR was optimized to amplify the DNA regions containing the selected SNP. Extension primers (EP) annealing immediately adjacent to the selected SNP were developed and tested in simplex and multiplex PER to evaluate their performance. Twenty-five strains belonging to 9 distinct genomic clusters of B. animalis ssp. lactis were selected and analyzed using the developed minisequencing assay, simultaneously targeting the 6 selected SNP. Fragment analysis was subsequently carried out in duplicate and demonstrated that the assay yielded 8 specific profiles separating the most commonly used commercial strains. This novel multiplex PER approach provides a simple, rapid, flexible SNP-based subtyping method for proper characterization and identification of commercial probiotic strains of BAL from fermented dairy products. To assess the usefulness of this method, DNA was extracted from yogurt manufactured with and without the addition of B. animalis ssp. lactis BB-12. 
Extracted DNA was then subjected to the minisequencing protocol, resulting in a SNP profile matching the profile for the strain BB-12. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Jonsson, Frida; Burstedt, Marie S; Sandgren, Ola; Norberg, Anna; Golovleva, Irina
2013-01-01
This study aimed to identify genetic mechanisms underlying severe retinal degeneration in one large family from northern Sweden, members of which presented with early-onset autosomal recessive retinitis pigmentosa and juvenile macular dystrophy. The clinical records of affected family members were analysed retrospectively and ophthalmological and electrophysiological examinations were performed in selected cases. Mutation screening was initially performed with microarrays, interrogating known mutations in the genes associated with recessive retinitis pigmentosa, Leber congenital amaurosis and Stargardt disease. Searching for homozygous regions with putative causative disease genes was done by high-density SNP-array genotyping, followed by segregation analysis of the family members. Two distinct phenotypes of retinal dystrophy, Leber congenital amaurosis and Stargardt disease were present in the family. In the family, four patients with Leber congenital amaurosis were homozygous for a novel c.2557C>T (p.Q853X) mutation in the CRB1 gene, while of two cases with Stargardt disease, one was homozygous for c.5461-10T>C in the ABCA4 gene and another was carrier of the same mutation and a novel ABCA4 mutation c.4773+3A>G. Sequence analysis of the entire ABCA4 gene in patients with Stargardt disease revealed complex alleles with additional sequence variants, which were evaluated by bioinformatics tools. In conclusion, presence of different genetic mechanisms resulting in variable phenotype within the family is not rare and can challenge molecular geneticists, ophthalmologists and genetic counsellors. PMID:23443024
Shioda, Setsuko; Kasai, Fumio; Ozawa, Midori; Hirayama, Noriko; Satoh, Motonobu; Kameoka, Yousuke; Watanabe, Ken; Shimizu, Norio; Tang, Huamin; Mori, Yasuko; Kohara, Arihiro
2018-02-01
Human herpes virus 6 (HHV-6) is a common human pathogen that is most often detected in hematopoietic cells. Although human cells harboring chromosomally integrated HHV-6 can be generated in vitro, the availability of such cell lines originating from in vivo tissues is limited. In this study, chromosomally integrated HHV-6B has been identified in a human vascular endothelial cell line, HUV-EC-C (IFO50271), derived from normal umbilical cord tissue. Sequence analysis revealed that the viral genome was similar to the HHV-6B HST strain. FISH analysis using a HHV-6 DNA probe showed one signal in each cell, detected at the distal end of the long arm of chromosome 9. This was consistent with a digital PCR assay, validating one copy of the viral DNA. Because exposure of HUV-EC-C to chemicals did not cause viral reactivation, long term cell culture of HUV-EC-C was carried out to assess the stability of viral integration. The growth rate was altered depending on passage numbers, and morphology also changed during culture. SNP microarray profiles showed some differences between low and high passages, implying that the HUV-EC-C genome had changed during culture. However, no detectable change was observed in chromosome 9, where HHV-6B integration and the viral copy number remained unchanged. Our results suggest that integrated HHV-6B is stable in HUV-EC-C despite genome instability.
Goodman, Corey W.; Major, Heather J.; Walls, William D.; Sheffield, Val C.; Casavant, Thomas L.; Darbro, Benjamin W.
2016-01-01
Chromosomal microarrays (CMAs) are routinely used in both research and clinical laboratories; yet, little attention has been given to the estimation of genome-wide true and false negatives during the assessment of these assays and how such information could be used to calibrate various algorithmic metrics to improve performance. Low-throughput, locus-specific methods such as fluorescence in situ hybridization (FISH), quantitative PCR (qPCR), or multiplex ligation-dependent probe amplification (MLPA) preclude rigorous calibration of various metrics used by copy number variant (CNV) detection algorithms. To aid this task, we have established a comparative methodology, CNV-ROC, which is capable of performing a high throughput, low cost, analysis of CMAs that takes into consideration genome-wide true and false negatives. CNV-ROC uses a higher resolution microarray to confirm calls from a lower resolution microarray and provides for a true measure of genome-wide performance metrics at the resolution offered by microarray testing. CNV-ROC also provides for a very precise comparison of CNV calls between two microarray platforms without the need to establish an arbitrary degree of overlap. Comparison of CNVs across microarrays is done on a per-probe basis and receiver operator characteristic (ROC) analysis is used to calibrate algorithmic metrics, such as log2 ratio threshold, to enhance CNV calling performance. CNV-ROC addresses a critical and consistently overlooked aspect of analytical assessments of genome-wide techniques like CMAs which is the measurement and use of genome-wide true and false negative data for the calculation of performance metrics and comparison of CNV profiles between different microarray experiments. PMID:25595567
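The per-probe ROC calibration described above can be sketched as follows. This is a minimal illustration and not the CNV-ROC implementation: the function names, the use of absolute log2 ratio as the score, the choice of Youden's J (TPR minus FPR) to pick a cutoff, and all values are assumptions. Truth labels stand in for calls confirmed on the higher resolution array.

```python
def roc_points(scores, truth):
    """Return (threshold, TPR, FPR) using each distinct score as a cutoff."""
    positives = sum(truth)
    negatives = len(truth) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one true and one false probe")
    points = []
    for threshold in sorted(set(scores)):
        calls = [score >= threshold for score in scores]
        tp = sum(1 for c, t in zip(calls, truth) if c and t)
        fp = sum(1 for c, t in zip(calls, truth) if c and not t)
        points.append((threshold, tp / positives, fp / negatives))
    return points


def best_threshold(points):
    """Pick the cutoff maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    return max(points, key=lambda p: p[1] - p[2])[0]


# Hypothetical per-probe |log2 ratio| values from the lower resolution array,
# with truth taken from a higher resolution array (True = probe lies in a CNV).
abs_log2_ratio = [0.05, 0.10, 0.15, 0.40, 0.55, 0.60, 0.20, 0.70]
in_cnv = [False, False, False, True, True, True, False, True]

print(best_threshold(roc_points(abs_log2_ratio, in_cnv)))
```

Scoring every probe, rather than only the called segments, is what makes true and false negatives countable in this kind of comparison.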
Chae, Heejoon; Lee, Sangseon; Seo, Seokjun; Jung, Daekyoung; Chang, Hyeonsook; Nephew, Kenneth P; Kim, Sun
2016-12-01
Measuring gene expression, DNA sequence variation, and DNA methylation status is routinely done using high throughput sequencing technologies. To analyze such multi-omics data and explore relationships, reliable bioinformatics systems are much needed. Existing systems are either for exploring curated data or for processing omics data in the form of a library such as R. Thus scientists have much difficulty in investigating relationships among gene expression, DNA sequence variation, and DNA methylation using multi-omics data. In this study, we report a system called BioVLAB-mCpG-SNP-EXPRESS for the integrated analysis of DNA methylation, sequence variation (SNPs), and gene expression for distinguishing cellular phenotypes at the pairwise and multiple phenotype levels. The system can be deployed on either the Amazon cloud or a publicly available high-performance computing node, and the data analysis and exploration of the analysis result can be conveniently done using a web-based interface. To reduce analysis complexity, all processes are fully automated, and a graphical workflow system is integrated to show analysis progress in real time. The BioVLAB-mCpG-SNP-EXPRESS system works in three stages. First, it processes and analyzes multi-omics data provided as raw input, i.e., FastQ files. Second, various integrated analyses such as methylation vs. gene expression and mutation vs. methylation are performed. Finally, the analysis result can be explored in a number of ways through a web interface for multi-level, multi-perspective exploration. Multi-level interpretation can be done at the gene, gene set, pathway, or network level, and multi-perspective exploration can be carried out from the perspective of gene expression, DNA methylation, sequence variation, or their relationships. The utility of the system is demonstrated by analyzing a data set of 30 phenotypically distinct breast cancer cell lines. 
BioVLAB-mCpG-SNP-EXPRESS is available at http://biohealth.snu.ac.kr/software/biovlab_mcpg_snp_express/. Copyright © 2016 Elsevier Inc. All rights reserved.
Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies
Gimode, Davis; Odeny, Damaris A.; de Villiers, Etienne P.; Wanyonyi, Solomon; Dida, Mathews M.; Mneney, Emmarold E.; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M.
2016-01-01
Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. 
Our results also reveal an unexploited finger millet genetic resource that can be included in the regional breeding programs in order to efficiently optimize productivity. PMID:27454301
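The PIC values quoted above can be illustrated with a short calculation. The abstract does not state which PIC formula was used; the sketch below assumes the common Botstein-style definition, PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², and the function name `pic` is hypothetical. Under that definition a biallelic SNP cannot exceed 0.375, which is consistent with the SNP markers reporting lower mean PIC than the multi-allelic SSRs.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein-style definition; `allele_freqs` is a list of
    allele frequencies that sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term for uninformative matings between identical heterozygotes.
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


# A biallelic SNP with equal allele frequencies reaches the biallelic maximum.
print(round(pic([0.5, 0.5]), 3))
# A SNP with a rare minor allele is far less informative.
print(round(pic([0.9, 0.1]), 4))
```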
Delannoy, Sabine; Mariani-Kurkdjian, Patricia; Webb, Hattie E.; Bonacorsi, Stephane; Fach, Patrick
2017-01-01
Shiga toxin-producing Escherichia coli of serotype O26:H11/H- constitute a diverse group of strains and several clones with distinct genetic characteristics have been identified and characterized. Whole genome sequencing was performed using Illumina and PacBio technologies on eight stx2-positive O26:H11 strains circulating in France. Comparative analyses of the whole genome of the stx2-positive O26:H11 strains indicate that several clones of EHEC O26:H11 are co-circulating in France. Phylogenetic analysis of the French strains together with stx2-positive and stx-negative E. coli O26:H11 genomes obtained from Genbank indicates the existence of four clonal complexes (SNP-CCs) separated in two distinct lineages, one of which comprises the “new French clone” (SNP-CC1) that appears genetically closely related to stx-negative attaching and effacing E. coli (AEEC) strains. Interestingly, the whole genome SNP (wgSNP) phylogeny is summarized in the cas gene phylogeny, and a simple qPCR assay targeting the CRISPR array specific to SNP-CC1 (SP_O26-E) can distinguish between the two main lineages. The PacBio sequencing allowed a detailed analysis of the mobile genetic elements (MGEs) of the strains. Numerous MGEs were identified in each strain, including a large number of prophages and up to four large plasmids, representing overall 8.7–19.8% of the total genome size. Analysis of the prophage pool of the strains shows a considerable diversity with a complex history of recombination. Each clonal complex (SNP-CC) is characterized by a unique set of plasmids and phages, including stx-prophages, suggesting evolution through separate acquisition events. Overall, the MGEs appear to play a major role in O26:H11 intra-serotype clonal diversification. PMID:28932209
Microarray Data Analysis Using Multiple Statistical Models
Wenjun Bao1, Judith E. Schmid1, Amber K. Goetz1, Ming Ouyang2, William J. Welsh2, Andrew I. Brooks3,4, ChiYi Chu3, Mitsunori Ogihara3,4, Yinhe Cheng5, David J. Dix1. 1National Health and Environmental Effects Researc...
ERIC Educational Resources Information Center
Reiff, Marian; Giarelli, Ellen; Bernhardt, Barbara A.; Easley, Ebony; Spinner, Nancy B.; Sankar, Pamela L.; Mulchandani, Surabhi
2015-01-01
Clinical guidelines recommend chromosomal microarray analysis (CMA) for all children with autism spectrum disorders (ASDs). We explored the test's perceived usefulness among parents of children with ASD who had undergone CMA, and received a result categorized as pathogenic, variant of uncertain significance, or negative. Fifty-seven parents…
Oligonucleotide microarrays and other ‘omics’ approaches are powerful tools for unsupervised analysis of chemical impacts on biological systems. However, the lack of well annotated biological pathways for many aquatic organisms, including fish, and the poor power of microarray-b...
Bumm, Klaus; Zheng, Mingzhong; Bailey, Clyde; Zhan, Fenghuang; Chiriva-Internati, M; Eddlemon, Paul; Terry, Julian; Barlogie, Bart; Shaughnessy, John D
2002-02-01
Clinical GeneOrganizer (CGO) is a novel windows-based archiving, organization and data mining software for the integration of gene expression profiling in clinical medicine. The program implements various user-friendly tools and extracts data for further statistical analysis. This software was written for Affymetrix GeneChip *.txt files, but can also be used for any other microarray-derived data. The MS-SQL server version acts as a data mart and links microarray data with clinical parameters of any other existing database and therefore represents a valuable tool for combining gene expression analysis and clinical disease characteristics.
cluML: A markup language for clustering and cluster validity assessment of microarray data.
Bolshakova, Nadia; Cunningham, Pádraig
2005-01-01
cluML is a new markup language for microarray data clustering and cluster validity assessment. The XML-based format has been designed to address some of the limitations observed in traditional formats, such as inability to store multiple clustering (including biclustering) and validation results within a dataset. cluML is an effective tool to support biomedical knowledge representation in gene expression data analysis. Although cluML was developed for DNA microarray analysis applications, it can be effectively used for the representation of clustering and for the validation of other biomedical and physical data that has no limitations.
Kirby, Ralph; Herron, Paul; Hoskisson, Paul
2011-02-01
Based on available genome sequences, Actinomycetales show significant gene synteny across a wide range of species and genera. In addition, many genera show varying degrees of complex morphological development. Using the presence of gene synteny as a basis, it is clear that an analysis of gene conservation across the Streptomyces and various other Actinomycetales will provide information on both the importance of genes and gene clusters and the evolution of morphogenesis in these bacteria. Genome sequencing, although becoming cheaper, is still relatively expensive for comparing large numbers of strains. Thus, a heterologous DNA/DNA microarray hybridization dataset based on a Streptomyces coelicolor microarray allows a cheaper and greater depth of analysis of gene conservation. This study, using both bioinformatical and microarray approaches, was able to classify genes previously identified as involved in morphogenesis in Streptomyces into various subgroups in terms of conservation across species and genera. This will allow the targeting of genes for further study based on their importance at the species level and at higher evolutionary levels.
Cross species analysis of microarray expression data
Lu, Yong; Huggins, Peter; Bar-Joseph, Ziv
2009-01-01
Motivation: Many biological systems operate in a similar manner across a large number of species or conditions. Cross-species analysis of sequence and interaction data is often applied to determine the function of new genes. In contrast to these static measurements, microarrays measure the dynamic, condition-specific response of complex biological systems. The recent exponential growth in microarray expression datasets allows researchers to combine expression experiments from multiple species to identify genes that are not only conserved in sequence but also operate in a similar way in the different species studied. Results: In this review we discuss the computational and technical challenges associated with these studies, the approaches that have been developed to address these challenges and the advantages of cross-species analysis of microarray data. We show how successful application of these methods leads to insights that cannot be obtained when analyzing data from a single species. We also highlight current open problems and discuss possible ways to address them. Contact: zivbj@cs.cmu.edu PMID:19357096
Yamamura, Shohei; Yamada, Eriko; Kimura, Fukiko; Miyajima, Kumiko; Shigeto, Hajime
2017-10-21
A new single-cell microarray chip was designed and developed to separate and analyze single adherent and non-adherent cancer cells. The single-cell microarray chip is made of polystyrene with over 60,000 microchambers of 10 different size patterns (31-40 µm upper diameter, 11-20 µm lower diameter). A drop of suspension of adherent carcinoma (NCI-H1650) and non-adherent leukocyte (CCRF-CEM) cells was placed onto the chip, and single-cell occupancy of NCI-H1650 and CCRF-CEM was determined to be 79% and 84%, respectively. This was achieved by controlling the chip design and surface treatment. Analysis of protein expression in single NCI-H1650 and CCRF-CEM cells was performed on the single-cell microarray chip by multi-antibody staining. Additionally, with this system, we retrieved positive single cells from the microchambers by a micromanipulator. Thus, this system demonstrates the potential for easy and accurate separation and analysis of various types of single cells.
Multi-locus variable number tandem repeat analysis of 7th pandemic Vibrio cholerae
2012-01-01
Background Seven pandemics of cholera have been recorded since 1817, with the current and ongoing pandemic affecting almost every continent. Cholera remains endemic in developing countries and is still a significant public health issue. In this study we use multilocus variable number of tandem repeats (VNTRs) analysis (MLVA) to discriminate between isolates of the 7th pandemic clone of Vibrio cholerae. Results MLVA of six VNTRs selected from previously published data distinguished 66 V. cholerae isolates collected between 1961–1999 into 60 unique MLVA profiles. Only 4 MLVA profiles consisted of more than 2 isolates. The discriminatory power was 0.995. Phylogenetic analysis showed that, except for the closely related profiles, the relationships derived from MLVA profiles were in conflict with that inferred from Single Nucleotide Polymorphism (SNP) typing. The six SNP groups share consensus VNTR patterns and two SNP groups contained isolates which differed by only one VNTR locus. Conclusions MLVA is highly discriminatory in differentiating 7th pandemic V. cholerae isolates and MLVA data was most useful in resolving the genetic relationships among isolates within groups previously defined by SNPs. Thus MLVA is best used in conjunction with SNP typing in order to best determine the evolutionary relationships among the 7th pandemic V. cholerae isolates and for longer term epidemiological typing. PMID:22624829
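The "discriminatory power" reported above is typically computed as the Hunter-Gaston (Simpson-type) index, D = 1 − Σ n_j(n_j − 1) / (N(N − 1)), where n_j is the number of isolates sharing profile j and N is the total number of isolates. The abstract does not name the index, so treating it as Hunter-Gaston is an assumption; the function name and the toy two-locus profiles below are hypothetical and are not the study's data.

```python
from collections import Counter


def discriminatory_power(profiles):
    """Hunter-Gaston index: probability that two isolates drawn at random
    (without replacement) have different typing profiles."""
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(profiles)
    # Number of ordered isolate pairs that share a profile.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


# Toy data: each tuple holds repeat counts at two VNTR loci for one isolate.
toy_profiles = [(7, 4), (7, 4), (8, 4), (9, 5), (6, 3)]
print(discriminatory_power(toy_profiles))
```

A value near 1 means almost every pair of isolates is distinguished, which is why a typing scheme that resolves most isolates into unique profiles scores so highly.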
Evans, Daniel S.; Avery, Christy L.; Nalls, Mike A.; Li, Guo; Barnard, John; Smith, Erin N.; Tanaka, Toshiko; Butler, Anne M.; Buxbaum, Sarah G.; Alonso, Alvaro; Arking, Dan E.; Berenson, Gerald S.; Bis, Joshua C.; Buyske, Steven; Carty, Cara L.; Chen, Wei; Chung, Mina K.; Cummings, Steven R.; Deo, Rajat; Eaton, Charles B.; Fox, Ervin R.; Heckbert, Susan R.; Heiss, Gerardo; Hindorff, Lucia A.; Hsueh, Wen-Chi; Isaacs, Aaron; Jamshidi, Yalda; Kerr, Kathleen F.; Liu, Felix; Liu, Yongmei; Lohman, Kurt K.; Magnani, Jared W.; Maher, Joseph F.; Mehra, Reena; Meng, Yan A.; Musani, Solomon K.; Newton-Cheh, Christopher; North, Kari E.; Psaty, Bruce M.; Redline, Susan; Rotter, Jerome I.; Schnabel, Renate B.; Schork, Nicholas J.; Shohet, Ralph V.; Singleton, Andrew B.; Smith, Jonathan D.; Soliman, Elsayed Z.; Srinivasan, Sathanur R.; Taylor, Herman A.; Van Wagoner, David R.; Wilson, James G.; Young, Taylor; Zhang, Zhu-Ming; Zonderman, Alan B.; Evans, Michele K.; Ferrucci, Luigi; Murray, Sarah S.; Tranah, Gregory J.; Whitsel, Eric A.; Reiner, Alex P.; Sotoodehnia, Nona
2016-01-01
The electrocardiographic QRS duration, a measure of ventricular depolarization and conduction, is associated with cardiovascular mortality. While single nucleotide polymorphisms (SNPs) associated with QRS duration have been identified at 22 loci in populations of European descent, the genetic architecture of QRS duration in non-European populations is largely unknown. We therefore performed a genome-wide association study (GWAS) meta-analysis of QRS duration in 13,031 African Americans from ten cohorts and a transethnic GWAS meta-analysis with additional results from populations of European descent. In the African American GWAS, a single genome-wide significant SNP association was identified (rs3922844, P = 4 × 10^−14) in intron 16 of SCN5A, a voltage-gated cardiac sodium channel gene. The QRS-prolonging rs3922844 C allele was also associated with decreased SCN5A RNA expression in human atrial tissue (P = 1.1 × 10^−4). High density genotyping revealed that the SCN5A association region in African Americans was confined to intron 16. Transethnic GWAS meta-analysis identified novel SNP associations on chromosome 18 in MYL12A (rs1662342, P = 4.9 × 10^−8) and chromosome 1 near CD1E and SPTA1 (rs7547997, P = 7.9 × 10^−9). The 22 QRS loci previously identified in populations of European descent were enriched for significant SNP associations with QRS duration in African Americans (P = 9.9 × 10^−7), and index SNP associations in or near SCN5A, SCN10A, CDKN1A, NFIA, HAND1, TBX5 and SETBP1 replicated in African Americans. In summary, rs3922844 was associated with QRS duration and SCN5A expression, two novel QRS loci were identified using transethnic meta-analysis, and a significant proportion of QRS–SNP associations discovered in populations of European descent were transferable to African Americans when adequate power was achieved. PMID:27577874
Wu, Baolin
2006-02-15
Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and have proved useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using the L1 penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discuss the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the L1 penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
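The link between an L1 (lasso-type) penalty and shrunken centroids can be seen in the simplest case. Minimizing ½(x − b)² + λ|b| over b has the closed-form soft-thresholding solution b = sign(x)·max(|x| − λ, 0), which is the same operation the nearest shrunken centroid method applies to standardized class-centroid differences. The sketch below shows only that one-dimensional operator; it is not the penalized t/F-statistic from the paper, and the function name and the input values are hypothetical.

```python
def soft_threshold(x, lam):
    """Minimizer of 0.5 * (x - b)**2 + lam * abs(b) with respect to b."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    magnitude = max(abs(x) - lam, 0.0)
    if magnitude == 0.0:
        return 0.0  # small effects are shrunk exactly to zero
    return magnitude if x > 0 else -magnitude


# Hypothetical standardized centroid differences for four genes.
differences = [2.5, -0.4, -3.0, 0.9]
shrunken = [soft_threshold(d, 1.0) for d in differences]
print(shrunken)
```

Genes whose shrunken difference is exactly zero drop out of the classifier, which is how the penalty performs gene selection and limits over-fitting when p > n.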
Best practices for hybridization design in two-colour microarray analysis.
Knapen, Dries; Vergauwen, Lucia; Laukens, Kris; Blust, Ronny
2009-07-01
Two-colour microarrays are a popular platform of choice in gene expression studies. Because two different samples are hybridized on a single microarray, and several microarrays are usually needed in a given experiment, there are many possible ways to combine samples on different microarrays. The actual combination employed is commonly referred to as the 'hybridization design'. Different types of hybridization designs have been developed, all aimed at optimizing the experimental setup for the detection of differentially expressed genes while coping with technical noise. Here, we first provide an overview of the different classes of hybridization designs, discussing their advantages and limitations, and then we illustrate the current trends in the use of different hybridization design types in contemporary research.
Nakatochi, Masahiro; Ushida, Yasunori; Yasuda, Yoshinari; Yoshida, Yasuko; Kawai, Shun; Kato, Ryuji; Nakashima, Toru; Iwata, Masamitsu; Kuwatsuka, Yachiyo; Ando, Masahiko; Hamajima, Nobuyuki; Kondo, Takaaki; Oda, Hiroaki; Hayashi, Mutsuharu; Kato, Sawako; Yamaguchi, Makoto; Maruyama, Shoichi; Matsuo, Seiichi; Honda, Hiroyuki
2015-01-01
Although many single nucleotide polymorphisms (SNPs) have been identified to be associated with metabolic syndrome (MetS), there was only a slight improvement in the ability to predict future MetS by the simple addition of SNPs to clinical risk markers. To improve the ability to predict future MetS, combinational effects, such as SNP-SNP interaction, SNP-environment interaction, and SNP-clinical parameter (SNP × CP) interaction, should also be considered. We performed a case-control study to explore novel SNP × CP interactions as risk markers for MetS based on health check-up data of Japanese male employees. We selected 99 SNPs that were previously reported to be associated with MetS and components of MetS; subsequently, we genotyped these SNPs from 360 cases and 1983 control subjects. First, we performed logistic regression analyses to assess the association of each SNP with MetS. Of these SNPs, five SNPs were significantly associated with MetS (P < 0.05): LRP2 rs2544390, rs1800592 between UCP1 and TBC1D9, APOA5 rs662799, VWF rs7965413, and rs1411766 between MYO16 and IRS2. Furthermore, we performed multiple logistic regression analyses, including an SNP term, a CP term, and an SNP × CP interaction term for each CP and SNP that was significantly associated with MetS. We identified a novel SNP × CP interaction between rs7965413 and platelet count that was significantly associated with MetS [SNP term: odds ratio (OR) = 0.78, P = 0.004; SNP × CP interaction term: OR = 1.33, P = 0.001]. This association of the SNP × CP interaction with MetS remained nominally significant in multiple logistic regression analysis after adjustment for either the number of MetS components or MetS components excluding obesity. Our results reveal new insight into platelet count as a risk marker for MetS. PMID:25646961
Sulovari, Arvis; Li, Dawei
2014-07-19
Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires the same allele definition (nomenclature) and genome build among individual studies. Similarly, imputation, commonly used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in use of different genome builds and allele definitions. Incorrect assumptions of identical allele definition among combined GWAS lead to a large portion of discarded genotypes or incorrect association findings. There is no published tool that predicts and converts among all major allele definitions. In this study, we have developed a tool, GACT, which stands for Genome build and Allele definition Conversion Tool, that predicts and inter-converts between any of the common SNP allele definitions and between the major genome builds. In addition, we assessed several factors that may affect imputation quality, and our results indicated that inclusion of singletons in the reference had detrimental effects while ambiguous SNPs had no measurable effect. Unexpectedly, exclusion of genotypes with missing rate > 0.001 (40% of study SNPs) showed no significant decrease of imputation quality (even significantly higher when compared to the imputation with singletons in the reference), especially for rare SNPs. 
GACT is a new, powerful, and user-friendly tool with both command-line and interactive online versions that can accurately predict, and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of genotypes from SNP-arrays or deep-sequencing, particularly for data from the dbGaP and other public databases. http://www.uvm.edu/genomics/software/gact.
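One reason allele definitions disagree between studies is that the same SNP can be reported on either DNA strand. The sketch below illustrates that single issue: complementing alleles to move between strands, and flagging A/T and C/G SNPs, whose alleles look the same on both strands. It is not GACT's algorithm or interface; the function names are hypothetical, and reading the abstract's "ambiguous SNPs" as strand-ambiguous SNPs is an assumption.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_strand(alleles):
    """Report a pair of alleles on the opposite strand."""
    return tuple(COMPLEMENT[a] for a in alleles)


def is_strand_ambiguous(alleles):
    """True for A/T and C/G SNPs, where flipping only swaps the two alleles."""
    first, second = alleles
    return COMPLEMENT[first] == second


print(flip_strand(("A", "G")))          # alleles as seen on the other strand
print(is_strand_ambiguous(("A", "T")))  # strand cannot be inferred from alleles
print(is_strand_ambiguous(("A", "G")))  # strand can be inferred from alleles
```

For an unambiguous SNP, comparing the observed allele pair with a reference pair reveals whether a flip is needed; for A/T and C/G SNPs that comparison is uninformative, so such markers usually need allele frequencies or platform annotation to resolve.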
Jin, S J; Liu, M; Long, W J; Luo, X P
2016-12-02
Objective: To explore the clinical phenotypes and the genetic cause for a boy with unexplained growth retardation, nephrocalcinosis, auditory anomalies and multi-organ/system developmental disorders. Method: Routine G-banding and chromosome microarray analysis were applied to a child with unexplained growth retardation, nephrocalcinosis, auditory anomalies and multi-organ/system developmental disorders treated in the Department of Pediatrics of Tongji Hospital Affiliated to Tongji Medical College of Huazhong University of Science and Technology in September 2015 and his parents to conduct the chromosomal karyotype analysis and the whole genome scanning. Deleted genes were searched in the Decipher and NCBI databases, and their relationships with the clinical phenotypes were analyzed. Result: A six-month-old boy was referred to us because of unexplained growth retardation and feeding intolerance. The affected child presented with abnormal manifestations such as special face, umbilical hernia, growth retardation, hypothyroidism, congenital heart disease, right ear sensorineural deafness, hypercalcemia and nephrocalcinosis. The child's karyotype was 46,XY,16qh+, and his parents' karyotypes were normal. Chromosome microarray analysis revealed a 1,436 kb deletion on the 7q11.23(72701098_74136633) region of the child. This region included 23 protein-coding genes, which were reported to be corresponding to Williams-Beuren syndrome and its certain clinical phenotypes. His parents' results of chromosome microarray analysis were normal. Conclusion: A boy with characteristic manifestations of Williams-Beuren syndrome and rare nephrocalcinosis was diagnosed using chromosome microarray analysis. The deletion on the 7q11.23 might be related to the clinical phenotypes of Williams-Beuren syndrome, yet further studies are needed.
AFM 4.0: a toolbox for DNA microarray analysis
Breitkreutz, Bobby-Joe; Jorgensen, Paul; Breitkreutz, Ashton; Tyers, Mike
2001-01-01
We have developed a series of programs, collectively packaged as Array File Maker 4.0 (AFM), that manipulate and manage DNA microarray data. AFM 4.0 is simple to use, applicable to any organism or microarray, and operates within the familiar confines of Microsoft Excel. Given a database of expression ratios, AFM 4.0 generates input files for clustering, helps prepare colored figures and Venn diagrams, and can uncover aneuploidy in yeast microarray data. AFM 4.0 should be especially useful to laboratories that do not have access to specialized commercial or in-house software. PMID:11532221
Fully Automated Complementary DNA Microarray Segmentation using a Novel Fuzzy-based Algorithm.
Saberkari, Hamidreza; Bahrami, Sheyda; Shamsi, Mousa; Amoshahy, Mohammad Javad; Ghavifekr, Habib Badri; Sedaaghi, Mohammad Hossein
2015-01-01
The DNA microarray is a powerful approach for simultaneously studying the expression of thousands of genes in a single experiment. The average value of the fluorescent intensity can be calculated in a microarray experiment, and the calculated intensity values closely track the expression level of a particular gene. However, determining the appropriate position of every spot in microarray images is a major challenge, on which the accurate classification of normal and abnormal (cancer) cells depends. In this paper, first a preprocessing approach is performed to eliminate the noise and artifacts present in microarray cells using the nonlinear anisotropic diffusion filtering method. Then, the coordinate center of each spot is positioned utilizing the mathematical morphology operations. Finally, the position of each spot is exactly determined through applying a novel hybrid model based on principal component analysis and the spatial fuzzy c-means clustering (SFCM) algorithm. Using a Gaussian kernel in the SFCM algorithm improves the quality of complementary DNA microarray segmentation. The performance of the proposed algorithm has been evaluated on real microarray images, which are available in the Stanford Microarray Database. Results illustrate that the accuracy of microarray cell segmentation in the proposed algorithm reaches 100% and 98% for noiseless and noisy cells, respectively.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jackson, Paul J.; Hill, Karen K.
2009-11-09
The results outlined in this report provide the information needed to apply a SNP-based forensic analysis to diverse ricin preparations. The same methods could be useful in castor breeding programs that seek to reduce or eliminate ricin in oil-producing R. communis cultivars.
Li, X; Buitenhuis, A J; Lund, M S; Li, C; Sun, D; Zhang, Q; Poulsen, N A; Su, G
2015-11-01
The identification of causal genes or genomic regions associated with fatty acids (FA) will enhance our understanding of the pathways underlying FA synthesis and provide opportunities for changing milk fat composition through a genetic approach. The linkage disequilibrium between adjacent markers is highly consistent between the Chinese and Danish Holstein populations, such that a joint genome-wide association study (GWAS) can be performed. In this study, a joint GWAS was performed for 16 milk FA traits based on data of 784 Chinese and 371 Danish Holstein cows genotyped by a high-density bovine single nucleotide polymorphism (SNP) array. A total of 486,464 SNP markers on 29 bovine autosomes were used. Bonferroni corrections were applied to adjust the significance thresholds for multiple testing at the genome- and chromosome-wide levels. According to the analysis of either the Chinese or Danish data individually, the total numbers of overlapping SNP that were significant at the chromosome level were 94 for C14:1, 208 for the C14 index, and 1 for C18:0. Joint analysis using the combined data of the 2 populations detected greater numbers of significant SNP compared with either of the individual populations alone for 7 and 10 traits at the genome- and chromosome-wide significance levels, respectively. Greater numbers of significant SNP were detected for C18:0 and the C18 index in the Chinese population compared with the joint analysis. Sixty-five significant SNP across all traits had significantly different effects in the 2 populations. Ten FA were influenced by a quantitative trait locus (QTL) region including DGAT1. Both C14:1 and the C14 index were influenced by a QTL region including SCD1 in the combined population. Other QTL regions also showed significant associations with the studied FA. A large region (14.9-24.9 Mbp) in BTA26 significantly influenced C14:1 and the C14 index in both populations, most likely due to the SNP in SCD1. 
A QTL region (69.97-73.69 Mbp) on BTA9 showed a significantly different effect on C18:0 between the 2 populations. Detection of these important SNP and the corresponding QTL regions will be helpful for follow-up studies to identify causal mutations and their interaction with environments for milk FA in dairy cattle. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
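As a rough illustration of the Bonferroni adjustment mentioned in this abstract, the sketch below divides a nominal significance level by the number of tests. Only the total of 486,464 SNP comes from the abstract; the nominal level of 0.05 and the per-chromosome SNP count are assumptions made for the example, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Total SNP count reported in the abstract; alpha = 0.05 is an assumed nominal level.
GENOME_WIDE_SNPS = 486_464
genome_wide = bonferroni_threshold(0.05, GENOME_WIDE_SNPS)

# Hypothetical SNP count for one chromosome (not taken from the abstract).
chromosome_snps = 20_000
chromosome_wide = bonferroni_threshold(0.05, chromosome_snps)

print(f"genome-wide threshold: {genome_wide:.3e}")
print(f"chromosome-wide threshold: {chromosome_wide:.3e}")
```

Because a chromosome carries fewer markers than the whole genome, the chromosome-wide threshold is always the less stringent of the two, which is consistent with more traits reaching significance at that level.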
Large-scale analysis of gene expression using cDNA microarrays promises the
rapid detection of the mode of toxicity for drugs and other chemicals. cDNA
microarrays were used to examine chemically-induced alterations of gene
expression in HepG2 cells exposed to oxidative ...
Where statistics and molecular microarray experiments biology meet.
Kelmansky, Diana M
2013-01-01
This review chapter presents a statistical point of view on microarray experiments with the purpose of understanding the apparent contradictions that often appear in relation to their results. We give a brief introduction to molecular biology for nonspecialists. We describe microarray experiments from their construction and the biological principles the experiments rely on, to data acquisition and analysis. The roles of epidemiological approaches and sample size considerations are also discussed.
Grenville-Briggs, Laura J; Stansfield, Ian
2011-01-01
This report describes a linked series of Masters-level computer practical workshops. They comprise an advanced functional genomics investigation, based upon analysis of a microarray dataset probing yeast DNA damage responses. The workshops require the students to analyse highly complex transcriptomics datasets, and were designed to stimulate active learning through experience of current research methods in bioinformatics and functional genomics. They seek to closely mimic a realistic research environment, and require the students first to propose research hypotheses, then test those hypotheses using specific sections of the microarray dataset. The complexity of the microarray data provides students with the freedom to propose their own unique hypotheses, tested using appropriate sections of the microarray data. This research latitude was highly regarded by students and is a strength of this practical. In addition, the focus on DNA damage by radiation and mutagenic chemicals allows them to place their results in a human medical context, and successfully sparks broad interest in the subject material. In evaluation, 79% of students scored the practical workshops on a five-point scale as 4 or 5 (totally effective) for student learning. More broadly, the general use of microarray data as a "student research playground" is also discussed. Copyright © 2011 Wiley Periodicals, Inc.
Genome-wide Association Study of Obsessive-Compulsive Disorder
Stewart, S Evelyn; Yu, Dongmei; Scharf, Jeremiah M; Neale, Benjamin M; Fagerness, Jesen A; Mathews, Carol A; Arnold, Paul D; Evans, Patrick D; Gamazon, Eric R; Osiecki, Lisa; McGrath, Lauren; Haddad, Stephen; Crane, Jacquelyn; Hezel, Dianne; Illman, Cornelia; Mayerfeld, Catherine; Konkashbaev, Anuar; Liu, Chunyu; Pluzhnikov, Anna; Tikhomirov, Anna; Edlund, Christopher K; Rauch, Scott L; Moessner, Rainald; Falkai, Peter; Maier, Wolfgang; Ruhrmann, Stephan; Grabe, Hans-Jörgen; Lennertz, Leonard; Wagner, Michael; Bellodi, Laura; Cavallini, Maria Cristina; Richter, Margaret A; Cook, Edwin H; Kennedy, James L; Rosenberg, David; Stein, Dan J; Hemmings, Sian MJ; Lochner, Christine; Azzam, Amin; Chavira, Denise A; Fournier, Eduardo; Garrido, Helena; Sheppard, Brooke; Umaña, Paul; Murphy, Dennis L; Wendland, Jens R; Veenstra-VanderWeele, Jeremy; Denys, Damiaan; Blom, Rianne; Deforce, Dieter; Van Nieuwerburgh, Filip; Westenberg, Herman GM; Walitza, Susanne; Egberts, Karin; Renner, Tobias; Miguel, Euripedes Constantino; Cappi, Carolina; Hounie, Ana G; Conceição do Rosário, Maria; Sampaio, Aline S; Vallada, Homero; Nicolini, Humberto; Lanzagorta, Nuria; Camarena, Beatriz; Delorme, Richard; Leboyer, Marion; Pato, Carlos N; Pato, Michele T; Voyiaziakis, Emanuel; Heutink, Peter; Cath, Danielle C; Posthuma, Danielle; Smit, Jan H; Samuels, Jack; Bienvenu, O Joseph; Cullen, Bernadette; Fyer, Abby J; Grados, Marco A; Greenberg, Benjamin D; McCracken, James T; Riddle, Mark A; Wang, Ying; Coric, Vladimir; Leckman, James F; Bloch, Michael; Pittenger, Christopher; Eapen, Valsamma; Black, Donald W; Ophoff, Roel A; Strengman, Eric; Cusi, Daniele; Turiel, Maurizio; Frau, Francesca; Macciardi, Fabio; Gibbs, J Raphael; Cookson, Mark R; Singleton, Andrew; Hardy, John; Crenshaw, Andrew T; Parkin, Melissa A; Mirel, Daniel B; Conti, David V; Purcell, Shaun; Nestadt, Gerald; Hanna, Gregory L; Jenike, Michael A; Knowles, James A; Cox, Nancy; Pauls, David L
2014-01-01
Obsessive-compulsive disorder (OCD) is a common, debilitating neuropsychiatric illness with complex genetic etiology. The International OCD Foundation Genetics Collaborative (IOCDF-GC) is a multi-national collaboration established to discover the genetic variation predisposing to OCD. A set of individuals affected with DSM-IV OCD, a subset of their parents, and unselected controls, were genotyped with several different Illumina SNP microarrays. After extensive data cleaning, 1,465 cases, 5,557 ancestry-matched controls and 400 complete trios remained, with a common set of 469,410 autosomal and 9,657 X-chromosome SNPs. Ancestry-stratified case-control association analyses were conducted for three genetically-defined subpopulations and combined in two meta-analyses, with and without the trio-based analysis. In the case-control analysis, the lowest two p-values were located within DLGAP1 (p=2.49×10^-6 and p=3.44×10^-6), a member of the neuronal postsynaptic density complex. In the trio analysis, rs6131295, near BTBD3, exceeded the genome-wide significance threshold with a p-value=3.84×10^-8. However, when trios were meta-analyzed with the combined case-control samples, the p-value for this variant was 3.62×10^-5, losing genome-wide significance. Although no SNPs were identified to be associated with OCD at a genome-wide significant level in the combined trio-case-control sample, a significant enrichment of methylation-QTLs (p<0.001) and frontal lobe eQTLs (p=0.001) was observed within the top-ranked SNPs (p<0.01) from the trio-case-control analysis, suggesting these top signals may have a broad role in gene expression in the brain, and possibly in the etiology of OCD. PMID:22889921
Genome-wide association study with the risk of schizophrenia in a Korean population.
Kim, Lyoung Hyo; Park, Byung Lae; Cheong, Hyun Sub; Namgoong, Suhg; Kim, Ji On; Kim, Jeong-Hyun; Shin, Joong-Gon; Park, Chul Soo; Kim, Bong-Jo; Kim, Jae Won; Choi, Ihn-Geun; Hwang, Jaeuk; Shin, Hyoung Doo; Woo, Sung-Il
2016-03-01
Schizophrenia is regarded as a multifactorial and polygenic brain disorder that is attributed to different combinations of genetic and environmental risk factors. Recently, several genome-wide association studies (GWASs) of schizophrenia have identified numerous risk factors, but the replication results remain controversial and ambiguous. To identify schizophrenia susceptibility loci in the Korean population, we performed a GWAS using the Illumina HumanOmni1-Quad V1.0 Microarray. We genotyped 1,140,419 single nucleotide polymorphisms (SNPs) in 350 Korean schizophrenia patients and 700 control subjects, and approximately 620,001 autosomal SNPs passed our quality control. In the case-control analysis, rs9607195 A>G, located in an intergenic region 250 kb from the ISX gene, and rs12738007 A>G, located in an intron of the MECR gene, were the SNPs most strongly associated with the risk of schizophrenia (P = 6.2 × 10^-8, OR = 0.50 and P = 3.7 × 10^-7, OR = 2.39, respectively). In subsequent fine-mapping analysis, 6 SNPs of MECR were genotyped in 310 schizophrenia patients and 604 control subjects. The association of MECR rs12738007, a top-ranked SNP in the GWAS, was replicated (P = 1.5 × 10^-2, OR = 1.53 in fine-mapping analysis; P = 1.5 × 10^-6, OR = 1.90 in combined analysis). The identification of putative schizophrenia susceptibility loci could provide new insights into genetic factors related to schizophrenia and clues for the development of diagnostic strategies. © 2015 Wiley Periodicals, Inc.
Shih, P Betty; Manzi, Susan; Shaw, Penny; Kenney, Margaret; Kao, Amy H; Bontempo, Franklin; Barmada, M Michael; Kammerer, Candace; Kamboh, M Ilyas
2008-11-01
The gene coding for C-reactive protein (CRP) is located on chromosome 1q23.2, which falls within a linkage region thought to harbor a systemic lupus erythematosus (SLE) susceptibility gene. Recently, 2 single-nucleotide polymorphisms (SNP) in the CRP gene (+838, +2043) have been shown to be associated with CRP concentrations and/or SLE risk in a British family-based cohort. Our study was done to confirm the reported association in an independent population-based case-control cohort, and also to investigate the influence of 3 additional CRP tagSNP (-861, -390, +90) on SLE risk and serum CRP concentrations. DNA from 337 Caucasian women who met the American College of Rheumatology criteria for definite (n = 324) or probable (n = 13) SLE and 448 Caucasian healthy female controls was genotyped for 5 CRP tagSNP (-861, -390, +90, +838, +2043). Genotyping was performed using restriction fragment length polymorphism-polymerase chain reaction, pyrosequencing, or TaqMan assays. Serum CRP levels were measured using ELISA. Association studies were performed using the chi-squared distribution, Z-test, Fisher's exact test, and analysis of variance. Haplotype analysis was performed using EH software and the haplo.stats package in R 2.1.2. While none of the SNP were found to be associated with SLE risk individually, there was an association with the 5 SNP haplotypes (p < 0.001). Three SNP (-861, -390, +90) were found to significantly influence serum CRP level in SLE cases, both independently and as haplotypes. Our data suggest that unique haplotype combinations in the CRP gene may modify the risk of developing SLE and influence circulating CRP levels.
Xu, Jin; Lu, Zhigang; Xu, Mingming; Pan, Ling; Deng, Yi; Xie, Xiaohu; Liu, Huifen; Ding, Shixiong; Hurd, Yasmin L.; Pasternak, Gavril W.; Klein, Robert J.; Cartegni, Luca
2014-01-01
Single nucleotide polymorphisms (SNPs) in the OPRM1 gene have been associated with vulnerability to opioid dependence. The current study identifies an association of an intronic SNP (rs9479757) with the severity of heroin addiction among Han-Chinese male heroin addicts. Individual SNP analysis and haplotype-based analysis with additional SNPs in the OPRM1 locus showed that mild heroin addiction was associated with the AG genotype, whereas severe heroin addiction was associated with the GG genotype. In vitro studies such as electrophoretic mobility shift assay, minigene, siRNA, and antisense morpholino oligonucleotide studies have identified heterogeneous nuclear ribonucleoprotein H (hnRNPH) as the major binding partner for the G-containing SNP site. The G-to-A transition weakens hnRNPH binding and facilitates exon 2 skipping, leading to altered expressions of OPRM1 splice-variant mRNAs and hMOR-1 proteins. Similar changes in splicing and hMOR-1 proteins were observed in human postmortem prefrontal cortex with the AG genotype of this SNP when compared with the GG genotype. Interestingly, the altered splicing led to an increase in hMOR-1 protein levels despite decreased hMOR-1 mRNA levels, which is likely contributed by a concurrent increase in single transmembrane domain variants that have a chaperone-like function on MOR-1 protein stability. Our studies delineate the role of this SNP as a modifier of OPRM1 alternative splicing via hnRNPH interactions, and suggest a functional link between an SNP-containing splicing modifier and the severity of heroin addiction. PMID:25122903
Filliol, Ingrid; Motiwala, Alifiya S.; Cavatore, Magali; Qi, Weihong; Hazbón, Manzour Hernando; Bobadilla del Valle, Miriam; Fyfe, Janet; García-García, Lourdes; Rastogi, Nalin; Sola, Christophe; Zozio, Thierry; Guerrero, Marta Inírida; León, Clara Inés; Crabtree, Jonathan; Angiuoli, Sam; Eisenach, Kathleen D.; Durmaz, Riza; Joloba, Moses L.; Rendón, Adrian; Sifuentes-Osornio, José; Ponce de León, Alfredo; Cave, M. Donald; Fleischmann, Robert; Whittam, Thomas S.; Alland, David
2006-01-01
We analyzed a global collection of Mycobacterium tuberculosis strains using 212 single nucleotide polymorphism (SNP) markers. SNP nucleotide diversity was high (average across all SNPs, 0.19), and 96% of the SNP locus pairs were in complete linkage disequilibrium. Cluster analyses identified six deeply branching, phylogenetically distinct SNP cluster groups (SCGs) and five subgroups. The SCGs were strongly associated with the geographical origin of the M. tuberculosis samples and the birthplace of the human hosts. The most ancestral cluster (SCG-1) predominated in patients from the Indian subcontinent, while SCG-1 and another ancestral cluster (SCG-2) predominated in patients from East Asia, suggesting that M. tuberculosis first arose in the Indian subcontinent and spread worldwide through East Asia. Restricted SCG diversity and the prevalence of less ancestral SCGs in indigenous populations in Uganda and Mexico suggested a more recent introduction of M. tuberculosis into these regions. The East African Indian and Beijing spoligotypes were concordant with SCG-1 and SCG-2, respectively; X and Central Asian spoligotypes were also associated with one SCG or subgroup combination. Other clades had less consistent associations with SCGs. Mycobacterial interspersed repetitive unit (MIRU) analysis provided less robust phylogenetic information, and only 6 of the 12 MIRU microsatellite loci were highly differentiated between SCGs as measured by GST. Finally, an algorithm was devised to identify two minimal sets of either 45 or 6 SNPs that could be used in future investigations to enable global collaborations for studies on evolution, strain differentiation, and biological differences of M. tuberculosis. PMID:16385065
Familiality and SNP heritability of age at onset and episodicity in major depressive disorder.
Ferentinos, P; Koukounari, A; Power, R; Rivera, M; Uher, R; Craddock, N; Owen, M J; Korszun, A; Jones, L; Jones, I; Gill, M; Rice, J P; Ising, M; Maier, W; Mors, O; Rietschel, M; Preisig, M; Binder, E B; Aitchison, K J; Mendlewicz, J; Souery, D; Hauser, J; Henigsberg, N; Breen, G; Craig, I W; Farmer, A E; Müller-Myhsok, B; McGuffin, P; Lewis, C M
2015-07-01
Strategies to dissect phenotypic and genetic heterogeneity of major depressive disorder (MDD) have mainly relied on subphenotypes, such as age at onset (AAO) and recurrence/episodicity. Yet, evidence on whether these subphenotypes are familial or heritable is scarce. The aims of this study are to investigate the familiality of AAO and episode frequency in MDD and to assess the proportion of their variance explained by common single nucleotide polymorphisms (SNP heritability). For investigating familiality, we used 691 families with 2-5 full siblings with recurrent MDD from the DeNt study. We fitted (square root) AAO and episode count in a linear and a negative binomial mixed model, respectively, with family as random effect and adjusting for sex, age and center. The strength of familiality was assessed with intraclass correlation coefficients (ICC). For estimating SNP heritabilities, we used 3468 unrelated MDD cases from the RADIANT and GSK Munich studies. After similarly adjusting for covariates, derived residuals were used with the GREML method in GCTA (genome-wide complex trait analysis) software. Significant familial clustering was found for both AAO (ICC = 0.28) and episodicity (ICC = 0.07). We calculated from respective ICC estimates the maximal additive heritability of AAO (0.56) and episodicity (0.15). SNP heritability of AAO was 0.17 (p = 0.04); analysis was underpowered for calculating SNP heritability of episodicity. AAO and episodicity aggregate in families to a moderate and small degree, respectively. AAO is under stronger additive genetic control than episodicity. Larger samples are needed to calculate the SNP heritability of episodicity. The described statistical framework could be useful in future analyses.
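The step from ICC to "maximal additive heritability" in this abstract can be sketched as follows, assuming the standard full-sibling relation in which siblings share on average half of the additive genetic variance, so that the upper bound is twice the ICC. That the authors used exactly this relation is an assumption, and the function name is hypothetical.

```python
def max_additive_heritability(sibling_icc: float) -> float:
    """Upper bound on additive heritability implied by a full-sibling ICC.

    If all familial resemblance were additive genetic, the sibling
    correlation would equal h2 / 2, hence h2 <= 2 * ICC (capped at 1.0).
    """
    if not 0.0 <= sibling_icc <= 1.0:
        raise ValueError("ICC must lie in [0, 1]")
    return min(1.0, 2.0 * sibling_icc)


# ICC values as rounded in the abstract.
aao = max_additive_heritability(0.28)
episodicity = max_additive_heritability(0.07)

print(f"AAO upper bound: {aao:.2f}")
print(f"episodicity upper bound: {episodicity:.2f}")
```

With the rounded ICC values this gives 0.56 for AAO, matching the abstract, and 0.14 for episodicity; the reported 0.15 presumably reflects an unrounded ICC.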
Tumor Touch Imprints as Source for Whole Genome Analysis of Neuroblastoma Tumors
Brunner, Clemens; Brunner-Herglotz, Bettina; Ziegler, Andrea; Frech, Christian; Amann, Gabriele; Ladenstein, Ruth; Ambros, Inge M.; Ambros, Peter F.
2016-01-01
Introduction: Tumor touch imprints (TTIs) are routinely used for the molecular diagnosis of neuroblastomas by interphase fluorescence in-situ hybridization (I-FISH). However, in order to facilitate a comprehensive, up-to-date molecular diagnosis of neuroblastomas and to identify new markers to refine risk and therapy stratification methods, whole genome approaches are needed. We examined the applicability of an ultra-high density SNP array platform that identifies copy number changes of varying sizes down to a few exons for the detection of genomic changes in tumor DNA extracted from TTIs. Material and Methods: DNAs were extracted from TTIs of 46 neuroblastoma and 4 other pediatric tumors. The DNAs were analyzed on the Cytoscan HD SNP array platform to evaluate numerical and structural genomic aberrations. The quality of the data obtained from TTIs was compared to that from randomly chosen fresh or fresh frozen solid tumors (n = 212) and I-FISH validation was performed. Results: SNP array profiles were obtained from 48 (out of 50) TTI DNAs of which 47 showed genomic aberrations. The high marker density allowed for single gene analysis, e.g. loss of nine exons in the ATRX gene and the visualization of chromothripsis. Data quality was comparable to fresh or fresh frozen tumor SNP profiles. SNP array results were confirmed by I-FISH. Conclusion: TTIs are an excellent source for SNP array processing with the advantage of simple handling, distribution and storage of tumor tissue on glass slides. The minimal amount of tumor tissue needed to analyze whole genomes makes TTIs an economic surrogate source in the molecular diagnostic work up of tumor samples. PMID:27560999
Kongchum, Pawapol; Palti, Yniv; Hallerman, Eric M; Hulata, Gideon; David, Lior
2010-08-01
Single nucleotide polymorphisms (SNPs) in immune response genes have been reported as markers for susceptibility to infectious diseases in human and livestock. A disease caused by cyprinid herpesvirus 3 (CyHV-3) is highly contagious and virulent in common carp (Cyprinus carpio). With the aim to develop molecular tools for breeding CyHV-3-resistant carp, we have amplified and sequenced 11 candidate genes for viral disease resistance including TLR2, TLR3, TLR4ba, TLR7, TLR9, TLR21, TLR22, MyD88, TRAF6, type I IFN and IL-1beta. For each gene, we initially cloned and sequenced PCR amplicons from 8 to 12 fish (2-3 fish per strain) from the SNP discovery panel. We then identified and evaluated putative SNPs for their polymorphisms in the SNP discovery panel and validated their usefulness for linkage analysis in a full-sib family using the SNaPshot method. Our sequencing results and phylogenetic analyses suggested that TLR3, TLR7 and MyD88 genes are duplicated in the common carp genome. We, therefore, developed locus-specific PCR primers and SNP genotyping assays for the duplicated loci. A total of 48 SNP markers were developed from PCR fragments of the 13 loci (7 single-locus and 3 duplicated genes). Thirty-nine markers were polymorphic with estimated minor allele frequencies of more than 0.1. The utility of the SNP markers was evaluated in one full-sib family and revealed that 20 markers from 9 loci segregated in a disomic and Mendelian pattern and would be useful for linkage analysis. Published by Elsevier Ltd.
Association of single-nucleotide polymorphisms of the tau gene with late-onset Parkinson disease.
Martin, E R; Scott, W K; Nance, M A; Watts, R L; Hubble, J P; Koller, W C; Lyons, K; Pahwa, R; Stern, M B; Colcher, A; Hiner, B C; Jankovic, J; Ondo, W G; Allen, F H; Goetz, C G; Small, G W; Masterman, D; Mastaglia, F; Laing, N G; Stajich, J M; Ribble, R C; Booze, M W; Rogala, A; Hauser, M A; Zhang, F; Gibson, R A; Middleton, L T; Roses, A D; Haines, J L; Scott, B L; Pericak-Vance, M A; Vance, J M
2001-11-14
The human tau gene, which promotes assembly of neuronal microtubules, has been associated with several rare neurologic diseases that clinically include parkinsonian features. We recently observed linkage in idiopathic Parkinson disease (PD) to a region on chromosome 17q21 that contains the tau gene. These factors make tau a good candidate for investigation as a susceptibility gene for idiopathic PD, the most common form of the disease. To investigate whether the tau gene is involved in idiopathic PD. Among a sample of 1056 individuals from 235 families selected from 13 clinical centers in the United States and Australia and from a family ascertainment core center, we tested 5 single-nucleotide polymorphisms (SNPs) within the tau gene for association with PD, using family-based tests of association. Both affected (n = 426) and unaffected (n = 579) family members were included; 51 individuals had unclear PD status. Analyses were conducted to test individual SNPs and SNP haplotypes within the tau gene. Family-based tests of association, calculated using asymptotic distributions. Analysis of association between the SNPs and PD yielded significant evidence of association for 3 of the 5 SNPs tested: SNP 3, P =.03; SNP 9i, P =.04; and SNP 11, P =.04. The 2 other SNPs did not show evidence of significant association (SNP 9ii, P =.11, and SNP 9iii, P =.87). Strong evidence of association was found with haplotype analysis, with a positive association with one haplotype (P =.009) and a negative association with another haplotype (P =.007). Substantial linkage disequilibrium (P<.001) was detected between 4 of the 5 SNPs (SNPs 3, 9i, 9ii, and 11). This integrated approach of genetic linkage and positional association analyses implicates tau as a susceptibility gene for idiopathic PD.
Loya Méndez, Yolanda; Reyes Leal, Gilberto; Sánchez González, Adriana; Portillo Reyes, Verónica; Reyes Ruvalcaba, David; Bojórquez Rangel, Guillermo
2014-09-28
Type 2 diabetes mellitus (DM) is a common pathology with a multifactorial etiology whose exact genetic basis remains unknown. Some studies suggest that single nucleotide polymorphisms (SNPs) in the CAPN10 gene (locus 2q37.3) could be associated with the development of this disease, including the insertion/deletion polymorphism SNP-19 (2R→3R). The present study determined the association between SNP-19 and the risk of developing type 2 DM in the population of Ciudad Juarez. For this study 107 participants were selected: 43 type 2 diabetics (cases) and 64 nondiabetics with no family history of type 2 DM in first-degree relatives (controls). Anthropometric measurements were taken, and biochemical profiles of lipids, lipoproteins and serum glucose were obtained. Genotyping of SNP-19 was performed using DNA from peripheral blood lymphocytes, polymerase chain reaction (PCR), and electrophoretic analysis in agarose gels. Once the genotypic and allelic frequencies were obtained, the Hardy-Weinberg equilibrium test (GenAlEx 6.4) was also performed. The χ² analysis identified genotypic differences between cases and controls, with a higher frequency of the homozygous 3R genotype of SNP-19 in the case group (0.418) than in the control group (0.265). An association was also observed between the 2R/3R genotype and elevated weight, body mass index, and waist and hip circumferences, but only in the diabetic group (P ≤ 0.05). The findings of this study suggest that SNP-19 in CAPN10 may participate in the development of type 2 DM in the studied population. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.
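The Hardy-Weinberg equilibrium test named in this abstract can be illustrated with a minimal chi-square goodness-of-fit sketch for a biallelic marker such as SNP-19 (alleles 2R and 3R). The genotype counts below are hypothetical, since the abstract reports only frequencies, and the sketch is a generic textbook test rather than the procedure implemented in GenAlEx.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Returns (frequency of allele A, chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Hypothetical counts of 2R/2R, 2R/3R and 3R/3R genotypes (not from the study).
in_equilibrium = hwe_chi_square(25, 50, 25)
out_of_equilibrium = hwe_chi_square(30, 20, 50)

print(in_equilibrium)
print(out_of_equilibrium)
```

A large statistic with a small p-value, as in the second hypothetical sample, would indicate a departure from the genotype proportions expected under random mating.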
SNP discovery in the bovine milk transcriptome using RNA-Seq technology.
Cánovas, Angela; Rincon, Gonzalo; Islas-Trejo, Alma; Wickramasinghe, Saumya; Medrano, Juan F
2010-12-01
High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. However, it also is an efficient way to discover coding SNPs. The objective of this study was to perform a SNP discovery analysis in the milk transcriptome using RNA-Seq. Seven milk samples from Holstein cows were analyzed by sequencing cDNAs using the Illumina Genome Analyzer system. We detected 19,175 genes expressed in milk samples corresponding to approximately 70% of the total number of genes analyzed. The SNP detection analysis revealed 100,734 SNPs in Holstein samples, and a large number of those corresponded to differences between the Holstein breed and the Hereford bovine genome assembly Btau4.0. The number of polymorphic SNPs within Holstein cows was 33,045. The accuracy of RNA-Seq SNP discovery was tested by comparing SNPs detected in a set of 42 candidate genes expressed in milk that had been resequenced earlier using Sanger sequencing technology. Seventy of 86 SNPs were detected using both RNA-Seq and Sanger sequencing technologies. The KASPar Genotyping System was used to validate unique SNPs found by RNA-Seq but not observed by Sanger technology. Our results confirm that analyzing the transcriptome using RNA-Seq technology is an efficient and cost-effective method to identify SNPs in transcribed regions. This study creates guidelines to maximize the accuracy of SNP discovery and prevention of false-positive SNP detection, and provides more than 33,000 SNPs located in coding regions of genes expressed during lactation that can be used to develop genotyping platforms to perform marker-trait association studies in Holstein cattle.
Brodsky, Leonid; Leontovich, Andrei; Shtutman, Michael; Feinstein, Elena
2004-01-01
Mathematical methods of analysis of microarray hybridizations deal with gene expression profiles as elementary units. However, some of these profiles do not reflect a biologically relevant transcriptional response, but rather stem from technical artifacts. Here, we describe two technically independent but rationally interconnected methods for identification of such artifactual profiles. Our diagnostics are based on detection of deviations from uniformity, which is assumed as the main underlying principle of microarray design. Method 1 is based on detection of non-uniformity of microarray distribution of printed genes that are clustered based on the similarity of their expression profiles. Method 2 is based on evaluation of the presence of gene-specific microarray spots within the slides’ areas characterized by an abnormal concentration of low/high differential expression values, which we define as ‘patterns of differentials’. Applying two novel algorithms, for nested clustering (method 1) and for pattern detection (method 2), we can make a dual estimation of the profile’s quality for almost every printed gene. Genes with artifactual profiles detected by method 1 may then be removed from further analysis. Suspicious differential expression values detected by method 2 may be either removed or weighted according to the probabilities of patterns that cover them, thus diminishing their input in any further data analysis. PMID:14999086
Brunner, C; Hoffmann, K; Thiele, T; Schedler, U; Jehle, H; Resch-Genger, U
2015-04-01
Commercial platforms consisting of ready-to-use microarrays printed with target-specific DNA probes, a microarray scanner, and software for data analysis are available for different applications in medical diagnostics and food analysis, detecting, e.g., viral and bacteriological DNA sequences. The transfer of these tools from basic research to routine analysis, their broad acceptance in regulated areas, and their use in medical practice require suitable calibration tools for regular control of instrument performance in addition to internal assay controls. Here, we present the development of a novel assay-adapted calibration slide for a commercialized DNA-based assay platform, consisting of precisely arranged fluorescent areas of various intensities obtained by incorporating different concentrations of a "green" dye and a "red" dye in a polymer matrix. These dyes are "Cy3" and "Cy5" analogues with improved photostability, chosen because their spectroscopic properties closely match those of common labels for the green and red channels of microarray scanners. This simple tool allows efficient and regular assessment and control of the performance of the microarray scanner provided with the biochip platform, as well as comparison of different scanners. It will eventually be used as a fluorescence intensity scale for referencing assay results and to enhance the overall comparability of diagnostic tests.
Stekel, Dov J.; Sarti, Donatella; Trevino, Victor; Zhang, Lihong; Salmon, Mike; Buckley, Chris D.; Stevens, Mark; Pallen, Mark J.; Penn, Charles; Falciani, Francesco
2005-01-01
A key step in the analysis of microarray data is the selection of genes that are differentially expressed. Ideally, such experiments should be properly replicated in order to infer both technical and biological variability, and the data should be subjected to rigorous hypothesis tests to identify the differentially expressed genes. However, in microarray experiments involving the analysis of very large numbers of biological samples, replication is not always practical. Therefore, there is a need for a method to select differentially expressed genes in a rational way from insufficiently replicated data. In this paper, we describe a simple method that uses bootstrapping to generate an error model from a replicated pilot study that can be used to identify differentially expressed genes in subsequent large-scale studies on the same platform, but in which there may be no replicated arrays. The method builds a stratified error model that includes array-to-array variability, feature-to-feature variability and the dependence of error on signal intensity. We apply this model to the characterization of the host response in a model of bacterial infection of human intestinal epithelial cells. We demonstrate the effectiveness of error model based microarray experiments and propose this as a general strategy for a microarray-based screening of large collections of biological samples. PMID:15800204
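The idea of bootstrapping an intensity-dependent error model from a replicated pilot study can be sketched roughly as below. This is a simplified illustration, not the authors' stratified model: it assumes the pilot data are pairs of (mean log intensity, difference between replicate log ratios), splits them into equal-sized intensity strata, and bootstraps a high quantile of the absolute replicate difference in each stratum as a noise threshold. All names, the number of strata, the quantile and the synthetic data are assumptions.

```python
import random


def bootstrap_thresholds(pilot, n_bins=3, n_boot=200, quantile=0.95, seed=0):
    """Return [(upper_intensity_edge, noise_threshold), ...] per intensity stratum."""
    if len(pilot) < n_bins:
        raise ValueError("need at least one pilot measurement per stratum")
    rng = random.Random(seed)
    ordered = sorted(pilot)  # sort by intensity
    size = len(ordered) // n_bins
    strata = []
    for b in range(n_bins):
        start = b * size
        end = len(ordered) if b == n_bins - 1 else start + size
        chunk = ordered[start:end]
        diffs = [abs(d) for _, d in chunk]
        estimates = []
        for _ in range(n_boot):
            # Resample replicate differences with replacement.
            resample = sorted(rng.choices(diffs, k=len(diffs)))
            estimates.append(resample[int(quantile * (len(resample) - 1))])
        strata.append((chunk[-1][0], sum(estimates) / n_boot))
    return strata


def is_differential(intensity, log_ratio, strata):
    """Flag a feature whose log ratio exceeds the noise threshold of its stratum."""
    for upper_edge, threshold in strata:
        if intensity <= upper_edge:
            return abs(log_ratio) > threshold
    return abs(log_ratio) > strata[-1][1]


# Synthetic pilot data: replicate noise shrinks as signal intensity grows.
noise = random.Random(1)
pilot = []
for i in range(300):
    intensity = 6.0 + 10.0 * i / 299
    sd = 1.0 - 0.08 * (intensity - 6.0)
    pilot.append((intensity, noise.gauss(0.0, sd)))

strata = bootstrap_thresholds(pilot)
for upper_edge, threshold in strata:
    print(f"intensity <= {upper_edge:.2f}: |log ratio| > {threshold:.2f}")
```

Once such thresholds exist, a feature on a later unreplicated array can be screened with `is_differential(intensity, log_ratio, strata)`, so that low-intensity features need a larger change to be called than high-intensity ones.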
Schmidt-Lebuhn, Alexander N; Aitken, Nicola C; Chuah, Aaron
2017-11-01
Datasets of hundreds or thousands of SNPs (Single Nucleotide Polymorphisms) from multiple individuals per species are increasingly used to study population structure, species delimitation and shallow phylogenetics. The principal software tool to infer species or population trees from SNP data is currently the BEAST template SNAPP which uses a Bayesian coalescent analysis. However, it is computationally extremely demanding and tolerates only small amounts of missing data. We used simulated and empirical SNPs from plants (Australian Craspedia, Asteraceae, and Pelargonium, Geraniaceae) to compare species trees produced (1) by SNAPP, (2) using SVD quartets, and (3) using Bayesian and parsimony analysis with several different approaches to summarising data from multiple samples into one set of traits per species. Our aims were to explore the impact of tree topology and missing data on the results, and to test which data summarising and analyses approaches would best approximate the results obtained from SNAPP for empirical data. SVD quartets retrieved the correct topology from simulated data, as did SNAPP except in the case of a very unbalanced phylogeny. Both methods failed to retrieve the correct topology when large amounts of data were missing. Bayesian analysis of species level summary data scoring the two alleles of each SNP as independent characters and parsimony analysis of data scoring each SNP as one character produced trees with branch length distributions closest to the true trees on which SNPs were simulated. For empirical data, Bayesian inference and Dollo parsimony analysis of data scored allele-wise produced phylogenies most congruent with the results of SNAPP. 
In the case of study groups divergent enough for missing data to be phylogenetically informative (because of additional mutations preventing amplification of genomic fragments or bioinformatic establishment of homology), scoring of SNP data as a presence/absence matrix irrespective of allele content might be an additional option. As this depends on sampling across species being reasonably even and a random distribution of non-informative instances of missing data, however, further exploration of this approach is needed. Properly chosen data summary approaches to inferring species trees from SNP data may represent a potential alternative to currently available individual-level coalescent analyses especially for quick data exploration and when dealing with computationally demanding or patchy datasets. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison
Sioson, Allan A; Mane, Shrinivasrao P; Li, Pinghua; Sha, Wei; Heath, Lenwood S; Bohnert, Hans J; Grene, Ruth
2006-01-01
Background Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. Results The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. Conclusion The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. The results of Expresso using MIV data have the best agreement with qRT-PCR results. 
In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity. PMID:16626497
ERIC Educational Resources Information Center
McGrew, Susan G.; Peters, Brittany R.; Crittendon, Julie A.; Veenstra-VanderWeele, Jeremy
2012-01-01
Genetic testing is recommended for patients with ASD; however, specific recommendations vary by specialty. American Academy of Pediatrics and American Academy of Neurology guidelines recommend G-banded karyotype and Fragile X DNA testing. The American College of Medical Genetics recommends Chromosomal Microarray Analysis (CMA). We determined the yield of…
ERIC Educational Resources Information Center
Grenville-Briggs, Laura J.; Stansfield, Ian
2011-01-01
This report describes a linked series of Masters-level computer practical workshops. They comprise an advanced functional genomics investigation, based upon analysis of a microarray dataset probing yeast DNA damage responses. The workshops require the students to analyse highly complex transcriptomics datasets, and were designed to stimulate…
The observation of transcriptional changes following embryonic ethanol exposure may provide significant insights into the biological response to ethanol exposure. In this study, we used microarray analysis to examine the transcriptional response of the developing limb to a dose ...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ovacik, Meric A.; Sen, Banalata; Euling, Susan Y.
Pathway activity level analysis, the approach pursued in this study, focuses on all genes that are known to be members of metabolic and signaling pathways as defined by the KEGG database. The pathway activity level analysis entails singular value decomposition (SVD) of the expression data of the genes constituting a given pathway. We explore an extension of the pathway activity methodology for application to time-course microarray data. We show that pathway analysis enhances our ability to detect biologically relevant changes in pathway activity using synthetic data. As a case study, we apply the pathway activity level formulation coupled with significance analysis to microarray data from two different rat testes exposed in utero to Dibutyl Phthalate (DBP). In utero DBP exposure in the rat results in developmental toxicity of a number of male reproductive organs, including the testes. One well-characterized mode of action for DBP and the male reproductive developmental effects is the repression of expression of genes involved in cholesterol transport, steroid biosynthesis and testosterone synthesis that lead to a decreased fetal testicular testosterone. Previous analyses of DBP testes microarray data focused on either individual gene expression changes or changes in the expression of specific genes that are hypothesized, or known, to be important in testicular development and testosterone synthesis. However, a pathway analysis may inform whether there are additional affected pathways that could inform additional modes of action linked to DBP developmental toxicity. We show that pathway activity analysis may be considered for a more comprehensive analysis of microarray data.
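A pathway activity level of the kind described above is commonly taken to be the leading right singular vector of the standardized gene-by-sample matrix for one pathway. The sketch below assumes that definition and, to stay dependency-free, obtains the vector by power iteration on the sample-by-sample Gram matrix instead of calling an SVD routine. The function name and iteration count are hypothetical, and the sign of the returned vector is arbitrary, as with any singular vector.

```python
import math

def pathway_activity(expr, n_iter=200):
    """expr: one row per pathway gene, each row a list of values across samples.

    Returns (activity per sample, fraction of total variance captured).
    Assumes at least one gene varies across samples.
    """
    # Standardize each gene to zero mean and unit variance across samples.
    rows = []
    for g in expr:
        m = sum(g) / len(g)
        sd = math.sqrt(sum((x - m) ** 2 for x in g) / len(g)) or 1.0
        rows.append([(x - m) / sd for x in g])
    n = len(rows[0])
    # Gram matrix X^T X; its top eigenvector is the first right singular vector of X.
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    # Non-constant start vector, because constants lie in the null space after centering.
    v = [1.0 / math.sqrt(n) + 0.01 * i for i in range(n)]
    for _ in range(n_iter):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    top = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n)) for i in range(n))
    total = sum(gram[i][i] for i in range(n))
    return v, top / total
```

For time-course data the samples would be ordered time points, so the returned vector traces how the pathway's dominant expression pattern changes over time.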
SNP ID-info: SNP ID searching and visualization platform.
Yang, Cheng-Hong; Chuang, Li-Yeh; Cheng, Yu-Huei; Wen, Cheng-Hao; Chang, Phei-Lang; Chang, Hsueh-Wei
2008-09-01
Many association studies report relationships between single nucleotide polymorphisms (SNPs) and diseases, including cancers, without giving an SNP ID. Here, we developed the SNP ID-info freeware to provide SNP IDs from input genetic and physical information about genomes. The program provides an "SNP-ePCR" function to generate the full sequence from primer and template inputs. In "SNPosition," the sequence from SNP-ePCR or from direct input is matched against SNP fasta sequences to find the SNP IDs. The "SNP search" and "SNP fasta" functions accept the cytogenetic band, contig position, and keywords as input. Finally, the neighboring environment of the SNP IDs for the inputs is fully visualized in order of contig position and marked with SNP and flanking hits. The SNP identification problems inherent in NCBI SNP BLAST are also avoided. In conclusion, SNP ID-info provides a visualized SNP ID environment for multiple inputs and assists systematic SNP association studies. The server and user manual are available at http://bio.kuas.edu.tw/snpid-info.
Explaining the disease phenotype of intergenic SNP through predicted long range regulation.
Chen, Jingqi; Tian, Weidong
2016-10-14
Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or close to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it is likely that IGR daSNPs contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Liu, Yanfang; Liao, Huidan; Liu, Ying; Guo, Juanjuan; Sun, Yi; Fu, Xiaoliang; Xiao, Ding; Cai, Jifeng; Lan, Lingmei; Xie, Pingli; Zha, Lagabaiyila
2017-04-01
Nonbinary single-nucleotide polymorphisms (SNPs) are potential forensic genetic markers because their discrimination power is greater than that of normal binary SNPs and because they can be used to analyze highly degraded samples. We previously developed a nonbinary SNP multiplex typing assay. In this study, we selected an additional 20 nonbinary SNPs from the NCBI SNP database and verified them through pyrosequencing. These 20 nonbinary SNPs were analyzed using the fluorescent-labeled SNaPshot multiplex SNP typing method. The allele frequencies and genetic parameters of these 20 nonbinary SNPs were determined among 314 unrelated individuals from Han populations in China. The total power of discrimination was 0.9999999999994, and the cumulative probability of exclusion was 0.9986. Moreover, combining this 20 nonbinary SNP assay with the 20 nonbinary SNP assay we previously developed gave a cumulative probability of exclusion of 0.999991 for the 40 nonbinary SNPs, and no significant linkage disequilibrium was observed among the 40 nonbinary SNPs. Thus, we concluded that this new system consisting of 20 new nonbinary SNPs could provide highly informative polymorphic data for forensic application and would serve as a potentially valuable supplement to forensic DNA analysis. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
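Cumulative figures such as a total power of discrimination or a cumulative probability of exclusion are conventionally obtained by combining per-locus values under the assumption that loci are independent (no linkage disequilibrium). A minimal sketch of that standard calculation follows; the function names and the example frequencies are hypothetical and are not taken from the study.

```python
import math

def power_of_discrimination(genotype_freqs):
    """Per-locus PD: one minus the sum of squared genotype frequencies."""
    return 1.0 - sum(f * f for f in genotype_freqs)

def combine_independent(per_locus):
    """Combine per-locus PD or PE values, assuming independent loci."""
    return 1.0 - math.prod(1.0 - p for p in per_locus)
```

For example, two independent loci that each exclude with probability 0.5 give a cumulative probability of exclusion of 0.75, which illustrates why adding a second 20-marker panel raises the combined figure.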
SNP by SNP by environment interaction network of alcoholism.
Zollanvari, Amin; Alterovitz, Gil
2017-03-14
Alcoholism has a strong genetic component. Twin studies have demonstrated the heritability of a large proportion of phenotypic variance of alcoholism ranging from 50-80%. The search for genetic variants associated with this complex behavior has epitomized sequence-based studies for nearly a decade. The limited success of genome-wide association studies (GWAS), possibly precipitated by the polygenic nature of complex traits and behaviors, however, has demonstrated the need for novel, multivariate models capable of quantitatively capturing interactions between a host of genetic variants and their association with non-genetic factors. In this regard, capturing the network of SNP by SNP or SNP by environment interactions has recently gained much interest. Here, we assessed 3,776 individuals to construct a network capable of detecting and quantifying the interactions within and between plausible genetic and environmental factors of alcoholism. In this regard, we propose the use of first-order dependence tree of maximum weight as a potential statistical learning technique to delineate the pattern of dependencies underpinning such a complex trait. Using a predictive based analysis, we further rank the genes, demographic factors, biological pathways, and the interactions represented by our SNP × SNP × E network. The proposed framework is quite general and can be potentially applied to the study of other complex traits.
Goodman, Corey W; Major, Heather J; Walls, William D; Sheffield, Val C; Casavant, Thomas L; Darbro, Benjamin W
2015-04-01
Chromosomal microarrays (CMAs) are routinely used in both research and clinical laboratories; yet, little attention has been given to the estimation of genome-wide true and false negatives during the assessment of these assays and how such information could be used to calibrate various algorithmic metrics to improve performance. Low-throughput, locus-specific methods such as fluorescence in situ hybridization (FISH), quantitative PCR (qPCR), or multiplex ligation-dependent probe amplification (MLPA) preclude rigorous calibration of various metrics used by copy number variant (CNV) detection algorithms. To aid this task, we have established a comparative methodology, CNV-ROC, which is capable of performing a high throughput, low cost, analysis of CMAs that takes into consideration genome-wide true and false negatives. CNV-ROC uses a higher resolution microarray to confirm calls from a lower resolution microarray and provides for a true measure of genome-wide performance metrics at the resolution offered by microarray testing. CNV-ROC also provides for a very precise comparison of CNV calls between two microarray platforms without the need to establish an arbitrary degree of overlap. Comparison of CNVs across microarrays is done on a per-probe basis and receiver operator characteristic (ROC) analysis is used to calibrate algorithmic metrics, such as log2 ratio threshold, to enhance CNV calling performance. CNV-ROC addresses a critical and consistently overlooked aspect of analytical assessments of genome-wide techniques like CMAs which is the measurement and use of genome-wide true and false negative data for the calculation of performance metrics and comparison of CNV profiles between different microarray experiments. Copyright © 2015 Elsevier Inc. All rights reserved.
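The per-probe ROC calibration described above can be illustrated with a small sketch. It assumes each probe on the lower-resolution array has an absolute log2 ratio and a true/false label derived from the higher-resolution array, and it picks the threshold by Youden's J statistic, which is a choice made for this example and not necessarily the criterion CNV-ROC uses. All names are hypothetical.

```python
def roc_points(scores, truth):
    """scores: per-probe |log2 ratio|; truth: per-probe bool from the reference array.

    Returns a list of (threshold, true_positive_rate, false_positive_rate).
    Assumes both classes are present in truth.
    """
    pos = sum(truth)
    neg = len(truth) - pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, t in zip(scores, truth) if s >= thr and t)
        fp = sum(1 for s, t in zip(scores, truth) if s >= thr and not t)
        points.append((thr, tp / pos, fp / neg))
    return points

def best_threshold(points):
    """Threshold maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    return max(points, key=lambda p: p[1] - p[2])[0]
```

Because every probe contributes either a positive or a negative label, the false positive rate here is computed over genome-wide true negatives, which is the quantity locus-specific confirmation methods cannot supply.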
Distinct contributions of replication and transcription to mutation rate variation of human genomes.
Cui, Peng; Ding, Feng; Lin, Qiang; Zhang, Lingfang; Li, Ang; Zhang, Zhang; Hu, Songnian; Yu, Jun
2012-02-01
Here, we evaluate the contribution of two major biological processes--DNA replication and transcription--to mutation rate variation in human genomes. Based on analysis of public human tissue transcriptomics data, a high-resolution replication map of HeLa cells and dbSNP data, we present significant correlations between expression breadth, replication time in local regions and SNP density. SNP density of tissue-specific (TS) genes is significantly higher than that of housekeeping (HK) genes. TS genes tend to be located in late-replicating genomic regions, and genes in such regions have a higher SNP density than those in early-replicating regions. In addition, SNP density is found to be positively correlated with expression level among HK genes. We conclude that the process of DNA replication generates stronger mutational pressure than transcription-associated biological processes do, resulting in an increase of mutation rate in TS genes while having weaker effects on HK genes. In contrast, transcription-associated processes are mainly responsible for the accumulation of mutations in highly-expressed HK genes. Copyright © 2012 Beijing Genomics Institute. Published by Elsevier Ltd. All rights reserved.
Arenillas, Leonor; Mallo, Mar; Ramos, Fernando; Guinta, Kathryn; Barragán, Eva; Lumbreras, Eva; Larráyoz, María-José; De Paz, Raquel; Tormo, Mar; Abáigar, María; Pedro, Carme; Cervera, José; Such, Esperanza; José Calasanz, María; Díez-Campelo, María; Sanz, Guillermo F; Hernández, Jesús María; Luño, Elisa; Saumell, Sílvia; Maciejewski, Jaroslaw; Florensa, Lourdes; Solé, Francesc
2013-12-01
Cytogenetic aberrations identified by metaphase cytogenetics (MC) have diagnostic, prognostic, and therapeutic implications in myelodysplastic syndromes (MDS). However, in some MDS patients the MC study is unsuccessful. Single nucleotide polymorphism array (SNP-A) based karyotyping could be helpful in these cases. We performed SNP-A in 62 samples from bone marrow or peripheral blood (PB) of primary MDS with an unsuccessful MC study. SNP-A analysis enabled the detection of aberrations in 31 (50%) patients. We used the copy number alteration (CNA) information to apply the International Prognostic Scoring System (IPSS) and we observed differences in survival between the low/intermediate-1 and intermediate-2/high risk patients. We also saw differences in survival between the very low/low/intermediate and the high/very high patients when we applied the revised IPSS (IPSS-R). In conclusion, SNP-A can be used successfully in PB samples, and the identification of CNA by SNP-A improves the diagnostic and prognostic evaluation of this group of MDS patients. Copyright © 2013 Wiley Periodicals, Inc.
High density genetic mapping identifies new susceptibility loci for rheumatoid arthritis
Eyre, Steve; Bowes, John; Diogo, Dorothée; Lee, Annette; Barton, Anne; Martin, Paul; Zhernakova, Alexandra; Stahl, Eli; Viatte, Sebastien; McAllister, Kate; Amos, Christopher I.; Padyukov, Leonid; Toes, Rene E.M.; Huizinga, Tom W.J.; Wijmenga, Cisca; Trynka, Gosia; Franke, Lude; Westra, Harm-Jan; Alfredsson, Lars; Hu, Xinli; Sandor, Cynthia; de Bakker, Paul I.W.; Davila, Sonia; Khor, Chiea Chuen; Heng, Khai Koon; Andrews, Robert; Edkins, Sarah; Hunt, Sarah E; Langford, Cordelia; Symmons, Deborah; Concannon, Pat; Onengut-Gumuscu, Suna; Rich, Stephen S; Deloukas, Panos; Gonzalez-Gay, Miguel A.; Rodriguez-Rodriguez, Luis; Ärlsetig, Lisbeth; Martin, Javier; Rantapää-Dahlqvist, Solbritt; Plenge, Robert; Raychaudhuri, Soumya; Klareskog, Lars; Gregersen, Peter K; Worthington, Jane
2012-01-01
Summary Using the Immunochip custom single nucleotide polymorphism (SNP) array, designed for dense genotyping of 186 genome-wide association study (GWAS) confirmed loci, we analysed 11,475 rheumatoid arthritis cases of European ancestry and 15,870 controls for 129,464 markers. The data were combined in meta-analysis with GWAS data from additional independent cases (n=2,363) and controls (n=17,872). We identified fourteen novel loci; nine were associated with rheumatoid arthritis overall and five specifically in anti-citrullinated peptide antibody positive disease, bringing the number of confirmed European ancestry rheumatoid arthritis loci to 46. We refined the peak of association to a single gene for 19 loci, identified secondary independent effects at six loci and association to low frequency variants (minor allele frequency <0.05) at four loci. Bioinformatic analysis of the data generated strong hypotheses for the causal SNP at seven loci. This study illustrates the advantages of dense SNP mapping analysis to inform subsequent functional investigations. PMID:23143596
R classes and methods for SNP array data.
Scharpf, Robert B; Ruczinski, Ingo
2010-01-01
The Bioconductor project is an "open source and open development software project for the analysis and comprehension of genomic data" (1), primarily based on the R programming language. Infrastructure packages, such as Biobase, are maintained by Bioconductor core developers and serve several key roles to the broader community of Bioconductor software developers and users. In particular, Biobase introduces an S4 class, the eSet, for high-dimensional assay data. Encapsulating the assay data as well as meta-data on the samples, features, and experiment in the eSet class definition ensures propagation of the relevant sample and feature meta-data throughout an analysis. Extending the eSet class promotes code reuse through inheritance as well as interoperability with other R packages and is less error-prone. Recently proposed class definitions for high-throughput SNP arrays extend the eSet class. This chapter highlights the advantages of adopting and extending Biobase class definitions through a working example of one implementation of classes for the analysis of high-throughput SNP arrays.
Mining microarrays for metabolic meaning: nutritional regulation of hypothalamic gene expression.
Mobbs, Charles V; Yen, Kelvin; Mastaitis, Jason; Nguyen, Ha; Watson, Elizabeth; Wurmbach, Elisa; Sealfon, Stuart C; Brooks, Andrew; Salton, Stephen R J
2004-06-01
DNA microarray analysis has been used to investigate relative changes in the level of gene expression in the CNS, including changes that are associated with disease, injury, psychiatric disorders, drug exposure or withdrawal, and memory formation. We have used oligonucleotide microarrays to identify hypothalamic genes that respond to nutritional manipulation. In addition to commonly used microarray analysis based on criteria such as fold-regulation, we have also found that simply carrying out multiple t tests then sorting by P value constitutes a highly reliable method to detect true regulation, as assessed by real-time polymerase chain reaction (PCR), even for relatively low abundance genes or relatively low magnitude of regulation. Such analyses directly suggested novel mechanisms that mediate effects of nutritional state on neuroendocrine function and are being used to identify regulated gene products that may elucidate the metabolic pathology of obese ob/ob, lean Vgf-/Vgf-, and other models with profound metabolic impairments.
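The "multiple t tests, then sort by P value" procedure mentioned above can be sketched without a statistics library. The example below computes a pooled-variance Student's t statistic per gene and ranks by its absolute value; when every gene has the same group sizes, and therefore the same degrees of freedom, this ordering is identical to ordering by two-sided P value. The function names and data layout are hypothetical, and the sketch applies no multiple-testing correction.

```python
import math
import statistics

def t_statistic(a, b):
    """Pooled-variance two-sample t statistic (assumes nonzero pooled variance)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

def rank_genes(data):
    """data: {gene: (control_values, treated_values)}; most significant first."""
    return sorted(data, key=lambda g: -abs(t_statistic(*data[g])))
```

Ranking this way favors genes whose change is consistent across replicates, which is how a low-magnitude but reproducible change can outrank a large but noisy fold change.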
Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.
Schena, M; Shalon, D; Heller, R; Chai, A; Brown, P O; Davis, R W
1996-01-01
Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery. PMID:8855227
Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas
2016-09-19
Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.
Psifidi, Androniki; Dovas, Chrysostomos; Banos, Georgios
2011-01-19
Single nucleotide polymorphisms (SNP) have proven to be powerful genetic markers for genetic applications in medicine, life science and agriculture. A variety of methods exist for SNP detection but few can quantify SNP frequencies when the mutated DNA molecules correspond to a small fraction of the wild-type DNA. Furthermore, there is no generally accepted gold standard for SNP quantification, and, in general, currently applied methods give inconsistent results in selected cohorts. In the present study we sought to develop a novel method for accurate detection and quantification of SNP in DNA pooled samples. The development and evaluation of a novel Ligase Chain Reaction (LCR) protocol that uses a DNA-specific fluorescent dye to allow quantitative real-time analysis is described. Different reaction components and thermocycling parameters affecting the efficiency and specificity of LCR were examined. Several protocols, including gap-LCR modifications, were evaluated using plasmid standard and genomic DNA pools. A protocol of choice was identified and applied for the quantification of a polymorphism at codon 136 of the ovine PRNP gene that is associated with susceptibility to a transmissible spongiform encephalopathy in sheep. The real-time LCR protocol developed in the present study showed high sensitivity, accuracy, reproducibility and a wide dynamic range of SNP quantification in different DNA pools. The limits of detection and quantification of SNP frequencies were 0.085% and 0.35%, respectively. The proposed real-time LCR protocol is applicable when sensitive detection and accurate quantification of low copy number mutations in DNA pools is needed. Examples include oncogenes and tumour suppressor genes, infectious diseases, pathogenic bacteria, fungal species, viral mutants, drug resistance resulting from point mutations, and genetically modified organisms in food.
Psifidi, Androniki; Dovas, Chrysostomos; Banos, Georgios
2011-01-01
Background Single nucleotide polymorphisms (SNP) have proven to be powerful genetic markers for genetic applications in medicine, life science and agriculture. A variety of methods exist for SNP detection but few can quantify SNP frequencies when the mutated DNA molecules correspond to a small fraction of the wild-type DNA. Furthermore, there is no generally accepted gold standard for SNP quantification, and, in general, currently applied methods give inconsistent results in selected cohorts. In the present study we sought to develop a novel method for accurate detection and quantification of SNP in DNA pooled samples. Methods The development and evaluation of a novel Ligase Chain Reaction (LCR) protocol that uses a DNA-specific fluorescent dye to allow quantitative real-time analysis is described. Different reaction components and thermocycling parameters affecting the efficiency and specificity of LCR were examined. Several protocols, including gap-LCR modifications, were evaluated using plasmid standard and genomic DNA pools. A protocol of choice was identified and applied for the quantification of a polymorphism at codon 136 of the ovine PRNP gene that is associated with susceptibility to a transmissible spongiform encephalopathy in sheep. Conclusions The real-time LCR protocol developed in the present study showed high sensitivity, accuracy, reproducibility and a wide dynamic range of SNP quantification in different DNA pools. The limits of detection and quantification of SNP frequencies were 0.085% and 0.35%, respectively. Significance The proposed real-time LCR protocol is applicable when sensitive detection and accurate quantification of low copy number mutations in DNA pools is needed. Examples include oncogenes and tumour suppressor genes, infectious diseases, pathogenic bacteria, fungal species, viral mutants, drug resistance resulting from point mutations, and genetically modified organisms in food. PMID:21283808
Zeng, Guohui; Teng, Yaoshu; Zhu, Jin; Zhu, Darong; Yang, Bin; Hu, Linping; Chen, Manman; Fu, Xiao
2018-01-01
The objective of the present study was to investigate the clinical application of magnetic resonance imaging (MRI)-respiratory gating technology for assessing illness severity in children with obstructive sleep apnea hypopnea syndrome (OSAHS). MRI-respiratory gating technology was used to scan the nasopharyngeal cavities of 51 children diagnosed with OSAHS during 6 respiratory phases. Correlations between the ratio of the area of the adenoid to the area of the nasopalatine pharyngeal cavity (Sa/Snp), with the main indexes of polysomnography (PSG), were analyzed. Receiver operator characteristic (ROC) curve and Kappa analysis were used to determine the diagnostic accuracy of Sa/Snp in pediatric OSAHS. The Sa/Snp was positively correlated with the apnea hypopnea index (AHI) (P < .001) and negatively correlated with the lowest oxygen saturation of blood during sleep (LaSO2) (P < .001). ROC analysis in the 6 respiratory phases showed that the area under the curve (AUC) of the Sa/Snp in the end-expiratory phase was the largest (0.992, P < .001), providing a threshold of 69.5% for the diagnosis of severe versus slight-moderate OSAHS in children. Consistency analysis with the AHI showed a diagnosis accordance rate of 96.0% in severe pediatric OSAHS and 96.2% in slight-moderate pediatric OSAHS (Kappa = 0.922, P < .001). Stenosis of the nasopalatine pharyngeal cavity in children with adenoidal hypertrophy was greatest at the end-expiration phase during sleep. The end-expiratory Sa/Snp obtained by a combination of MRI and respiratory gating technology has potential as an important imaging index for diagnosing and evaluating severity in pediatric OSAHS.
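The Kappa statistic used in the consistency analysis measures agreement between two classifications beyond what chance would produce. A minimal sketch of Cohen's kappa for two label lists follows; the function name and the example labels are hypothetical and do not reproduce the study's data.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences.

    Assumes chance agreement is below 1 (the raters do not both use a single label).
    """
    n = len(a)
    labels = set(a) | set(b)
    # Observed agreement: fraction of items given the same label by both methods.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if the two methods labelled independently.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)
```

In the setting above, one list would hold the severity class assigned from the Sa/Snp threshold and the other the class assigned from the AHI; a value near 1 indicates near-complete agreement.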
Rajkumar, Thangarajan; Samson, Mani; Rama, Ranganathan; Sridevi, Veluswami; Mahji, Urmila; Swaminathan, Rajaraman; Nancy, Nirmala K
2008-11-01
Breast cancer incidence has been increasing in south Indian women. A case (n=250)-control (n=500) study was undertaken to investigate the role of single nucleotide polymorphisms (SNPs) in GSTM1 (Present/Null), GSTP1 (Ile105Val), p53 (Arg72Pro), TGFbeta1 (Leu10Pro), c-erbB2 (Ile655Val), and GSTT1 (Null/Present) in breast cancer. In addition, the value of the SNPs in predicting the primary tumour's pathologic response following neo-adjuvant chemo-radiotherapy was assessed. Genotyping was done using PCR (GSTM1, GSTT1), Taqman allelic discrimination assay (GSTP1, c-erbB2) and PCR-CTPP (p53 and TGFbeta1). None of the SNPs studied was individually associated with a statistically significant increased risk of breast cancer. However, combined analysis of the SNPs showed that p53 (Arg/Arg and Arg/Pro) with TGFbeta1 (Pro/Pro and Leu/Pro) was associated with a greater than 2-fold increased risk of breast cancer in univariate (P=0.01) and multivariate (P=0.003) analysis. There was no statistically significant association of the GST family members with breast cancer risk. The TGFbeta1 (Pro/Pro) genotype was found to predict complete pathologic response (pCR) in the primary tumour following neo-adjuvant chemo-radiotherapy (OR=6.53 and 10.53 in univariate and multivariate analysis respectively) (P=0.004) and was independent of stage. This study suggests that SNPs can help predict breast cancer risk in south Indian women and that the TGFbeta1 (Pro/Pro) genotype is associated with a better pCR in the primary tumour.
Harker, Mark; Carvell, Ann-Marie; Marti, Vernon P J; Riazanskaia, Svetlana; Kelso, Hailey; Taylor, David; Grimshaw, Sally; Arnold, David S; Zillmer, Ruediger; Shaw, Jane; Kirk, Jayne M; Alcasid, Zee M; Gonzales-Tanon, Sheila; Chan, Gertrude P; Rosing, Egge A E; Smith, Adrian M
2014-01-01
A single nucleotide polymorphism (SNP), 538G→A, leading to a G180R substitution in the ABCC11 gene results in reduced concentrations of apocrine derived axillary odour precursors. Determine the axillary odour levels in the SNP ABCC11 genotype variants and to investigate if other parameters associated with odour production are affected. Axillary odour was assessed by subjective quantification and gas chromatography headspace analysis. Metabolite profiles, microbiome diversity and personal hygiene habits were also assessed. Axillary odour in the A/A homozygotes was significantly lower compared to the G/A and G/G genotypes. However, the perception-based measures still detected appreciable levels of axillary odour in the A/A subjects. Metabolomic analysis highlighted significant differences in axillary skin metabolites between A/A subjects compared to those carrying the G allele. These differences resulted in A/A subjects lacking specific volatile odourants in the axillary headspace, but all genotypes produced odoriferous short chain fatty acids. Microbiomic analysis revealed differences in the relative abundance of key bacterial genera associated with odour generation between the different genotypes. Deodorant usage indicated a high level of self awareness of axillary odour levels with A/A individuals less likely to adopt personal hygiene habits designed to eradicate/mask its presence. The SNP in the ABCC11 gene results in lower levels of axillary odour in the A/A homozygotes compared to those carrying the G allele, but A/A subjects still produce noticeable amounts of axillary odour. Differences in axillary skin metabolites, bacterial genera and personal hygiene behaviours also appear to be influenced by this SNP. Copyright © 2013. Published by Elsevier Ireland Ltd.
Zhou, Hongfei; Diao, Mengyuan; Zhang, Mingyue
2016-08-01
The associations between ANXA11 gene polymorphisms and susceptibility to sarcoidosis have been evaluated in recent years. However, the results remain controversial, especially across ethnicities. To assess the associations between ANXA11 and sarcoidosis, we conducted this meta-analysis. Articles were searched in MEDLINE, EMBASE and PubMed from their establishment date to August of 2014, and 4,567 sarcoidosis patients and 4,278 controls from 6 studies were included. The strength of associations was determined by odds ratios (ORs) with 95% confidence intervals (CIs). The associations between the ANXA11 SNPs rs1049550, rs2573346 and rs2789679 and sarcoidosis risk were assessed using additive, recessive and dominant models. The T allele of ANXA11 SNPs rs2573346 and rs2789679 conferred protection against sarcoidosis (OR: 0.664, 95% CI: 0.607-0.726 for rs2573346, and OR: 0.698, 95% CI: 0.640-0.762 for rs2789679). For SNP rs1049550, individuals carrying the "T" allele (TT+CT) had a nearly 46% increased risk for the development of sarcoidosis, when compared with CC homozygotes (OR: 1.461, 95% CI: 1.183-1.803) in the overall population. A significant association was also found in the additive model (OR: 1.477, 95% CI: 1.328-1.642 for CC vs. CT; OR: 0.610, 95% CI: 0.412-0.905 for TT vs. CC). In addition, ethnicity factors may contribute to the disease risk. The meta-analysis revealed that the "T" allele of ANXA11 SNPs rs2573346 and rs2789679 conferred protection against sarcoidosis. The "C" allele of SNP rs1049550 may be a risk factor for sarcoidosis in the overall population. Our study shows that ANXA11 is closely associated with the development of sarcoidosis, but further studies in different ethnicities are needed.
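The odds ratios and 95% confidence intervals quoted above follow the usual log-scale construction. The sketch below is a minimal illustration of that arithmetic, not the authors' pooling code: the function name and the 2x2 counts are hypothetical. The final lines check that a reported interval is symmetric on the log scale, as a Wald-type interval should be.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table.

    a, b = carriers / non-carriers among cases
    c, d = carriers / non-carriers among controls
    (hypothetical layout chosen for this sketch)
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(120, 80, 90, 110)

# Consistency check on a reported interval: the point estimate should sit
# at the geometric mean of the CI limits (values taken from the abstract).
geometric_mid = math.sqrt(1.183 * 1.803)
```

For the rs1049550 dominant-model result, the geometric mean of the limits is close to the reported OR of 1.461, which is what a log-scale interval predicts.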
Renneville, Aline; Abdelali, Raouf Ben; Chevret, Sylvie; Nibourel, Olivier; Cheok, Meyling; Pautas, Cécile; Duléry, Rémy; Boyer, Thomas; Cayuela, Jean-Michel; Hayette, Sandrine; Raffoux, Emmanuel; Farhat, Hassan; Boissel, Nicolas; Terre, Christine; Dombret, Hervé; Castaigne, Sylvie; Preudhomme, Claude
2014-02-28
We recently showed that the addition of fractionated doses of gemtuzumab ozogamicin (GO) to standard chemotherapy improves clinical outcome of acute myeloid leukemia (AML) patients. In the present study, we performed mutational analysis of 11 genes (FLT3, NPM1, CEBPA, MLL, WT1, IDH1/2, RUNX1, ASXL1, TET2, DNMT3A), EVI1 overexpression screening, and 6.0 single-nucleotide polymorphism array (SNP-A) analysis in diagnostic samples of the 278 AML patients enrolled in the ALFA-0701 trial. In cytogenetically normal (CN) AML (n = 146), 38% of the patients had at least 1 SNP-A lesion and 89% of the patients had at least 1 molecular alteration. In multivariate analysis, the independent predictors of higher cumulative incidence of relapse were unfavorable karyotype (P = 0.013) and randomization in the control arm (P = 0.007) in the whole cohort, and MLL partial tandem duplications (P = 0.014) and DNMT3A mutations (P = 0.010) in CN-AML. The independent predictors of shorter overall survival (OS) were unfavorable karyotype (P < 0.001) and SNP-A lesion(s) (P = 0.001) in the whole cohort, and SNP-A lesion(s) (P = 0.006), DNMT3A mutations (P = 0.042) and randomization in the control arm (P = 0.043) in CN-AML. Interestingly, CN-AML patients benefited preferentially more from GO treatment as compared to AML patients with abnormal cytogenetics (hazard ratio for death, 0.52 versus 1.14; test for interaction, P = 0.04). Although the interaction test was not statistically significant, the OS benefit associated with GO treatment appeared also more pronounced in FLT3 internal tandem duplication positive than in negative patients. PMID:24659740
Shahid, Muhammad Qasim; Çiftçi, Vahdettin; E. Sáenz de Miera, Luis; Aasim, Muhammad; Nadeem, Muhammad Azhar; Aktaş, Husnu; Özkan, Hakan; Hatipoğlu, Rüştü
2017-01-01
Until now, little attention has been paid to the geographic distribution and evaluation of genetic diversity of durum wheat from the Central Fertile Crescent (modern-day Turkey and Syria). Turkey and Syria are considered primary centers of wheat diversity, and thousands of locally adapted wheat landraces are still present in farmers' small fields. We planned this study to evaluate the genetic diversity of durum wheat landraces from the Central Fertile Crescent by genotyping based on DArTseq and SNP analysis. A total of 39,568 DArTseq and 20,661 SNP markers were used to characterize the genetic characteristics of 91 durum wheat landraces. Clustering based on neighbor-joining analysis, principal coordinate analysis and the Bayesian model implemented in STRUCTURE clearly showed that the grouping pattern is not associated with the geographical distribution of the durum wheat, due to the mixing of the Turkish and Syrian landraces. A significant correlation between DArTseq and SNP markers was observed in the Mantel test. However, we detected a non-significant relationship between geographical coordinates and DArTseq (r = -0.085) and SNP (r = -0.039) loci. These results showed that unconscious farmer selection and lack of commercial varieties might have resulted in the exchange of genetic material, and this was apparent in the genetic structure of durum wheat in Turkey and Syria. The genomic characterization presented here is an essential step towards a future exploitation of the available durum wheat genetic resources in genomic and breeding programs. The results of this study also provide clear insight into the genetic diversity of wheat accessions from the Central Fertile Crescent. PMID:28099442
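The Mantel test mentioned above correlates two distance matrices (for example genetic and geographic) and judges significance by permuting the rows and columns of one of them. The sketch below is a minimal pure-Python version under that general definition. The function names and the toy matrices are hypothetical, and the permutation count is an arbitrary choice, not the study's setting.

```python
import math
import random

def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p) for a two-sided permutation Mantel test."""
    rng = random.Random(seed)
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    r_obs = pearson(flat_a, upper_triangle(dist_b))
    order = list(range(n))
    extreme = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of dist_b together.
        permuted = [[dist_b[i][j] for j in order] for i in order]
        if abs(pearson(flat_a, upper_triangle(permuted))) >= abs(r_obs):
            extreme += 1
    return r_obs, (extreme + 1) / (permutations + 1)

# Toy example: four samples on a line, so both matrices are identical.
positions = [0.0, 1.0, 3.0, 7.0]
toy = [[abs(p - q) for q in positions] for p in positions]
r_value, p_value = mantel(toy, toy)
```

With identical matrices the correlation is 1 by construction. On real data the two matrices would come from marker-based and geographic distances respectively.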
Genome wide association study (GWAS) for grain yield in rice cultivated under water deficit.
Pantalião, Gabriel Feresin; Narciso, Marcelo; Guimarães, Cléber; Castro, Adriano; Colombari, José Manoel; Breseghello, Flavio; Rodrigues, Luana; Vianello, Rosana Pereira; Borba, Tereza Oliveira; Brondani, Claudio
2016-12-01
The identification of rice drought tolerant materials is crucial for the development of best performing cultivars for the upland cultivation system. This study aimed to identify markers and candidate genes associated with drought tolerance by Genome Wide Association Study analysis, in order to develop tools for use in rice breeding programs. This analysis was made with 175 upland rice accessions (Oryza sativa), evaluated in experiments with and without water restriction, and 150,325 SNPs. Thirteen SNP markers associated with yield under drought conditions were identified. Through stepwise regression analysis, eight SNP markers were selected and validated in silico, and when tested by PCR, two out of the eight SNP markers were able to identify a group of rice genotypes with higher productivity under drought. These results are encouraging for deriving markers for the routine analysis of marker assisted selection. From the drought experiment, including the genes inherited in linkage blocks, 50 genes were identified, from which 30 were annotated, and 10 were previously related to drought and/or abiotic stress tolerance, such as the transcription factors WRKY and Apetala2, and protein kinases.
Bikel, Shirley; Jacobo-Albavera, Leonor; Sánchez-Muñoz, Fausto; Cornejo-Granados, Fernanda; Canizales-Quinteros, Samuel; Soberón, Xavier; Sotelo-Mundo, Rogerio R; Del Río-Navarro, Blanca E; Mendoza-Vargas, Alfredo; Sánchez, Filiberto; Ochoa-Leyva, Adrian
2017-01-01
In spite of the emergence of RNA sequencing (RNA-seq), microarrays remain in widespread use for gene expression analysis in the clinic. There are over 767,000 RNA microarrays from human samples in public repositories, which are an invaluable resource for biomedical research and personalized medicine. Absolute gene expression analysis allows the transcriptome profiling of all expressed genes under a specific biological condition without the need for a reference sample. However, background fluorescence makes it challenging to determine absolute gene expression in microarrays. Given that the Y chromosome is absent in female subjects, we used it as a new approach for absolute gene expression analysis in which the fluorescence of the Y chromosome genes of female subjects was used as the background fluorescence for all the probes in the microarray. This fluorescence was used to establish an absolute gene expression threshold, allowing the differentiation between expressed and non-expressed genes in microarrays. We extracted RNA from leukocyte samples of 16 children (nine males and seven females, ages 6-10 years). Each sample was run on an Affymetrix GeneChip Human Gene 1.0 ST Array, and the fluorescence of 124 genes of the Y chromosome was used to calculate the absolute gene expression threshold. After that, several expressed and non-expressed genes according to our absolute gene expression threshold were compared against the expression obtained using real-time quantitative polymerase chain reaction (RT-qPCR). From the 124 genes of the Y chromosome, three genes (DDX3Y, TXLNG2P and EIF1AY) that displayed significant differences between sexes were used to calculate the absolute gene expression threshold. Using this threshold, we selected 13 expressed and non-expressed genes and confirmed their expression level by RT-qPCR. Then, we selected the top 5% most expressed genes and found that several KEGG pathways were significantly enriched.
Interestingly, these pathways were related to the typical functions of leukocytes, such as antigen processing and presentation and natural killer cell mediated cytotoxicity. We also applied this method to obtain the absolute gene expression threshold in already published microarray data of liver cells, where the top 5% expressed genes showed an enrichment of typical KEGG pathways for liver cells. Our results suggest that the three selected genes of the Y chromosome can be used to calculate an absolute gene expression threshold, allowing a transcriptome profiling of microarray data without the need for an additional reference experiment. Our approach, based on the establishment of a threshold for absolute gene expression analysis, offers a new way to analyze thousands of microarrays from public databases. This allows the study of different human diseases without the need for additional samples for relative expression experiments.
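The thresholding idea can be sketched as follows. The abstract does not state how the threshold is derived from the Y-chromosome signal, so the "mean plus k standard deviations" rule below is an assumption made purely for illustration. The function names, the value of k, the signal values and the gene labels are all hypothetical.

```python
from statistics import mean, stdev

def background_threshold(female_y_signals, k=2.0):
    """Assumed rule: mean + k * SD of Y-linked probe signal in female samples.

    Female samples carry no Y chromosome, so this signal is treated as
    background. The exact formula used in the study is not given in the
    abstract; this one is a placeholder.
    """
    return mean(female_y_signals) + k * stdev(female_y_signals)

def call_expressed(gene_signals, threshold):
    """Label each gene as expressed (True) if its signal exceeds the threshold."""
    return {gene: signal > threshold for gene, signal in gene_signals.items()}

# Hypothetical log2 intensities for Y-linked probes in female samples.
female_y = [3.0, 3.2, 2.8, 3.1, 2.9]
threshold = background_threshold(female_y)

# Hypothetical genes measured on the same array.
calls = call_expressed({"GENE_A": 8.5, "GENE_B": 3.1}, threshold)
```

Any gene whose signal falls at or below the background-derived threshold is treated as not expressed, which is what removes the need for a separate reference sample.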
Henshall, John M; Dierens, Leanne; Sellars, Melony J
2014-09-02
While much attention has focused on the development of high-density single nucleotide polymorphism (SNP) assays, the costs of developing and running low-density assays have fallen dramatically. This makes it feasible to develop and apply SNP assays for agricultural species beyond the major livestock species. Although low-cost low-density assays may not have the accuracy of the high-density assays widely used in human and livestock species, we show that when combined with statistical analysis approaches that use quantitative instead of discrete genotypes, their utility may be improved. The data used in this study are from a 63-SNP marker Sequenom® iPLEX Platinum panel for the Black Tiger shrimp, for which high-density SNP assays are not currently available. For quantitative genotypes that could be estimated, in 5% of cases the most likely genotype for an individual at a SNP had a probability of less than 0.99. Matrix formulations of maximum likelihood equations for parentage assignment were developed for the quantitative genotypes and also for discrete genotypes perturbed by an assumed error term. Assignment rates that were based on maximum likelihood with quantitative genotypes were similar to those based on maximum likelihood with perturbed genotypes but, for more than 50% of cases, the two methods resulted in individuals being assigned to different families. Treating genotypes as quantitative values allows the same analysis framework to be used for pooled samples of DNA from multiple individuals. Resulting correlations between allele frequency estimates from pooled DNA and individual samples were consistently greater than 0.90, and as high as 0.97 for some pools. Estimates of family contributions to the pools based on quantitative genotypes in pooled DNA had a correlation of 0.85 with estimates of contributions from DNA-derived pedigree. 
Even with low numbers of SNPs of variable quality, parentage testing and family assignment from pooled samples are sufficiently accurate to provide useful information for a breeding program. Treating genotypes as quantitative values is an alternative to perturbing genotypes using an assumed error distribution, but can produce very different results. An understanding of the distribution of the error is required for SNP genotyping platforms.
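The contrast between discrete and quantitative genotypes can be illustrated with a small sketch. This is not the authors' maximum likelihood formulation. It only shows, under the assumption that each SNP assay yields genotype probabilities, how an expected allele dosage differs from a hard call and how the same quantity extends to a DNA pool. All names and numbers are hypothetical.

```python
def allele_dosage(p_aa, p_ab, p_bb):
    """Expected number of B alleles (0..2) from genotype probabilities."""
    total = p_aa + p_ab + p_bb
    return (p_ab + 2.0 * p_bb) / total

def discrete_call(p_aa, p_ab, p_bb):
    """Hard genotype call: the single most probable genotype."""
    probabilities = {"AA": p_aa, "AB": p_ab, "BB": p_bb}
    return max(probabilities, key=probabilities.get)

def pooled_allele_frequency(dosages, contributions):
    """Expected B-allele frequency in a pool of individuals.

    dosages: per-individual expected B-allele counts (0..2)
    contributions: relative amount of DNA each individual adds to the pool
    """
    total = sum(contributions)
    return sum(c * d / 2.0 for d, c in zip(dosages, contributions)) / total

# An uncertain genotype: the hard call says BB, the dosage keeps the doubt.
dosage = allele_dosage(0.0, 0.3, 0.7)
call = discrete_call(0.0, 0.3, 0.7)

# Three individuals pooled, the third contributing twice as much DNA.
pool_frequency = pooled_allele_frequency([0.0, 1.0, 2.0], [1, 1, 2])
```

A hard call discards the 30% chance that the individual is heterozygous, whereas the dosage of 1.7 retains it. The pooled frequency is just a contribution-weighted average of the same quantity.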
Genetic source tracking of an anthrax outbreak in Shaanxi province, China.
Liu, Dong-Li; Wei, Jian-Chun; Chen, Qiu-Lan; Guo, Xue-Jun; Zhang, En-Min; He, Li; Liang, Xu-Dong; Ma, Guo-Zhu; Zhou, Ti-Cao; Yin, Wen-Wu; Liu, Wei; Liu, Kai; Shi, Yi; Ji, Jian-Jun; Zhang, Hui-Juan; Ma, Lin; Zhang, Fa-Xin; Zhang, Zhi-Kai; Zhou, Hang; Yu, Hong-Jie; Kan, Biao; Xu, Jian-Guo; Liu, Feng; Li, Wei
2017-01-17
Anthrax is an acute zoonotic infectious disease caused by the bacterium Bacillus anthracis. From 26 July to 8 August 2015, an outbreak with 20 suspected cutaneous anthrax cases was reported in Ganquan County, Shaanxi province, China. In this study, genetic source tracking of the outbreak was performed by molecular epidemiological methods. Three molecular typing methods, namely canonical single nucleotide polymorphisms (canSNP), multiple-locus variable-number tandem repeat analysis (MLVA), and single nucleotide repeat (SNR) analysis, were used to investigate the possible source of transmission and identify the genetic relationship among the strains isolated from human cases and diseased animals during the outbreak. Five strains isolated from diseased mules clustered together with patients' isolates using canSNP typing and MLVA. The causative B. anthracis lineages in this outbreak belonged to the A.Br.001/002 canSNP subgroup and the MLVA15-31 genotype (the 31 genotype in the MLVA15 scheme). Because nine isolates from another four provinces in China clustered together with outbreak-related strains by canSNP (A.Br.001/002 subgroup) and the MLVA15 method (MLVA15-31 genotype), an additional SNR analysis (CL10, CL12, CL33, and CL35) was used to source track the outbreak, and the results suggested that the patients in the anthrax outbreak were probably infected by the same pathogen clone. It was deduced that the anthrax outbreak that occurred in Shaanxi province, China in 2015 was a local occurrence.
Tyrer, Jonathan; Fasching, Peter A.; Beckmann, Matthias W.; Ekici, Arif B.; Schulz-Wendtland, Rüdiger; Bojesen, Stig E.; Nordestgaard, Børge G.; Flyger, Henrik; Milne, Roger L.; Arias, José Ignacio; Menéndez, Primitiva; Benítez, Javier; Chang-Claude, Jenny; Hein, Rebecca; Wang-Gohrke, Shan; Nevanlinna, Heli; Heikkinen, Tuomas; Aittomäki, Kristiina; Blomqvist, Carl; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Kataja, Vesa; Beesley, Jonathan; Chen, Xiaoqing; Chenevix-Trench, Georgia; Couch, Fergus J.; Olson, Janet E.; Fredericksen, Zachary S.; Wang, Xianshu; Giles, Graham G.; Severi, Gianluca; Baglietto, Laura; Southey, Melissa C.; Devilee, Peter; Tollenaar, Rob A. E. M.; Seynaeve, Caroline; García-Closas, Montserrat; Lissowska, Jolanta; Sherman, Mark E.; Bolton, Kelly L.; Hall, Per; Czene, Kamila; Cox, Angela; Brock, Ian W.; Elliott, Graeme C.; Reed, Malcolm W. R.; Greenberg, David; Anton-Culver, Hoda; Ziogas, Argyrios; Humphreys, Manjeet; Easton, Douglas F.; Caporaso, Neil E.; Pharoah, Paul D. P.
2010-01-01
Background Traditional prognostic factors for survival and treatment response of patients with breast cancer do not fully account for observed survival variation. We used available genotype data from a previously conducted two-stage, breast cancer susceptibility genome-wide association study (ie, Studies of Epidemiology and Risk factors in Cancer Heredity [SEARCH]) to investigate associations between variation in germline DNA and overall survival. Methods We evaluated possible associations between overall survival after a breast cancer diagnosis and 10 621 germline single-nucleotide polymorphisms (SNPs) from up to 3761 patients with invasive breast cancer (including 647 deaths and 26 978 person-years at risk) that were genotyped previously in the SEARCH study with high-density oligonucleotide microarrays (ie, hypothesis-generating set). Associations with all-cause mortality were assessed for each SNP by use of Cox regression analysis, generating a per rare allele hazard ratio (HR). To validate putative associations, we used patient genotype information that had been obtained with 5′ nuclease assay or mass spectrometry and overall survival information for up to 14 096 patients with invasive breast cancer (including 2303 deaths and 70 019 person-years at risk) from 15 international case–control studies (ie, validation set). Fixed-effects meta-analysis was used to generate an overall effect estimate in the validation dataset and in combined SEARCH and validation datasets. All statistical tests were two-sided. Results In the hypothesis-generating dataset, SNP rs4778137 (C>G) of the OCA2 gene at 15q13.1 was statistically significantly associated with overall survival among patients with estrogen receptor–negative tumors, with the rare G allele being associated with increased overall survival (HR of death per rare allele carried = 0.56, 95% confidence interval [CI] = 0.41 to 0.75, P = 9.2 × 10−5). 
This association was also observed in the validation dataset (HR of death per rare allele carried = 0.88, 95% CI = 0.78 to 0.99, P = .03) and in the combined dataset (HR of death per rare allele carried = 0.82, 95% CI = 0.73 to 0.92, P = 5 × 10−4). Conclusion The rare G allele of the OCA2 polymorphism, rs4778137, may be associated with improved overall survival among patients with estrogen receptor–negative breast cancer. PMID:20308648
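The fixed-effects meta-analysis step can be sketched with inverse-variance weighting on the log hazard ratio scale. The sketch below feeds in only the two summary estimates quoted in the abstract (hypothesis-generating and validation sets). The published combined estimate was presumably computed from per-study data, so only approximate agreement should be expected. Function names are hypothetical.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Recover the log hazard ratio and its standard error from a 95% CI."""
    return math.log(hr), (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)

def fixed_effect_pool(studies, z=1.96):
    """Inverse-variance fixed-effect pooling of (HR, CI low, CI high) tuples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, ci_low, ci_high in studies:
        log_hr, se = log_hr_and_se(hr, ci_low, ci_high, z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_hr
        weight_total += weight
    pooled = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (math.exp(pooled),
            math.exp(pooled - z * pooled_se),
            math.exp(pooled + z * pooled_se))

# Summary estimates for rs4778137 as reported in the abstract.
studies = [(0.56, 0.41, 0.75),   # hypothesis-generating set (SEARCH)
           (0.88, 0.78, 0.99)]   # validation set
pooled_hr, pooled_low, pooled_high = fixed_effect_pool(studies)
```

Because the validation set has the narrower interval, it receives most of the weight, which is why the pooled estimate lies much closer to 0.88 than to 0.56.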
Jiang, Rong; French, John E.; Stober, Vandy P.; Kang-Sickel, Juei-Chuan C.; Zou, Fei
2012-01-01
Background: Individual genetic variation that results in differences in systemic response to xenobiotic exposure is not accounted for as a predictor of outcome in current exposure assessment models. Objective: We developed a strategy to investigate individual differences in single-nucleotide polymorphisms (SNPs) as genetic markers associated with naphthyl–keratin adduct (NKA) levels measured in the skin of workers exposed to naphthalene. Methods: The SNP-association analysis was conducted in PLINK using candidate-gene analysis and genome-wide analysis. We identified significant SNP–NKA associations and investigated the potential impact of these SNPs along with personal and workplace factors on NKA levels using a multiple linear regression model and the Pratt index. Results: In candidate-gene analysis, a SNP (rs4852279) located near the CYP26B1 gene contributed to the 2-naphthyl–keratin adduct (2NKA) level. In the multiple linear regression model, the SNP rs4852279, dermal exposure, exposure time, task replacing foam, age, and ethnicity all were significant predictors of 2NKA level. In genome-wide analysis, no single SNP reached genome-wide significance for NKA levels (all p ≥ 1.05 × 10–5). Pathway and network analyses of SNPs associated with NKA levels were predicted to be involved in the regulation of cellular processes and homeostasis. Conclusions: These results provide evidence that a quantitative biomarker can be used as an intermediate phenotype when investigating the association between genetic markers and exposure–dose relationship in a small, well-characterized exposed worker population. PMID:22391508
DOE Office of Scientific and Technical Information (OSTI.GOV)
With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.
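The abstract says only that the algorithm is based on k-mer analysis. One common way to find SNPs without alignment is to compare odd-length k-mers that share both flanks but differ at the central base. The sketch below illustrates that idea as an assumption, not as the described algorithm. It ignores reverse complements, sequencing errors and repeat filtering. The function names and toy sequences are hypothetical.

```python
from collections import defaultdict

def kmer_contexts(sequence, k):
    """Map (left flank, right flank) -> set of central bases seen in a sequence."""
    half = k // 2
    contexts = defaultdict(set)
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        contexts[(kmer[:half], kmer[half + 1:])].add(kmer[half])
    return contexts

def find_snps(genomes, k=7):
    """Return {context: {genome name: central base}} for variable contexts.

    A context counts as a SNP locus when it occurs in every genome with a
    single, unambiguous central base and the bases differ between genomes.
    """
    per_genome = {name: kmer_contexts(seq, k) for name, seq in genomes.items()}
    shared = set.intersection(*(set(ctx) for ctx in per_genome.values()))
    snps = {}
    for context in shared:
        alleles = {}
        for name, contexts in per_genome.items():
            if len(contexts[context]) != 1:
                break  # ambiguous in this genome (for example a repeat)
            alleles[name] = next(iter(contexts[context]))
        else:
            if len(set(alleles.values())) > 1:
                snps[context] = alleles
    return snps

# Toy genomes differing at a single position (G versus A).
toy_genomes = {"g1": "ACGTACGGTCA", "g2": "ACGTACAGTCA"}
toy_snps = find_snps(toy_genomes, k=7)
```

Because each genome is indexed independently and loci are matched by flanking sequence alone, the same routine applies to contigs or reads, which fits the scalability claim in the abstract.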
Nakajima, Ayaka; Kawaguchi, Fuki; Uemoto, Yoshinobu; Fukushima, Moriyuki; Yoshida, Emi; Iwamoto, Eiji; Akiyama, Takayuki; Kohama, Namiko; Kobayashi, Eiji; Honda, Takeshi; Oyama, Kenji; Mannen, Hideyuki; Sasazaki, Shinji
2018-05-01
The objective of this study was to identify genomic regions associated with fat-related traits using a Japanese Black cattle population in Hyogo. From 1836 animals, those with high or low values were selected on the basis of corrected phenotype and then pooled into high and low groups (n = 100 each), respectively. DNA pool-based genome-wide association study (GWAS) was performed using Illumina BovineSNP50 BeadChip v2 with three replicate assays for each pooled sample. GWAS detected that two single nucleotide polymorphisms (SNPs) on BTA7 (ARS-BFGL-NGS-35463 and Hapmap23838-BTA-163815) and one SNP on BTA12 (ARS-BFGL-NGS-2915) significantly affected fat percentage (FAR). The significance of ARS-BFGL-NGS-35463 on BTA7 was confirmed by individual genotyping in all pooled samples. Moreover, association analysis between SNP and FAR in 803 Japanese Black cattle revealed a significant effect of SNP on FAR. Thus, further investigation of these regions is required to identify FAR-associated genes and mutations, which can lead to the development of DNA markers for marker-assisted selection for the genetic improvement of beef quality. © 2018 Japanese Society of Animal Science.
Chipster: user-friendly analysis software for microarray and other high-throughput data.
Kallio, M Aleksi; Tuimala, Jarno T; Hupponen, Taavi; Klemelä, Petri; Gentile, Massimiliano; Scheinin, Ilari; Koski, Mikko; Käki, Janne; Korpelainen, Eija I
2011-10-14
Background The growth of high-throughput technologies such as microarrays and next generation sequencing has been accompanied by active research in data analysis methodology, producing new analysis methods at a rapid pace. While most of the newly developed methods are freely available, their use requires substantial computational skills. In order to enable non-programming biologists to benefit from the method development in a timely manner, we have created the Chipster software. Results Chipster (http://chipster.csc.fi/) brings a powerful collection of data analysis methods within the reach of bioscientists via its intuitive graphical user interface. Users can analyze and integrate different data types such as gene expression, miRNA and aCGH. The analysis functionality is complemented with rich interactive visualizations, allowing users to select datapoints and create new gene lists based on these selections. Importantly, users can save the performed analysis steps as reusable, automatic workflows, which can also be shared with other users. Being a versatile and easily extendable platform, Chipster can be used for microarray, proteomics and sequencing data. In this article we describe its comprehensive collection of analysis and visualization tools for microarray data using three case studies. Conclusions Chipster is a user-friendly analysis software for high-throughput data. Its intuitive graphical user interface enables biologists to access a powerful collection of data analysis and integration tools, and to visualize data interactively. Users can collaborate by sharing analysis sessions and workflows. Chipster is open source, and the server installation package is freely available. PMID:21999641
CD44 Gene Polymorphisms in Breast Cancer Risk and Prognosis: A Study in North Indian Population
Tulsyan, Sonam; Agarwal, Gaurav; Lal, Punita; Agrawal, Sushma; Mittal, Rama Devi; Mittal, Balraj
2013-01-01
Background Cell surface biomarker CD44 plays an important role in breast cancer cell growth, differentiation, invasion, angiogenesis and tumour metastasis. Therefore, we aimed to investigate the role of CD44 gene polymorphisms in breast cancer risk and prognosis in North Indian population. Materials & Methods A total of 258 breast cancer patients and 241 healthy controls were included in the case-control study for risk prediction. According to RECIST, 114 patients who received neo-adjuvant chemotherapy were recruited for the evaluation of breast cancer prognosis. We examined the association of tagging SNP (rs353639) of Hapmap Gujrati Indians in Houston (GIH population) in CD44 gene along with a significant reported SNP (rs13347) in Chinese population by genotyping using Taqman allelic discrimination assays. Statistical analysis was done using SPSS software, version 17. In-silico analysis for prediction of functional effects was done using F-SNP and FAST-SNP. Results No significant association of both the genetic variants of the CD44 gene polymorphisms was found with breast cancer risk. On performing univariate analysis with clinicopathological characteristics and treatment response, we found significant association of genotype (CT+TT) of rs13347 polymorphism with earlier age of onset (P = 0.029, OR = 0.037). However, significance was lost in multivariate analysis. For rs353639 polymorphism, significant association was seen with clinical tumour size, both at the genotypic (AC+CC) (P = 0.039, OR = 3.02) as well as the allelic (C) (P = 0.042, OR = 2.87) levels. On performing multivariate analysis, increased significance of variant genotype (P = 0.017, OR = 4.29) and allele (P = 0.025, OR = 3.34) of rs353639 was found with clinical tumour size. In-silico analysis using F-SNP, showed altered transcriptional regulation for rs353639 polymorphism. Conclusions These findings suggest that CD44 rs353639 genetic variants may have significant effect in breast cancer prognosis. 
However, neither polymorphism, rs13347 or rs353639, had an effect on breast cancer susceptibility. PMID:23940692
A database for the analysis of immunity genes in Drosophila: PADMA database.
Lee, Mark J; Mondal, Ariful; Small, Chiyedza; Paddibhatla, Indira; Kawaguchi, Akira; Govind, Shubha
2011-01-01
While microarray experiments generate voluminous data, discerning trends that support an existing or alternative paradigm is challenging. To synergize hypothesis building and testing, we designed the Pathogen Associated Drosophila MicroArray (PADMA) database for easy retrieval and comparison of microarray results from immunity-related experiments (www.padmadatabase.org). PADMA also allows biologists to upload their microarray results and compare them with datasets housed within PADMA. We tested PADMA using a preliminary dataset from Ganaspis xanthopoda-infected fly larvae, and uncovered unexpected trends in gene expression, reshaping our hypothesis. Thus, the PADMA database will be a useful resource for fly researchers to evaluate, revise, and refine hypotheses.
ERIC Educational Resources Information Center
Tra, Yolande V.; Evans, Irene M.
2010-01-01
"BIO2010" put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on…
ERIC Educational Resources Information Center
Al-Mamari, Watfa; Al-Saegh, Abeer; Al-Kindy, Adila; Bruwer, Zandre; Al-Murshedi, Fathiya; Al-Thihli, Khalid
2015-01-01
Autism Spectrum Disorders are a complicated group of disorders characterized by heterogeneous genetic etiologies. The genetic investigations for this group of disorders have expanded considerably over the past decade. In our study we designed a tiered approach and studied the diagnostic yield of chromosomal microarray analysis on patients…
Immunological Targeting of Tumor Initiating Prostate Cancer Cells
2014-10-01
clinically using well-accepted immuno-competent animal models. 2) Keywords: Prostate Cancer, Lymphocyte, Vaccine, Antibody 3) Overall Project Summary ... castrate animals. Task 1: Identify and verify antigenic targets from CAstrate Resistant Luminal Epithelial Cells (CRLEC) (months 1-16 ... animals per group will be processed to derive sufficient RNA for microarray analysis; the experiment will be repeated x 3. Microarray analysis will
MiMiR – an integrated platform for microarray data sharing, mining and analysis
Tomlinson, Chris; Thimma, Manjula; Alexandrakis, Stelios; Castillo, Tito; Dennis, Jayne L; Brooks, Anthony; Bradley, Thomas; Turnbull, Carly; Blaveri, Ekaterini; Barton, Geraint; Chiba, Norie; Maratou, Klio; Soutter, Pat; Aitman, Tim; Game, Laurence
2008-01-01
Background Despite considerable efforts within the microarray community for standardising data format, content and description, microarray technologies present major challenges in managing, sharing, analysing and re-using the large amount of data generated locally or internationally. Additionally, it is recognised that inconsistent and low quality experimental annotation in public data repositories significantly compromises the re-use of microarray data for meta-analysis. MiMiR, the Microarray data Mining Resource was designed to tackle some of these limitations and challenges. Here we present new software components and enhancements to the original infrastructure that increase accessibility, utility and opportunities for large scale mining of experimental and clinical data. Results A user friendly Online Annotation Tool allows researchers to submit detailed experimental information via the web at the time of data generation rather than at the time of publication. This ensures the easy access and high accuracy of meta-data collected. Experiments are programmatically built in the MiMiR database from the submitted information and details are systematically curated and further annotated by a team of trained annotators using a new Curation and Annotation Tool. Clinical information can be annotated and coded with a clinical Data Mapping Tool within an appropriate ethical framework. Users can visualise experimental annotation, assess data quality, download and share data via a web-based experiment browser called MiMiR Online. All requests to access data in MiMiR are routed through a sophisticated middleware security layer thereby allowing secure data access and sharing amongst MiMiR registered users prior to publication. Data in MiMiR can be mined and analysed using the integrated EMAAS open source analysis web portal or via export of data and meta-data into Rosetta Resolver data analysis package. 
Conclusion The new MiMiR suite of software enables systematic and effective capture of extensive experimental and clinical information with the highest MIAME score, and secure data sharing prior to publication. MiMiR currently contains more than 150 experiments corresponding to over 3000 hybridisations and supports the Microarray Centre's large microarray user community and two international consortia. The MiMiR flexible and scalable hardware and software architecture enables secure warehousing of thousands of datasets, including clinical studies, from microarray and potentially other -omics technologies. PMID:18801157
MiMiR--an integrated platform for microarray data sharing, mining and analysis.
Tomlinson, Chris; Thimma, Manjula; Alexandrakis, Stelios; Castillo, Tito; Dennis, Jayne L; Brooks, Anthony; Bradley, Thomas; Turnbull, Carly; Blaveri, Ekaterini; Barton, Geraint; Chiba, Norie; Maratou, Klio; Soutter, Pat; Aitman, Tim; Game, Laurence
2008-09-18
Despite considerable efforts within the microarray community for standardising data format, content and description, microarray technologies present major challenges in managing, sharing, analysing and re-using the large amount of data generated locally or internationally. Additionally, it is recognised that inconsistent and low quality experimental annotation in public data repositories significantly compromises the re-use of microarray data for meta-analysis. MiMiR, the Microarray data Mining Resource was designed to tackle some of these limitations and challenges. Here we present new software components and enhancements to the original infrastructure that increase accessibility, utility and opportunities for large scale mining of experimental and clinical data. A user friendly Online Annotation Tool allows researchers to submit detailed experimental information via the web at the time of data generation rather than at the time of publication. This ensures the easy access and high accuracy of meta-data collected. Experiments are programmatically built in the MiMiR database from the submitted information and details are systematically curated and further annotated by a team of trained annotators using a new Curation and Annotation Tool. Clinical information can be annotated and coded with a clinical Data Mapping Tool within an appropriate ethical framework. Users can visualise experimental annotation, assess data quality, download and share data via a web-based experiment browser called MiMiR Online. All requests to access data in MiMiR are routed through a sophisticated middleware security layer thereby allowing secure data access and sharing amongst MiMiR registered users prior to publication. Data in MiMiR can be mined and analysed using the integrated EMAAS open source analysis web portal or via export of data and meta-data into Rosetta Resolver data analysis package. 
The new MiMiR suite of software enables systematic and effective capture of extensive experimental and clinical information with the highest MIAME score, and secure data sharing prior to publication. MiMiR currently contains more than 150 experiments corresponding to over 3000 hybridisations and supports the Microarray Centre's large microarray user community and two international consortia. The MiMiR flexible and scalable hardware and software architecture enables secure warehousing of thousands of datasets, including clinical studies, from microarray and potentially other -omics technologies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andersen, G.L.; He, Z.; DeSantis, T.Z.
Microarrays have proven to be a useful and high-throughput method to provide targeted DNA sequence information for up to many thousands of specific genetic regions in a single test. A microarray consists of multiple DNA oligonucleotide probes that, under high stringency conditions, hybridize only to specific complementary nucleic acid sequences (targets). A fluorescent signal indicates the presence and, in many cases, the abundance of genetic regions of interest. In this chapter we will look at how microarrays are used in microbial ecology, especially with the recent increase in microbial community DNA sequence data. Of particular interest to microbial ecologists, phylogenetic microarrays are used for the analysis of phylotypes in a community and functional gene arrays are used for the analysis of functional genes, and, by inference, phylotypes in environmental samples. A phylogenetic microarray that has been developed by the Andersen laboratory, the PhyloChip, will be discussed as an example of a microarray that targets the known diversity within the 16S rRNA gene to determine microbial community composition. Using multiple, confirmatory probes to increase the confidence of detection and a mismatch probe for every perfect match probe to minimize the effect of cross-hybridization by non-target regions, the PhyloChip is able to simultaneously identify any of thousands of taxa present in an environmental sample. The PhyloChip is shown to reveal greater diversity within a community than rRNA gene sequencing due to the placement of the entire gene product on the microarray compared with the analysis of up to thousands of individual molecules by traditional sequencing methods. A functional gene array that has been developed by the Zhou laboratory, the GeoChip, will be discussed as an example of a microarray that dynamically identifies functional activities of multiple members within a community.
The recent version of GeoChip contains more than 24,000 50mer oligonucleotide probes and covers more than 10,000 gene sequences in 150 gene categories involved in carbon, nitrogen, sulfur, and phosphorus cycling, metal resistance and reduction, and organic contaminant degradation. GeoChip can be used as a generic tool for microbial community analysis, and also link microbial community structure to ecosystem functioning. Examples of the application of both arrays in different environmental samples will be described in the two subsequent sections.
Gillet, Jean-Pierre; Molina, Thierry Jo; Jamart, Jacques; Gaulard, Philippe; Leroy, Karen; Briere, Josette; Theate, Ivan; Thieblemont, Catherine; Bosly, Andre; Herin, Michel; Hamels, Jacques; Remacle, Jose
2009-03-01
Lymphomas are classified according to the World Health Organisation (WHO) classification which defines subtypes on the basis of clinical, morphological, immunophenotypic, molecular and cytogenetic criteria. Differential diagnosis of the subtypes is sometimes difficult, especially for small B-cell lymphoma (SBCL). Standardisation of molecular genetic assays using multiple gene expression analysis by microarrays could be a useful complement to the current diagnosis. The aim of the present study was to develop a low density DNA microarray for the analysis of 107 genes associated with B-cell non-Hodgkin lymphoma and to evaluate its performance in the diagnosis of SBCL. A predictive tool based on Fisher discriminant analysis using a training set of 40 patients including four different subtypes (follicular lymphoma n = 15, mantle cell lymphoma n = 7, B-cell chronic lymphocytic leukemia n = 6 and splenic marginal zone lymphoma n = 12) was designed. A short additional preliminary analysis to gauge the accuracy of this signature was then performed on an external set of nine patients. Using this model, eight of nine of those samples were classified successfully. This pilot study demonstrates that such a microarray tool may be a promising diagnostic approach for small B-cell non-Hodgkin lymphoma.
MAGMA: analysis of two-channel microarrays made easy.
Rehrauer, Hubert; Zoller, Stefan; Schlapbach, Ralph
2007-07-01
The web application MAGMA provides a simple and intuitive interface to identify differentially expressed genes from two-channel microarray data. While the underlying algorithms are not superior to those of similar web applications, MAGMA is particularly user friendly and can be used without prior training. The user interface guides the novice user through the most typical microarray analysis workflow consisting of data upload, annotation, normalization and statistical analysis. It automatically generates R-scripts that document MAGMA's entire data processing steps, thereby allowing the user to regenerate all results in his local R installation. The implementation of MAGMA follows the model-view-controller design pattern that strictly separates the R-based statistical data processing, the web-representation and the application logic. This modular design makes the application flexible and easily extendible by experts in one of the fields: statistical microarray analysis, web design or software development. State-of-the-art Java Server Faces technology was used to generate the web interface and to perform user input processing. MAGMA's object-oriented modular framework makes it easily extendible and applicable to other fields and demonstrates that modern Java technology is also suitable for rather small and concise academic projects. MAGMA is freely available at www.magma-fgcz.uzh.ch.
Tomato Expression Database (TED): a suite of data presentation and analysis tools
Fei, Zhangjun; Tang, Xuemei; Alba, Rob; Giovannoni, James
2006-01-01
The Tomato Expression Database (TED) includes three integrated components. The Tomato Microarray Data Warehouse serves as a central repository for raw gene expression data derived from the public tomato cDNA microarray. In addition to expression data, TED stores experimental design and array information in compliance with the MIAME guidelines and provides web interfaces for researchers to retrieve data for their own analysis and use. The Tomato Microarray Expression Database contains normalized and processed microarray data for ten time points with nine pair-wise comparisons during fruit development and ripening in a normal tomato variety and nearly isogenic single gene mutants impacting fruit development and ripening. Finally, the Tomato Digital Expression Database contains raw and normalized digital expression (EST abundance) data derived from analysis of the complete public tomato EST collection containing >150 000 ESTs derived from 27 different non-normalized EST libraries. This last component also includes tools for the comparison of tomato and Arabidopsis digital expression data. A set of query interfaces and analysis, and visualization tools have been developed and incorporated into TED, which aid users in identifying and deciphering biologically important information from our datasets. TED can be accessed at http://ted.bti.cornell.edu. PMID:16381976
Tomato Expression Database (TED): a suite of data presentation and analysis tools.
Fei, Zhangjun; Tang, Xuemei; Alba, Rob; Giovannoni, James
2006-01-01
The Tomato Expression Database (TED) includes three integrated components. The Tomato Microarray Data Warehouse serves as a central repository for raw gene expression data derived from the public tomato cDNA microarray. In addition to expression data, TED stores experimental design and array information in compliance with the MIAME guidelines and provides web interfaces for researchers to retrieve data for their own analysis and use. The Tomato Microarray Expression Database contains normalized and processed microarray data for ten time points with nine pair-wise comparisons during fruit development and ripening in a normal tomato variety and nearly isogenic single gene mutants impacting fruit development and ripening. Finally, the Tomato Digital Expression Database contains raw and normalized digital expression (EST abundance) data derived from analysis of the complete public tomato EST collection containing >150,000 ESTs derived from 27 different non-normalized EST libraries. This last component also includes tools for the comparison of tomato and Arabidopsis digital expression data. A set of query interfaces and analysis, and visualization tools have been developed and incorporated into TED, which aid users in identifying and deciphering biologically important information from our datasets. TED can be accessed at http://ted.bti.cornell.edu.
Tong, Steven Y C; Xie, Shirley; Richardson, Leisha J; Ballard, Susan A; Dakh, Farshid; Grabsch, Elizabeth A; Grayson, M Lindsay; Howden, Benjamin P; Johnson, Paul D R; Giffard, Philip M
2011-01-01
We have developed a single nucleotide polymorphism (SNP) nucleated high-resolution melting (HRM) technique to genotype Enterococcus faecium. Eight SNPs were derived from the E. faecium multilocus sequence typing (MLST) database and amplified fragments containing these SNPs were interrogated by HRM. We tested the HRM genotyping scheme on 85 E. faecium bloodstream isolates and compared the results with MLST, pulsed-field gel electrophoresis (PFGE) and an allele specific real-time PCR (AS kinetic PCR) SNP typing method. In silico analysis based on predicted HRM curves according to the G+C content of each fragment for all 567 sequence types (STs) in the MLST database together with empiric data from the 85 isolates demonstrated that HRM analysis resolves E. faecium into 231 "melting types" (MelTs) and provides a Simpson's Index of Diversity (D) of 0.991 with respect to MLST. This is a significant improvement on the AS kinetic PCR SNP typing scheme that resolves 61 SNP types with D of 0.95. The MelTs were concordant with the known ST of the isolates. For the 85 isolates, there were 13 PFGE patterns, 17 STs, 14 MelTs and eight SNP types. There was excellent concordance between PFGE, MLST and MelTs with Adjusted Rand Indices of PFGE to MelT 0.936 and ST to MelT 0.973. In conclusion, this HRM based method appears rapid and reproducible. The results are concordant with MLST and the MLST based population structure.
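The Simpson's Index of Diversity quoted above can be illustrated with a minimal sketch. It assumes the Hunter-Gaston form commonly used in typing studies, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)); the abstract does not state which form the authors used, and the function name and example labels below are hypothetical.

```python
from collections import Counter


def simpsons_diversity(type_labels):
    """Simpson's Index of Diversity (Hunter-Gaston form, assumed here).

    type_labels: one typing result per isolate, e.g. a melting type.
    Returns the probability that two isolates drawn without replacement
    belong to different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels)
    # Number of ordered pairs of isolates that share the same type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical labels for six isolates (not data from the study).
example = ["MelT1", "MelT1", "MelT2", "MelT3", "MelT3", "MelT3"]
print(round(simpsons_diversity(example), 4))
```

A value near 1 means the scheme separates almost every pair of isolates; a value of 0 means every isolate received the same type.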
2015-01-01
Background Obesity affects quality of life and life expectancy and is associated with cardiovascular disorders, cancer, diabetes, reproductive disorders in women, prostate diseases in men, and congenital anomalies in children. The use of single nucleotide polymorphism (SNP) markers of diseases and drug responses (i.e., significant differences of personal genomes of patients from the reference human genome) can help physicians to improve treatment. Clinical research can validate SNP markers via genotyping of patients and demonstration that SNP alleles are significantly more frequent in patients than in healthy people. The search for biomedical SNP markers of interest can be accelerated by computer-based analysis of hundreds of millions of SNPs in the 1000 Genomes project because of selection of the most meaningful candidate SNP markers and elimination of neutral SNPs. Results We cross-validated the output of two computer-based methods: DNA sequence analysis using Web service SNP_TATA_Comparator and keyword search for articles on comorbidities of obesity. Near the sites binding to TATA-binding protein (TBP) in human gene promoters, we found 22 obesity-related candidate SNP markers, including rs10895068 (male breast cancer in obesity); rs35036378 (reduced risk of obesity after ovariectomy); rs201739205 (reduced risk of obesity-related cancers due to weight loss by diet/exercise in obese postmenopausal women); rs183433761 (obesity resistance during a high-fat diet); rs367732974 and rs549591993 (both: cardiovascular complications in obese patients with type 2 diabetes mellitus); rs200487063 and rs34104384 (both: obesity-caused hypertension); rs35518301, rs72661131, and rs562962093 (all: obesity); and rs397509430, rs33980857, rs34598529, rs33931746, rs33981098, rs34500389, rs63750953, rs281864525, rs35518301, and rs34166473 (all: chronic inflammation in comorbidities of obesity). 
Using an electrophoretic mobility shift assay under nonequilibrium conditions, we empirically validated the statistical significance (α < 0.00025) of the differences in TBP affinity values between the minor and ancestral alleles of 4 out of the 22 SNPs: rs200487063, rs201381696, rs34104384, and rs183433761. We also measured half-life (t1/2), Gibbs free energy change (ΔG), and the association and dissociation rate constants, ka and kd, of the TBP-DNA complex for these SNPs. Conclusions Validation of the 22 candidate SNP markers by proper clinical protocols appears to have a strong rationale and may advance postgenomic predictive preventive personalized medicine. PMID:26694100
The pig genome project has plenty to squeal about.
Fan, B; Gorbach, D M; Rothschild, M F
2011-01-01
Significant progress on pig genetics and genomics research has been witnessed in recent years due to the integration of advanced molecular biology techniques, bioinformatics and computational biology, and the collaborative efforts of researchers in the swine genomics community. Progress on expanding the linkage map has slowed down, but the efforts have created a higher-resolution physical map integrating the clone map and BAC end sequence. The number of QTL mapped is still growing and most of the updated QTL mapping results are available through PigQTLdb. Additionally, expression studies using high-throughput microarrays and other gene expression techniques have made significant advancements. The number of identified non-coding RNAs is rapidly increasing and their exact regulatory functions are being explored. A publishable draft (build 10) of the swine genome sequence was available for the pig genomics community by the end of December 2010. Build 9 of the porcine genome is currently available with Ensembl annotation; manual annotation is ongoing. These drafts provide useful tools for such endeavors as comparative genomics and SNP scans for fine QTL mapping. A recent community-wide effort to create a 60K porcine SNP chip has greatly facilitated whole-genome association analyses, haplotype block construction and linkage disequilibrium mapping, which can contribute to whole-genome selection. The future 'systems biology' that integrates and optimizes the information from all research levels can enhance the pig community's understanding of the full complexity of the porcine genome. These recent technological advances and where they may lead are reviewed. Copyright © 2011 S. Karger AG, Basel.
Development and application of a microarray meter tool to optimize microarray experiments
Rouse, Richard JD; Field, Katrine; Lapira, Jennifer; Lee, Allen; Wick, Ivan; Eckhardt, Colleen; Bhasker, C Ramana; Soverchia, Laura; Hardiman, Gary
2008-01-01
Background Successful microarray experimentation requires a complex interplay between the slide chemistry, the printing pins, the nucleic acid probes and targets, and the hybridization milieu. Optimization of these parameters and a careful evaluation of emerging slide chemistries are a prerequisite to any large scale array fabrication effort. We have developed a 'microarray meter' tool which assesses the inherent variations associated with microarray measurement prior to embarking on large scale projects. Findings The microarray meter consists of nucleic acid targets (reference and dynamic range control) and probe components. Different plate designs containing identical probe material were formulated to accommodate different robotic and pin designs. We examined the variability in probe quality and quantity (as judged by the amount of DNA printed and remaining post-hybridization) using three robots equipped with capillary printing pins. Discussion The generation of microarray data with minimal variation requires consistent quality control of the (DNA microarray) manufacturing and experimental processes. Spot reproducibility is a measure primarily of the variations associated with printing. The microarray meter assesses array quality by measuring the DNA content for every feature. It provides a post-hybridization analysis of array quality by scoring probe performance using three metrics, a) a measure of variability in the signal intensities, b) a measure of the signal dynamic range and c) a measure of variability of the spot morphologies. PMID:18710498
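As a rough illustration of the first two metrics named above (signal variability and signal dynamic range), the following sketch computes a coefficient of variation and a log10 dynamic range from spot intensities. The abstract does not give the authors' formulas, so both definitions, the function names, and the example values are assumptions made for illustration only.

```python
import math
import statistics


def signal_cv(intensities):
    """Coefficient of variation (sample stdev / mean) of replicate spot signals.

    Assumed stand-in for the 'variability in the signal intensities' metric.
    """
    mean = statistics.mean(intensities)
    if mean == 0:
        raise ValueError("mean intensity is zero")
    return statistics.stdev(intensities) / mean


def dynamic_range_log10(intensities):
    """Orders of magnitude between the weakest and strongest signal.

    Assumed stand-in for the 'signal dynamic range' metric.
    """
    lowest, highest = min(intensities), max(intensities)
    if lowest <= 0:
        raise ValueError("intensities must be positive")
    return math.log10(highest / lowest)


# Hypothetical replicate intensities for one probe and a dilution series.
replicates = [980.0, 1010.0, 1005.0, 995.0]
dilution_series = [10.0, 100.0, 1000.0]
print(signal_cv(replicates))
print(dynamic_range_log10(dilution_series))
```

A lower coefficient of variation suggests more reproducible printing and hybridization, while a larger dynamic range indicates that the array resolves a wider span of target abundance.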
2011-01-01
Background High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and interspecific breeding applications in less domesticated and less funded plant genera. Results We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%.
The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus. PMID:21492434
High-throughput SNP-genotyping analysis of the relationships among Ponto-Caspian sturgeon species
Rastorguev, Sergey M; Nedoluzhko, Artem V; Mazur, Alexander M; Gruzdeva, Natalia M; Volkov, Alexander A; Barmintseva, Anna E; Mugue, Nikolai S; Prokhortchouk, Egor B
2013-01-01
Abstract Legally certified sturgeon fisheries require population protection and conservation methods, including DNA tests to identify the source of valuable sturgeon roe. However, the available genetic data are insufficient to distinguish between different sturgeon populations, and are even unable to distinguish between some species. We performed high-throughput single-nucleotide polymorphism (SNP)-genotyping analysis on different populations of Russian (Acipenser gueldenstaedtii), Persian (A. persicus), and Siberian (A. baerii) sturgeon species from the Caspian Sea region (Volga and Ural Rivers), the Azov Sea, and two Siberian rivers. We found that Russian sturgeons from the Volga and Ural Rivers were essentially indistinguishable, but they differed from Russian sturgeons in the Azov Sea, and from Persian and Siberian sturgeons. We identified eight SNPs that were sufficient to distinguish these sturgeon populations with 80% confidence, and allowed the development of markers to distinguish sturgeon species. Finally, on the basis of our SNP data, we propose that the A. baerii-like mitochondrial DNA found in some Russian sturgeons from the Caspian Sea arose via an introgression event during the Pleistocene glaciation. In the present study, the high-throughput genotyping analysis of several sturgeon populations was performed. SNP markers for species identification were defined. The possible explanation of the baerii-like mitotype presence in some Russian sturgeons in the Caspian Sea was suggested. PMID:24567827
Yao, Yao; Wen, Yueqiang; Du, Tingfu; Sun, Ning; Deng, Hong; Ryan, Joanne; Rao, Shuquan
2016-03-15
Major depressive disorder (MDD) is one of the most prevalent psychiatric illnesses with heritability of up to 38%. The fat mass- and obesity-associated (FTO) gene, in particular the single nucleotide polymorphism (SNP) rs9939609, has been identified as a genetic risk locus associated with MDD. However, most prior studies have involved European and American populations. Whether rs9939609 is a true risk SNP for MDD in Asian populations remains inconclusive. In the present study, we conducted a meta-analysis of the association between rs9939609 and MDD in Asian populations by combining 5 available case-control samples totaling 6531 cases and 12,359 controls. Our meta-analysis suggests that rs9939609 is not a risk SNP for MDD in Asian populations by fixed effect model (Z=1.04, P=0.30, OR=0.96, 95% CI=0.90-1.03). The age distribution and gender ratios were not well matched in the combined samples of cases and controls. Publication bias might also be a concern, given the relatively small number of association studies of FTO rs9939609 with MDD in Asian populations. The absence of association of rs9939609 with MDD in our Asian populations suggests a potential genetic heterogeneity in the susceptibility of MDD on this locus. Copyright © 2015 Elsevier B.V. All rights reserved.
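The fixed-effect summary reported above (Z, P, OR, 95% CI) can be sketched with a standard inverse-variance pooling of log odds ratios. This is a generic illustration, not the authors' software or data: the 2x2 counts, function names, and the use of the Woolf variance for each log odds ratio are assumptions.

```python
import math


def two_sided_p(z):
    """Two-sided normal P value for a Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def fixed_effect_or(tables):
    """Inverse-variance fixed-effect pooling of odds ratios.

    tables: list of (a, b, c, d) counts per study, where
      a = cases with the risk allele,    b = cases without it,
      c = controls with the risk allele, d = controls without it.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in tables:
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf variance of log OR
        weight = 1 / variance
        weighted_sum += weight * log_or
        weight_total += weight
    pooled = weighted_sum / weight_total
    se = math.sqrt(1 / weight_total)
    z = pooled / se
    return {
        "or": math.exp(pooled),
        "ci95": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "z": z,
        "p": two_sided_p(z),
    }


# Hypothetical counts for three studies (not taken from the meta-analysis).
studies = [(120, 380, 250, 750), (90, 310, 200, 640), (60, 240, 130, 500)]
print(fixed_effect_or(studies))
```

As a consistency check on the reported figures, `two_sided_p(1.04)` evaluates to about 0.30, in line with the P value quoted for Z=1.04.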
Selection Signature Analysis Implicates the PC1/PCSK1 Region for Chicken Abdominal Fat Content
Wang, Zhipeng; Zhang, Yuandan; Wang, Shouzhi; Wang, Ning; Ma, Li; Leng, Li; Wang, Shengwen; Wang, Qigui; Wang, Yuxiang; Tang, Zhiquan; Li, Ning; Da, Yang; Li, Hui
2012-01-01
We conducted a selection signature analysis using the chicken 60k SNP chip in two chicken lines that had been divergently selected for abdominal fat content (AFC) for 11 generations. The selection signature analysis used multiple signals of selection, including long-range allele frequency differences between the lean and fat lines, long-range heterozygosity changes, linkage disequilibrium, haplotype frequencies, and extended haplotype homozygosity. Multiple signals of selection identified ten signatures on chromosomes 1, 2, 4, 5, 11, 15, 20, 26 and Z. The 0.73 Mb PC1/PCSK1 region of the Z chromosome at 55.43-56.16 Mb was the most heavily selected region. This region had 26 SNP markers and seven genes, MARCH3, SLC12A2, FBN2, ERAP1, CAST, PC1/PCSK1 and ELL2, where PC1/PCSK1 are the chicken/human names for the same gene. The lean and fat lines had two main haplotypes with completely opposite SNP alleles for the 26 SNP markers and were virtually line-specific, and had a recombinant haplotype with nearly equal frequency (0.193 and 0.196) in both lines. Other haplotypes in this region had negligible frequencies. Nine other regions with selection signatures were PAH-IGF1, TRPC4, GJD4-CCNY, NDST4, NOVA1, GALNT9, the ESRP2-GALR1 region with five genes, the SYCP2-CADH4 with six genes, and the TULP1-KIF21B with 14 genes. Genome-wide association analysis showed that nearly all regions with evidence of selection signature had SNP effects with genome-wide significance (P<10^-6) on abdominal fat weight and percentage. The results of this study provide specific gene targets for the control of chicken AFC and a potential model of AFC in human obesity. PMID:22792402
Selection signature analysis implicates the PC1/PCSK1 region for chicken abdominal fat content.
Zhang, Hui; Hu, Xiaoxiang; Wang, Zhipeng; Zhang, Yuandan; Wang, Shouzhi; Wang, Ning; Ma, Li; Leng, Li; Wang, Shengwen; Wang, Qigui; Wang, Yuxiang; Tang, Zhiquan; Li, Ning; Da, Yang; Li, Hui
2012-01-01
We conducted a selection signature analysis using the chicken 60k SNP chip in two chicken lines that had been divergently selected for abdominal fat content (AFC) for 11 generations. The selection signature analysis used multiple signals of selection, including long-range allele frequency differences between the lean and fat lines, long-range heterozygosity changes, linkage disequilibrium, haplotype frequencies, and extended haplotype homozygosity. Multiple signals of selection identified ten signatures on chromosomes 1, 2, 4, 5, 11, 15, 20, 26 and Z. The 0.73 Mb PC1/PCSK1 region of the Z chromosome at 55.43-56.16 Mb was the most heavily selected region. This region had 26 SNP markers and seven genes, MARCH3, SLC12A2, FBN2, ERAP1, CAST, PC1/PCSK1 and ELL2, where PC1/PCSK1 are the chicken/human names for the same gene. The lean and fat lines had two main haplotypes with completely opposite SNP alleles for the 26 SNP markers and were virtually line-specific, and had a recombinant haplotype with nearly equal frequency (0.193 and 0.196) in both lines. Other haplotypes in this region had negligible frequencies. Nine other regions with selection signatures were PAH-IGF1, TRPC4, GJD4-CCNY, NDST4, NOVA1, GALNT9, the ESRP2-GALR1 region with five genes, the SYCP2-CADH4 with six genes, and the TULP1-KIF21B with 14 genes. Genome-wide association analysis showed that nearly all regions with evidence of selection signature had SNP effects with genome-wide significance (P<10^-6) on abdominal fat weight and percentage. The results of this study provide specific gene targets for the control of chicken AFC and a potential model of AFC in human obesity.
Scherrer, Daniel Zanetti; Zago, Vanessa Helena de Souza; Vieira, Isabela Calanca; Parra, Eliane Soler; Panzoldo, Natália Baratella; Alexandre, Fernanda; Secolin, Rodrigo; Baracat, Jamal; Quintão, Eder Carlos Rocha; de Faria, Eliana Cotta
2015-01-01
Background Evidence suggests that paraoxonase 1 (PON1) confers important antioxidant and anti-inflammatory properties when associated with high-density lipoprotein (HDL). Objective To investigate the relationships between the p.Q192R SNP of PON1, biochemical parameters and carotid atherosclerosis in an asymptomatic, normolipidemic Brazilian population sample. Methods We studied 584 volunteers (females n = 326, males n = 258; 19-75 years of age). Total genomic DNA was extracted and the SNP was detected on the TaqMan® SNP OpenArray® genotyping platform (Applied Biosystems, Foster City, CA). Plasma lipoproteins and apolipoproteins were determined and PON1 activity was measured using paraoxon as a substrate. High-resolution B-mode ultrasonography was used to measure cIMT and the presence of carotid atherosclerotic plaques in a subgroup of individuals (n = 317). Results The presence of p.192Q was associated with a significant increase in PON1 activity (RR = 12.30 (11.38); RQ = 46.96 (22.35); QQ = 85.35 (24.83) μmol/min; p < 0.0001), HDL-C (RR = 45 (37); RQ = 62 (39); QQ = 69 (29) mg/dL; p < 0.001) and apo A-I (RR = 140.76 ± 36.39; RQ = 147.62 ± 36.92; QQ = 147.49 ± 36.65 mg/dL; p = 0.019). Stepwise regression analysis revealed that heterozygosity and p.192Q carrier status influenced PON1 activity towards paraoxon by 58%. The univariate linear regression analysis demonstrated that the p.Q192R SNP was not associated with mean cIMT; as a result, in the multiple regression analysis, no variables were selected with 5% significance. In logistic regression analysis, the studied parameters were not associated with the presence of carotid plaques. Conclusion In low-risk individuals, the presence of the p.192Q variant of PON1 is associated with a beneficial plasma lipid profile but not with carotid atherosclerosis. PMID:26039660
Murabito, Joanne M.; White, Charles C.; Kavousi, Maryam; Sun, Yan V.; Feitosa, Mary F.; Nambi, Vijay; Lamina, Claudia; Schillert, Arne; Coassin, Stefan; Bis, Joshua C.; Broer, Linda; Crawford, Dana C.; Franceschini, Nora; Frikke-Schmidt, Ruth; Haun, Margot; Holewijn, Suzanne; Huffman, Jennifer E.; Hwang, Shih-Jen; Kiechl, Stefan; Kollerits, Barbara; Montasser, May E.; Nolte, Ilja M.; Rudock, Megan E.; Senft, Andrea; Teumer, Alexander; van der Harst, Pim; Vitart, Veronique; Waite, Lindsay L.; Wood, Andrew R.; Wassel, Christina L.; Absher, Devin M.; Allison, Matthew A.; Amin, Najaf; Arnold, Alice; Asselbergs, Folkert W.; Aulchenko, Yurii; Bandinelli, Stefania; Barbalic, Maja; Boban, Mladen; Brown-Gentry, Kristin; Couper, David J.; Criqui, Michael H.; Dehghan, Abbas; Heijer, Martin den; Dieplinger, Benjamin; Ding, Jingzhong; Dörr, Marcus; Espinola-Klein, Christine; Felix, Stephan B.; Ferrucci, Luigi; Folsom, Aaron R.; Fraedrich, Gustav; Gibson, Quince; Goodloe, Robert; Gunjaca, Grgo; Haltmayer, Meinhard; Heiss, Gerardo; Hofman, Albert; Kieback, Arne; Kiemeney, Lambertus A.; Kolcic, Ivana; Kullo, Iftikhar J.; Kritchevsky, Stephen B.; Lackner, Karl J.; Li, Xiaohui; Lieb, Wolfgang; Lohman, Kurt; Meisinger, Christa; Melzer, David; Mohler, Emile R; Mudnic, Ivana; Mueller, Thomas; Navis, Gerjan; Oberhollenzer, Friedrich; Olin, Jeffrey W.; O’Connell, Jeff; O’Donnell, Christopher J.; Palmas, Walter; Penninx, Brenda W.; Petersmann, Astrid; Polasek, Ozren; Psaty, Bruce M.; Rantner, Barbara; Rice, Ken; Rivadeneira, Fernando; Rotter, Jerome I.; Seldenrijk, Adrie; Stadler, Marietta; Summerer, Monika; Tanaka, Toshiko; Tybjaerg-Hansen, Anne; Uitterlinden, Andre G.; van Gilst, Wiek H.; Vermeulen, Sita H.; Wild, Sarah H.; Wild, Philipp S.; Willeit, Johann; Zeller, Tanja; Zemunik, Tatijana; Zgaga, Lina; Assimes, Themistocles L.; Blankenberg, Stefan; Boerwinkle, Eric; Campbell, Harry; Cooke, John P.; de Graaf, Jacqueline; Herrington, David; Kardia, Sharon L. R.; Mitchell, Braxton D.; Murray, Anna; Münzel, Thomas; Newman, Anne; Oostra, Ben A.; Rudan, Igor; Shuldiner, Alan R.; Snieder, Harold; van Duijn, Cornelia M.; Völker, Uwe; Wright, Alan F.; Wichmann, H.-Erich; Wilson, James F.; Witteman, Jacqueline C.M.; Liu, Yongmei; Hayward, Caroline; Borecki, Ingrid B.; Ziegler, Andreas; North, Kari E.; Cupples, L. Adrienne; Kronenberg, Florian
2012-01-01
Background Genetic determinants of peripheral arterial disease (PAD) remain largely unknown. To identify genetic variants associated with the ankle-brachial index (ABI), a noninvasive measure of PAD, we conducted a meta-analysis of genome-wide association study data from 21 population-based cohorts. Methods and Results Continuous ABI and PAD (ABI ≤ 0.9) phenotypes adjusted for age and sex were examined. Each study conducted genotyping and imputed data to the ~2.5 million SNPs in HapMap. Linear and logistic regression models were used to test each SNP for association with ABI and PAD using additive genetic models. Study-specific data were combined using fixed-effects inverse variance weighted meta-analyses. There were a total of 41,692 participants of European ancestry (~60% women, mean ABI 1.02 to 1.19), including 3,409 participants with PAD and with GWAS data available. In the discovery meta-analysis, rs10757269 on chromosome 9 near CDKN2B had the strongest association with ABI (β = −0.006, p = 2.46 × 10⁻⁸). We sought replication of the 6 strongest SNP associations in 5 population-based studies and 3 clinical samples (n = 16,717). The association for rs10757269 strengthened in the combined discovery and replication analysis (p = 2.65 × 10⁻⁹). No other SNP associations for ABI or PAD achieved genome-wide significance. However, two previously reported candidate genes for PAD and one SNP associated with coronary artery disease (CAD) were associated with ABI: DAB2IP (rs13290547, p = 3.6 × 10⁻⁵); CYBA (rs3794624, p = 6.3 × 10⁻⁵); and rs1122608 (LDLR, p = 0.0026). Conclusions GWAS in more than 40,000 individuals identified one genome-wide significant association on chromosome 9p21 with ABI. Two candidate genes for PAD and 1 SNP for CAD are associated with ABI. PMID:22199011
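The fixed-effects inverse variance weighted pooling named in this abstract can be sketched as follows. This is a minimal illustration only, not the consortium's pipeline: the function name and the per-study numbers are hypothetical, and the p-value assumes a two-sided normal approximation.

```python
import math


def fixed_effects_meta(betas, ses):
    """Pool per-study effect estimates using inverse-variance weights.

    betas: per-study regression coefficients for one SNP
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, z statistic, two-sided p-value).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and the same length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal approximation (an assumption).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical per-study estimates, NOT taken from the paper.
example_betas = [-0.008, -0.005, -0.006]
example_ses = [0.002, 0.003, 0.0025]
pooled = fixed_effects_meta(example_betas, example_ses)
```

Studies with smaller standard errors dominate the pooled estimate, which is why large cohorts drive the combined signal in analyses of this kind.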
Inoue, Daisuke; Hinoura, Takuji; Suzuki, Noriko; Pang, Junqin; Malla, Rabin; Shrestha, Sadhana; Chapagain, Saroj Kumar; Matsuzawa, Hiroaki; Nakamura, Takashi; Tanaka, Yasuhiro; Ike, Michihiko; Nishida, Kei; Sei, Kazunari
2015-01-01
Because of heavy dependence on groundwater for drinking water and other domestic use, microbial contamination of groundwater is a serious problem in the Kathmandu Valley, Nepal. This study investigated comprehensively the occurrence of pathogenic bacteria in shallow well groundwater in the Kathmandu Valley by applying DNA microarray analysis targeting 941 pathogenic bacterial species/groups. Water quality measurements found significant coliform (fecal) contamination in 10 of the 11 investigated groundwater samples and significant nitrogen contamination in some samples. The results of DNA microarray analysis revealed the presence of 1-37 pathogen species/groups, including 1-27 biosafety level 2 ones, in 9 of the 11 groundwater samples. While the detected pathogens included several feces- and animal-related ones, those belonging to Legionella and Arthrobacter, which were considered not to be directly associated with feces, were detected prevalently. This study could provide a rough picture of overall pathogenic bacterial contamination in the Kathmandu Valley, and demonstrated the usefulness of DNA microarray analysis as a comprehensive screening tool of a wide variety of pathogenic bacteria.
Microarray Analysis of Long Noncoding RNAs in Female Diabetic Peripheral Neuropathy Patients.
Luo, Lin; Ji, Lin-Dan; Cai, Jiang-Jia; Feng, Mei; Zhou, Mi; Hu, Su-Pei; Xu, Jin; Zhou, Wen-Hua
2018-01-01
Diabetic peripheral neuropathy (DPN) is the most common complication of diabetes mellitus (DM). Because of its controversial pathogenesis, DPN is still not diagnosed or managed properly in most patients. In this study, human lncRNA microarrays were used to identify the differentially expressed lncRNAs in DM and DPN patients, and some of the discovered lncRNAs were further validated in an additional 78 samples by quantitative real-time PCR (qRT-PCR). The microarray analysis identified 446 and 1327 differentially expressed lncRNAs in DM and DPN, respectively. The KEGG pathway analysis further revealed that the differentially expressed lncRNA-coexpressed mRNAs between DPN and DM groups were significantly enriched in the MAPK signaling pathway. The lncRNA/mRNA coexpression network indicated that BDNF and TRAF2 correlated with 6 lncRNAs. The qRT-PCR confirmed the initial microarray results. These findings demonstrated that the interplay between lncRNAs and mRNAs may be involved in the pathogenesis of DPN, especially through the neurotrophin-MAPK signaling pathway, thus providing relevant information for future studies. © 2018 The Author(s). Published by S. Karger AG, Basel.
MASQOT: a method for cDNA microarray spot quality control
Bylesjö, Max; Eriksson, Daniel; Sjödin, Andreas; Sjöström, Michael; Jansson, Stefan; Antti, Henrik; Trygg, Johan
2005-01-01
Background cDNA microarray technology has emerged as a major player in the parallel detection of biomolecules, but still suffers from fundamental technical problems. Identifying and removing unreliable data is crucial to reduce the risk of obtaining misleading analysis results. Visual assessment of spot quality is still a common procedure, despite the time-consuming work of manually inspecting spots in the range of hundreds of thousands or more. Results A novel methodology for cDNA microarray spot quality control is outlined. Multivariate discriminant analysis was used to assess spot quality based on existing and novel descriptors. The presented methodology displays high reproducibility and was found superior in identifying unreliable data compared to other evaluated methodologies. Conclusion The proposed methodology for cDNA microarray spot quality control generates non-discrete values of spot quality which can be utilized as weights in subsequent analysis procedures as well as to discard spots of undesired quality using the suggested threshold values. The MASQOT approach provides a consistent assessment of spot quality and can be considered an alternative to the labor-intensive manual quality assessment process. PMID:16223442
Xia, Yu; Yang, Yongchao; Huang, Shufang; Wu, Yueheng; Li, Ping; Zhuang, Jian
2018-03-24
This study aimed to determine chromosomal abnormalities and copy number variations (CNVs) in fetuses with congenital heart disease (CHD) by chromosomal microarray analysis (CMA). One hundred and ten cases with CHD detected by prenatal echocardiography were enrolled in the study; 27 cases were simple CHDs, and 83 were complex CHDs. Chromosomal microarray analysis was performed on the Affymetrix CytoScan HD platform. All annotated CNVs were validated by quantitative PCR. Chromosomal microarray analysis identified 6 cases with chromosomal abnormalities, including 2 cases with trisomy 21, 2 cases with trisomy 18, 1 case with trisomy 13, and 1 unusual case of mosaic trisomy 21. Pathogenic CNVs were detected in 15.5% (17/110) of the fetuses with CHDs, including 13 cases with CHD-associated CNVs. We further identified 10 genes as likely novel CHD candidate genes through gene functional enrichment analysis. We also found that pathogenic CMA results impacted the rate of pregnancy termination. This study shows that CMA is particularly effective for identifying chromosomal abnormalities and CNVs in fetuses with CHDs as well as having an effect on obstetrical outcomes. The elucidation of the genetic basis of CHDs will continue to expand our understanding of the etiology of CHDs. © 2018 John Wiley & Sons, Ltd.
Karsten, Stanislav L.; Van Deerlin, Vivianna M. D.; Sabatti, Chiara; Gill, Lisa H.; Geschwind, Daniel H.
2002-01-01
Archival formalin-fixed, paraffin-embedded and ethanol-fixed tissues represent a potentially invaluable resource for gene expression analysis, as they are the most widely available material for studies of human disease. Little data are available evaluating whether RNA obtained from fixed (archival) tissues could produce reliable and reproducible microarray expression data. Here we compare the use of RNA isolated from human archival tissues fixed in ethanol and formalin to frozen tissue in cDNA microarray experiments. Since an additional factor that can limit the utility of archival tissue is the often small quantities available, we also evaluate the use of the tyramide signal amplification method (TSA), which allows the use of small amounts of RNA. Detailed analysis indicates that TSA provides a consistent and reproducible signal amplification method for cDNA microarray analysis, across both arrays and the genes tested. Analysis of this method also highlights the importance of performing non-linear channel normalization and dye switching. Furthermore, archived, fixed specimens can perform well, but not surprisingly, produce more variable results than frozen tissues. Consistent results are more easily obtainable using ethanol-fixed tissues, whereas formalin-fixed tissue does not typically provide a useful substrate for cDNA synthesis and labeling. PMID:11788730
MIGS-GPU: Microarray Image Gridding and Segmentation on the GPU.
Katsigiannis, Stamos; Zacharia, Eleni; Maroulis, Dimitris
2017-05-01
Complementary DNA (cDNA) microarray is a powerful tool for simultaneously studying the expression level of thousands of genes. Nevertheless, the analysis of microarray images remains an arduous and challenging task due to the poor quality of the images that often suffer from noise, artifacts, and uneven background. In this study, the MIGS-GPU [Microarray Image Gridding and Segmentation on Graphics Processing Unit (GPU)] software for gridding and segmenting microarray images is presented. MIGS-GPU's computations are performed on the GPU by means of the compute unified device architecture (CUDA) in order to achieve fast performance and increase the utilization of available system resources. Evaluation on both real and synthetic cDNA microarray images showed that MIGS-GPU provides better performance than state-of-the-art alternatives, while the proposed GPU implementation achieves significantly lower computational times compared to the respective CPU approaches. Consequently, MIGS-GPU can be an advantageous and useful tool for biomedical laboratories, offering a user-friendly interface that requires minimum input in order to run.
2012-01-01
Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings. PMID:16964229
Women's experiences receiving abnormal prenatal chromosomal microarray testing results.
Bernhardt, Barbara A; Soucier, Danielle; Hanson, Karen; Savage, Melissa S; Jackson, Laird; Wapner, Ronald J
2013-02-01
Genomic microarrays can detect copy-number variants not detectable by conventional cytogenetics. This technology is diffusing rapidly into prenatal settings even though the clinical implications of many copy-number variants are currently unknown. We conducted a qualitative pilot study to explore the experiences of women receiving abnormal results from prenatal microarray testing performed in a research setting. Participants were a subset of women participating in a multicenter prospective study "Prenatal Cytogenetic Diagnosis by Array-based Copy Number Analysis." Telephone interviews were conducted with 23 women receiving abnormal prenatal microarray results. We found that five key elements dominated the experiences of women who had received abnormal prenatal microarray results: an offer too good to pass up, blindsided by the results, uncertainty and unquantifiable risks, need for support, and toxic knowledge. As prenatal microarray testing is increasingly used, uncertain findings will be common, resulting in greater need for careful pre- and posttest counseling, and more education of and resources for providers so they can adequately support the women who are undergoing testing.
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.
Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi
2013-01-01
The abundance of gene expression microarray data has led to the development of machine learning algorithms for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces the fuzzy support vector machine, a learning algorithm based on a combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that the fuzzy support vector machine, applied in combination with filter or wrapper feature selection methods, develops a robust model with higher accuracy than conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule base inferred from the fuzzy support vector machine helps extract biological knowledge from microarray data. The fuzzy support vector machine, as a new classification model with high generalization power, robustness, and good interpretability, seems to be a promising tool for gene expression microarray classification.
Haitsma, Jack J.; Furmli, Suleiman; Masoom, Hussain; Liu, Mingyao; Imai, Yumiko; Slutsky, Arthur S.; Beyene, Joseph; Greenwood, Celia M. T.; dos Santos, Claudia
2012-01-01
Objectives To perform a meta-analysis of gene expression microarray data from animal studies of lung injury, and to identify an injury-specific gene expression signature capable of predicting the development of lung injury in humans. Methods We performed a microarray meta-analysis using 77 microarray chips across six platforms, two species and different animal lung injury models exposed to lung injury with and/or without mechanical ventilation. Individual gene chips were classified and grouped based on the strategy used to induce lung injury. Effect size (change in gene expression) was calculated between non-injurious and injurious conditions comparing two main strategies to pool chips: (1) one-hit and (2) two-hit lung injury models. A random effects model was used to integrate individual effect sizes calculated from each experiment. Classification models were built using the gene expression signatures generated by the meta-analysis to predict the development of lung injury in human lung transplant recipients. Results Two injury-specific lists of differentially expressed genes generated from our meta-analysis of lung injury models were validated using external data sets and prospective data from animal models of ventilator-induced lung injury (VILI). Pathway analysis of gene sets revealed that both new and previously implicated VILI-related pathways are enriched with differentially regulated genes. A classification model based on gene expression signatures identified in animal models of lung injury predicted development of primary graft failure (PGF) in lung transplant recipients with greater than 80% accuracy based upon injury profiles from transplant donors. We also found that better classifier performance can be achieved by using meta-analysis to identify differentially expressed genes than by using single study-based differential analysis. Conclusion Taken together, our data suggest that microarray analysis of gene expression data allows for the detection of “injury” gene predictors that can classify lung injury samples and identify patients at risk for clinically relevant lung injury complications. PMID:23071521
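The random effects integration of per-experiment effect sizes mentioned in this abstract could look roughly like the sketch below. The abstract does not say which estimator was used; the DerSimonian-Laird method here is an assumption chosen because it is a common default, and the function name is hypothetical.

```python
import math


def random_effects_dl(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random effects model.

    effects:   per-experiment effect sizes for one gene
    variances: matching within-experiment sampling variances
    Returns (pooled effect, pooled SE, between-study variance tau^2).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")
    # Fixed-effect weights and pooled estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    # Method-of-moments estimate of between-study variance, truncated at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    pooled_se = math.sqrt(1.0 / sum_w_star)
    return pooled, pooled_se, tau2
```

When the experiments agree closely, tau^2 is estimated as zero and the result reduces to the fixed-effect estimate; heterogeneous experiments widen the pooled standard error.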
Röper, Andrea; Reichert, Walter; Mattern, Rainer
2007-01-01
In the field of forensic DNA typing, the analysis of Short Tandem Repeats (STRs) can fail in cases of degraded DNA. The typing of coding region Single Nucleotide Polymorphisms (SNPs) of the mitochondrial genome provides an approach to acquire additional information. In the examined case of aggravated theft, SNP typing excluded both suspects as the source of the analyzed hair left at the crime scene. This conclusion was not possible with STR typing alone. SNP typing of the trace on the torch light left at the crime scene increased the likelihood that suspect no. 2 was the origin of this trace, a finding already indicated by STR analysis. Suspect no. 1 was excluded as the origin of this trace by SNP typing, which was also indicated by STR analysis. A limiting factor for the analysis of SNPs is the maternal inheritance of mitochondrial DNA: individualisation is not possible. In conclusion, for traces that cause problems with conventional STR typing, the supplementary analysis of coding region SNPs from the mitochondrial genome is very reasonable and greatly contributes to the refinement of analysis methods in the field of forensic genetics.
Addressable droplet microarrays for single cell protein analysis.
Salehi-Reyhani, Ali; Burgin, Edward; Ces, Oscar; Willison, Keith R; Klug, David R
2014-11-07
Addressable droplet microarrays are potentially attractive as a way to achieve miniaturised, reduced volume, high sensitivity analyses without the need to fabricate microfluidic devices or small volume chambers. We report a practical method for producing oil-encapsulated addressable droplet microarrays which can be used for such analyses. To demonstrate their utility, we undertake a series of single cell analyses, to determine the variation in copy number of p53 proteins in cells of a human cancer cell line.
Popescu, F; Jaslow, C R; Kutteh, W H
2018-04-01
Will the addition of 24-chromosome microarray analysis on miscarriage tissue combined with the standard American Society for Reproductive Medicine (ASRM) evaluation for recurrent miscarriage explain most losses? Over 90% of patients with recurrent pregnancy loss (RPL) will have a probable or definitive cause identified when combining genetic testing on miscarriage tissue with the standard ASRM evaluation for recurrent miscarriage. RPL is estimated to occur in 2-4% of reproductive age couples. A probable cause can be identified in approximately 50% of patients after an ASRM recommended workup including an evaluation for parental chromosomal abnormalities, congenital and acquired uterine anomalies, endocrine imbalances and autoimmune factors including antiphospholipid syndrome. Single-center, prospective cohort study that included 100 patients seen in a private RPL clinic from 2014 to 2017. All 100 women had two or more pregnancy losses, a complete evaluation for RPL as defined by the ASRM, and miscarriage tissue evaluated by 24-chromosome microarray analysis after their second or subsequent miscarriage. Frequencies of abnormal results for evidence-based diagnostic tests considered definite or probable causes of RPL (karyotyping for parental chromosomal abnormalities, and 24-chromosome microarray evaluation for products of conception (POC); pelvic sonohysterography, hysterosalpingogram, or hysteroscopy for uterine anomalies; immunological tests for lupus anticoagulant and anticardiolipin antibodies; and blood tests for thyroid stimulating hormone (TSH), prolactin and hemoglobin A1c) were evaluated. We excluded cases where there was maternal cell contamination of the miscarriage tissue or if the ASRM evaluation was incomplete. 
A cost analysis for the evaluation of RPL was conducted to determine whether a proposed procedure of 24-chromosome microarray evaluation followed by an ASRM RPL workup (for those RPL patients who had a normal 24-chromosome microarray evaluation) was more cost-efficient than conducting ASRM RPL workups on RPL patients followed by 24-chromosome microarray analysis (for those RPL patients who had a normal RPL workup). A definite or probable cause of pregnancy loss was identified in the vast majority (95/100; 95%) of RPL patients when a 24-chromosome pair microarray evaluation of POC was combined with the standard ASRM RPL workup at the time of the second or subsequent loss. The ASRM RPL workup identified an abnormality and a probable explanation for pregnancy loss in only 45/100 (45%) of all patients. A definite abnormality was identified in 67/100 (67%) of patients when initial testing was performed using 24-chromosome microarray analyses on the miscarriage tissue. Only 5/100 (5%) patients, who had a euploid loss and a normal ASRM RPL workup, had a pregnancy loss without a probable or definitive cause identified. All other losses were explained by an abnormal 24-chromosome microarray analysis of the miscarriage tissue, an abnormal finding of the RPL workup, or a combination of both. Results from the cost analysis indicated that an initial approach of using a 24-chromosome microarray analysis on miscarriage tissue resulted in a 50% savings in cost to the health care system and to the patient. This is a single-center study on a small group of well-characterized women with RPL. There was incomplete follow-up on subsequent pregnancy outcomes after evaluation; however, this should not affect our principal results. The maternal age of patients varied from 26 to 45 years old. More aneuploid pregnancy losses would be expected in older women, particularly over the age of 35 years old.
Evaluation of POC using 24-chromosome microarray analysis adds significantly to the ASRM recommended evaluation of RPL. Genetic evaluation on miscarriage tissue obtained at the time of the second and subsequent pregnancy losses should be offered to all couples with two or more consecutive pregnancy losses. The combination of a genetic evaluation on miscarriage tissue with an evidence-based evaluation for RPL will identify a probable or definitive cause in over 90% of miscarriages. No funding was received for this study and there are no conflicts of interest to declare.
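The logic of the sequential cost comparison described in this abstract can be sketched as an expected cost per patient: everyone receives the first test, and only patients with a normal first result go on to the second. The abnormal fractions (67/100 for the microarray, 45/100 for the ASRM workup) come from the abstract; the unit costs below are placeholders in arbitrary units, not the study's figures, so the output should not be read as reproducing the reported 50% savings.

```python
def expected_cost_per_patient(first_cost, second_cost, frac_first_abnormal):
    """Expected cost when the second test is run only after a normal first test."""
    if not 0.0 <= frac_first_abnormal <= 1.0:
        raise ValueError("frac_first_abnormal must be between 0 and 1")
    return first_cost + (1.0 - frac_first_abnormal) * second_cost


# Placeholder unit costs (hypothetical, arbitrary units).
ARRAY_COST = 1.0
WORKUP_COST = 3.0

# Fractions of abnormal first-line results reported in the abstract.
FRAC_ARRAY_ABNORMAL = 0.67
FRAC_WORKUP_ABNORMAL = 0.45

array_first = expected_cost_per_patient(ARRAY_COST, WORKUP_COST, FRAC_ARRAY_ABNORMAL)
workup_first = expected_cost_per_patient(WORKUP_COST, ARRAY_COST, FRAC_WORKUP_ABNORMAL)
relative_saving = 1.0 - array_first / workup_first
```

Under this simple model the ordering matters most when the first-line test is both cheaper and more often abnormal, since it then spares more patients the second test.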
A Platform for Combined DNA and Protein Microarrays Based on Total Internal Reflection Fluorescence
Asanov, Alexander; Zepeda, Angélica; Vaca, Luis
2012-01-01
We have developed a novel microarray technology based on total internal reflection fluorescence (TIRF) in combination with DNA and protein bioassays immobilized at the TIRF surface. Unlike conventional microarrays that exhibit reduced signal-to-background ratio, require several stages of incubation, rinsing and stringency control, and measure only end-point results, our TIRF microarray technology provides several orders of magnitude better signal-to-background ratio, performs analysis rapidly in one step, and measures the entire course of association and dissociation kinetics between target DNA and protein molecules and the bioassays. In many practical cases detection of only DNA or protein markers alone does not provide the necessary accuracy for diagnosing a disease or detecting a pathogen. Here we describe TIRF microarrays that detect DNA and protein markers simultaneously, which reduces the probabilities of false responses. Supersensitive and multiplexed TIRF DNA and protein microarray technology may provide a platform for accurate diagnosis or enhanced research studies. Our TIRF microarray system can be mounted on upright or inverted microscopes or interfaced directly with CCD cameras equipped with a single objective, facilitating the development of portable devices. As proof-of-concept we applied TIRF microarrays for detecting molecular markers from Bacillus anthracis, the pathogen responsible for anthrax. PMID:22438738
Validation of MIMGO: a method to identify differentially expressed GO terms in a microarray dataset
2012-01-01
Background We previously proposed an algorithm for the identification of GO terms that commonly annotate genes whose expression is upregulated or downregulated in some microarray data compared with other microarray data. We call these “differentially expressed GO terms” and have named the algorithm “matrix-assisted identification method of differentially expressed GO terms” (MIMGO). MIMGO can also identify microarray data in which genes annotated with a differentially expressed GO term are upregulated or downregulated. However, MIMGO has not yet been validated on a real microarray dataset using all available GO terms. Findings We combined Gene Set Enrichment Analysis (GSEA) with MIMGO to identify differentially expressed GO terms in a yeast cell cycle microarray dataset. GSEA followed by MIMGO (GSEA + MIMGO) correctly identified (p < 0.05) microarray data in which genes annotated to differentially expressed GO terms are upregulated. We found that GSEA + MIMGO was slightly less effective than, or comparable to, GSEA (Pearson), a method that uses Pearson’s correlation as a metric, at detecting true differentially expressed GO terms. However, unlike other methods including GSEA (Pearson), GSEA + MIMGO can comprehensively identify the microarray data in which genes annotated with a differentially expressed GO term are upregulated or downregulated. Conclusions MIMGO is a reliable method to identify differentially expressed GO terms comprehensively. PMID:23232071
The use of open source bioinformatics tools to dissect transcriptomic data.
Nitsche, Benjamin M; Ram, Arthur F J; Meyer, Vera
2012-01-01
Microarrays are a valuable technology to study fungal physiology on a transcriptomic level. Various microarray platforms are available comprising both single and two channel arrays. Despite different technologies, preprocessing of microarray data generally includes quality control, background correction, normalization, and summarization of probe level data. Subsequently, depending on the experimental design, diverse statistical analyses can be performed, including the identification of differentially expressed genes and the construction of gene coexpression networks. We describe how Bioconductor, a collection of open source and open development packages for the statistical programming language R, can be used for dissecting microarray data. We provide fundamental details that facilitate the process of getting started with R and Bioconductor. Using two publicly available microarray datasets from Aspergillus niger, we give detailed protocols on how to identify differentially expressed genes and how to construct gene coexpression networks.
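The idea of a gene coexpression network mentioned in this abstract can be illustrated with a minimal sketch. The authors work in R with Bioconductor; the Python below is not their protocol. It assumes one simple construction: two genes are linked when the absolute Pearson correlation of their expression profiles reaches a cutoff. The function names, the 0.9 threshold, and the expression values are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(expression, threshold=0.9):
    """Return gene pairs whose |r| is at least the threshold (assumed cutoff)."""
    return [
        (g1, g2)
        for g1, g2 in combinations(sorted(expression), 2)
        if abs(pearson(expression[g1], expression[g2])) >= threshold
    ]


# Hypothetical normalized expression values (one list per gene, one value per array).
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.1],
    "g3": [4.0, 1.0, 3.0, 2.0],
}

edges = coexpression_edges(expression)
print(edges)
```

With real data, the cutoff would normally be chosen from the correlation distribution or a significance test rather than fixed in advance.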
NASA Astrophysics Data System (ADS)
Liu, Robin H.; Lodes, Mike; Fuji, H. Sho; Danley, David; McShea, Andrew
Microarray assays typically involve multistage sample processing and fluidic handling, which are generally labor-intensive and time-consuming. Automation of these processes would improve robustness, reduce run-to-run and operator-to-operator variation, and reduce costs. In this chapter, a fully integrated and self-contained microfluidic biochip device that has been developed to automate the fluidic handling steps for microarray-based gene expression or genotyping analysis is presented. The device consists of a semiconductor-based CustomArray® chip with 12,000 features and a microfluidic cartridge. The CustomArray was manufactured using a semiconductor-based in situ synthesis technology. The microfluidic cartridge consists of microfluidic pumps, mixers, valves, fluid channels, and reagent storage chambers. Microarray hybridization and subsequent fluidic handling and reactions (including a number of washing and labeling steps) were performed in this fully automated and miniature device before fluorescent image scanning of the microarray chip. Electrochemical micropumps were integrated in the cartridge to provide pumping of liquid solutions. A micromixing technique based on gas bubbling generated by electrochemical micropumps was developed. Low-cost check valves were implemented in the cartridge to prevent cross-talk of the stored reagents. Gene expression study of the human leukemia cell line (K562) and genotyping detection and sequencing of influenza A subtypes have been demonstrated using this integrated biochip platform. For gene expression assays, the microfluidic CustomArray device detected sample RNAs with a concentration as low as 0.375 pM. Detection was quantitative over more than three orders of magnitude. Experiments also showed that chip-to-chip variability was low, indicating that the integrated microfluidic devices eliminate manual fluidic handling steps that can be a significant source of variability in genomic analysis.
The genotyping results showed that the device identified influenza A hemagglutinin and neuraminidase subtypes and sequenced portions of both genes, demonstrating the potential of integrated microfluidic and microarray technology for multiple virus detection. The device provides a cost-effective solution to eliminate labor-intensive and time-consuming fluidic handling steps and allows microarray-based DNA analysis in a rapid and automated fashion.
Peterson, Leif E
2002-01-01
CLUSFAVOR (CLUSter and Factor Analysis with Varimax Orthogonal Rotation) 5.0 is a Windows-based computer program for hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles. CLUSFAVOR 5.0 standardizes input data; sorts data according to gene-specific coefficient of variation, standard deviation, average and total expression, and Shannon entropy; performs hierarchical cluster analysis using nearest-neighbor, unweighted pair-group method using arithmetic averages (UPGMA), or furthest-neighbor joining methods, and Euclidean, correlation, or jack-knife distances; and performs principal-component analysis. PMID:12184816
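The gene-specific sort statistics named in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not CLUSFAVOR's implementation: the abstract does not give its formulas, so the sketch assumes the coefficient of variation is the sample standard deviation divided by the mean, and that Shannon entropy is computed over each array's share of a gene's total expression. Gene names and values are hypothetical.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return stdev(values) / mean(values)


def shannon_entropy(values):
    """Entropy in bits, treating each array's share of the gene's total
    expression as a probability (assumed definition)."""
    total = sum(values)
    shares = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in shares)


# Hypothetical expression values: one list per gene, one value per array.
profiles = {
    "geneA": [10.0, 10.0, 10.0, 10.0],  # flat profile
    "geneB": [1.0, 1.0, 1.0, 37.0],     # expressed in essentially one array
    "geneC": [5.0, 10.0, 15.0, 10.0],   # moderate variation
}

# Most variable genes first.
by_cv = sorted(
    profiles, key=lambda g: coefficient_of_variation(profiles[g]), reverse=True
)
# Lowest entropy (most array-specific expression) first.
by_entropy = sorted(profiles, key=lambda g: shannon_entropy(profiles[g]))

print(by_cv)
print(by_entropy)
```

A flat profile has zero coefficient of variation and maximal entropy, so both orderings place it last; that is why such statistics are useful for filtering uninformative genes before clustering.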
NASA Technical Reports Server (NTRS)
Koizumi, Yoshikazu; Kelly, John J.; Nakagawa, Tatsunori; Urakawa, Hidetoshi; El-Fantroussi, Said; Al-Muzaini, Saleh; Fukui, Manabu; Urushigawa, Yoshikuni; Stahl, David A.
2002-01-01
A mesophilic toluene-degrading consortium (TDC) and an ethylbenzene-degrading consortium (EDC) were established under sulfate-reducing conditions. These consortia were first characterized by denaturing gradient gel electrophoresis (DGGE) fingerprinting of PCR-amplified 16S rRNA gene fragments, followed by sequencing. The sequences of the major bands (T-1 and E-2) belonging to TDC and EDC, respectively, were affiliated with the family Desulfobacteriaceae. Another major band from EDC (E-1) was related to an uncultured non-sulfate-reducing soil bacterium. Oligonucleotide probes specific for the 16S rRNAs of target organisms corresponding to T-1, E-1, and E-2 were designed, and hybridization conditions were optimized for two analytical formats, membrane and DNA microarray hybridization. Both formats were used to characterize the TDC and EDC, and the results of both were consistent with DGGE analysis. In order to assess the utility of the microarray format for analysis of environmental samples, oil-contaminated sediments from the coast of Kuwait were analyzed. The DNA microarray successfully detected bacterial nucleic acids from these samples, but probes targeting specific groups of sulfate-reducing bacteria did not give positive signals. The results of this study demonstrate the limitations and the potential utility of DNA microarrays for microbial community analysis.
Koizumi, Yoshikazu; Kelly, John J.; Nakagawa, Tatsunori; Urakawa, Hidetoshi; El-Fantroussi, Saïd; Al-Muzaini, Saleh; Fukui, Manabu; Urushigawa, Yoshikuni; Stahl, David A.
2002-01-01
A mesophilic toluene-degrading consortium (TDC) and an ethylbenzene-degrading consortium (EDC) were established under sulfate-reducing conditions. These consortia were first characterized by denaturing gradient gel electrophoresis (DGGE) fingerprinting of PCR-amplified 16S rRNA gene fragments, followed by sequencing. The sequences of the major bands (T-1 and E-2) belonging to TDC and EDC, respectively, were affiliated with the family Desulfobacteriaceae. Another major band from EDC (E-1) was related to an uncultured non-sulfate-reducing soil bacterium. Oligonucleotide probes specific for the 16S rRNAs of target organisms corresponding to T-1, E-1, and E-2 were designed, and hybridization conditions were optimized for two analytical formats, membrane and DNA microarray hybridization. Both formats were used to characterize the TDC and EDC, and the results of both were consistent with DGGE analysis. In order to assess the utility of the microarray format for analysis of environmental samples, oil-contaminated sediments from the coast of Kuwait were analyzed. The DNA microarray successfully detected bacterial nucleic acids from these samples, but probes targeting specific groups of sulfate-reducing bacteria did not give positive signals. The results of this study demonstrate the limitations and the potential utility of DNA microarrays for microbial community analysis. PMID:12088997
Ontology-based, Tissue MicroArray oriented, image centered tissue bank
Viti, Federica; Merelli, Ivan; Caprera, Andrea; Lazzari, Barbara; Stella, Alessandra; Milanesi, Luciano
2008-01-01
Background Tissue MicroArray technique is becoming increasingly important in pathology for the validation of experimental data from transcriptomic analysis. This approach produces many images which need to be properly managed, if possible with an infrastructure able to support tissue sharing between institutes. Moreover, the available frameworks oriented to Tissue MicroArray provide good storage for clinical patient, sample treatment and block construction information, but their utility is limited by the lack of data integration with biomolecular information. Results In this work we propose a web-oriented Tissue MicroArray system that supports researchers in managing bio-samples and, through the use of ontologies, enables tissue sharing aimed at the design of Tissue MicroArray experiments and the evaluation of results. Indeed, our system provides ontological description both for pre-analysis tissue images and for post-process analysis image results, which is crucial for information exchange. Moreover, working on well-defined terms it is then possible to query web resources for literature articles to integrate both pathology and bioinformatics data. Conclusions Using this system, users associate an ontology-based description with each image uploaded into the database and also integrate results with the ontological description of biosequences identified in every tissue. Moreover, it is possible to integrate the ontological description provided by the user with a fully compliant gene ontology definition, enabling statistical studies about correlation between the analyzed pathology and the most commonly related biological processes. PMID:18460177
Kim, Hyun-Kyoung; Park, Won Cheol; Lee, Kwang Man; Hwang, Hai-Li; Park, Seong-Yeol; Sorn, Sungbin; Chandra, Vishal; Kim, Kwang Gi; Yoon, Woong-Bae; Bae, Joon Seol; Shin, Hyoung Doo; Shin, Jong-Yeon; Seoh, Ju-Young; Kim, Jong-Il; Hong, Kyeong-Man
2014-01-01
The concept of the utilization of rearranged ends for development of personalized biomarkers has attracted much attention owing to its clinical applicability. Although targeted next-generation sequencing (NGS) for recurrent rearrangements has been successful in hematologic malignancies, its application to solid tumors is problematic due to the paucity of recurrent translocations. However, copy-number breakpoints (CNBs), which are abundant in solid tumors, can be utilized for identification of rearranged ends. As a proof of concept, we performed targeted next-generation sequencing at copy-number breakpoints (TNGS-CNB) in nine colon cancer cases including seven primary cancers and two cell lines, COLO205 and SW620. For deduction of CNBs, we developed a novel competitive single-nucleotide polymorphism (cSNP) microarray method entailing CNB-region refinement by competitor DNA. Using TNGS-CNB, 19 specific rearrangements out of 91 CNBs (20.9%) were identified, and two polymerase chain reaction (PCR)-amplifiable rearrangements were obtained in six cases (66.7%). Significantly, TNGS-CNB, with its high positive identification rate (82.6%; 19/23) of PCR-amplifiable rearrangements at candidate sites obtained just from filtering of aligned sequences, requires little effort for validation. Our results indicate that TNGS-CNB, with its utility for identification of rearrangements in solid tumors, can be successfully applied in the clinical laboratory for cancer-relapse and therapy-response monitoring.
Hanchard, Neil A; Umana, Luis A; D'Alessandro, Lisa; Azamian, Mahshid; Poopola, Mojisola; Morris, Shaine A; Fernbach, Susan; Lalani, Seema R; Towbin, Jeffrey A; Zender, Gloria A; Fitzgerald-Butt, Sara; Garg, Vidu; Bowman, Jessica; Zapata, Gladys; Hernandez, Patricia; Arrington, Cammon B; Furthner, Dieter; Prakash, Siddharth K; Bowles, Neil E; McBride, Kim L; Belmont, John W
2017-08-01
Congenital left-sided cardiac lesions (LSLs) are a significant contributor to the mortality and morbidity of congenital heart disease (CHD). Structural copy number variants (CNVs) have been implicated in LSL without extra-cardiac features; however, non-penetrance and variable expressivity have created uncertainty over the use of CNV analyses in such patients. High-density SNP microarray genotyping data were used to infer large, likely-pathogenic, autosomal CNVs in a cohort of 1,139 probands with LSL and their families. CNVs were molecularly confirmed and the medical records of individual carriers reviewed. The gene content of novel CNVs was then compared with public CNV data from CHD patients. Large CNVs (>1 Mb) were observed in 33 probands (∼3%). Six of these were de novo and 14 were not observed in the only available parent sample. Associated cardiac phenotypes spanned a broad spectrum without clear predilection. Candidate CNVs were largely non-recurrent, associated with heterozygous loss of copy number, and overlapped known CHD genomic regions. Novel CNV regions were enriched for cardiac development genes, including seven that have not been previously associated with human CHD. CNV analysis can be a clinically useful and molecularly informative tool in LSLs without obvious extra-cardiac defects, and may identify a clinically relevant genomic disorder in a small but important proportion of these individuals. © 2017 Wiley Periodicals, Inc.
Prader-Willi Syndrome due to an Unbalanced de novo Translocation t(15;19)(q12;p13.3).
Dang, Vy; Surampalli, Abhilasha; Manzardo, Ann M; Youn, Stephanie; Butler, Merlin G; Gold, June-Anne; Kimonis, Virginia E
2016-01-01
Prader-Willi syndrome (PWS) is a complex, multisystem genetic disorder characterized by endocrine, neurologic, and behavioral abnormalities. We report the first case of an unbalanced de novo reciprocal translocation of chromosomes 15 and 19, 45,XY,-15,der(19)t(15;19)(q12;p13.3), resulting in monosomy for the PWS critical chromosome region. Our patient had several typical features of PWS including infantile hypotonia, a poor suck and feeding difficulties, tantrums, skin picking, compulsions, small hands and feet, and food seeking. At the time of examination at 6 years of age, however, he did not have hypopigmentation, a micropenis, cryptorchidism, or obesity, which are common findings in PWS. He had seizures noted from 1 to 3 years of age and marked cognitive delay. High-resolution SNP microarray analysis identified an atypical PWS type I deletion in chromosome 15 involving the proximal breakpoint BP1. The deletion extended beyond the GABRB3 gene but was proximal to the usual distal breakpoint (BP3) within the 15q11q13 region, and GABRA5, GABRG3, and OCA2 genes were intact. No deletion of band 19p13.3 was detected; therefore, the patient was not at an increased risk of tumors from the Peutz-Jeghers syndrome associated with a deletion of the STK11 gene. © 2016 S. Karger AG, Basel.
The pitfalls of platform comparison: DNA copy number array technologies assessed
2009-01-01
Background The accurate and high resolution mapping of DNA copy number aberrations has become an important tool by which to gain insight into the mechanisms of tumourigenesis. There are various commercially available platforms for such studies, but there remains no general consensus as to the optimal platform. There have been several previous platform comparison studies, but they have either described older technologies, used less-complex samples, or have not addressed the issue of the inherent biases in such comparisons. Here we describe a systematic comparison of data from four leading microarray technologies (the Affymetrix Genome-wide SNP 5.0 array, Agilent High-Density CGH Human 244A array, Illumina HumanCNV370-Duo DNA Analysis BeadChip, and the Nimblegen 385 K oligonucleotide array). We compare samples derived from primary breast tumours and their corresponding matched normals, well-established cancer cell lines, and HapMap individuals. By careful consideration and avoidance of potential sources of bias, we aim to provide a fair assessment of platform performance. Results By performing a theoretical assessment of the reproducibility, noise, and sensitivity of each platform, notable differences were revealed. Nimblegen exhibited between-replicate array variances an order of magnitude greater than the other three platforms, with Agilent slightly outperforming the others, and a comparison of self-self hybridizations revealed similar patterns. An assessment of the single probe power revealed that Agilent exhibits the highest sensitivity. Additionally, we performed an in-depth visual assessment of the ability of each platform to detect aberrations of varying sizes. As expected, all platforms were able to identify large aberrations in a robust manner. However, some focal amplifications and deletions were only detected in a subset of the platforms. 
Conclusion Although there are substantial differences in the design, density, and number of replicate probes, the comparison indicates a generally high level of concordance between platforms, despite differences in the reproducibility, noise, and sensitivity. In general, Agilent tended to be the best aCGH platform and Affymetrix, the superior SNP-CGH platform, but for specific decisions the results described herein provide a guide for platform selection and study design, and the dataset a resource for more tailored comparisons. PMID:19995423
Honsa, Erin; Fricke, Thomas; Stephens, Alex J; Ko, Danny; Kong, Fanrong; Gilbert, Gwendolyn L; Huygens, Flavia; Giffard, Philip M
2008-08-19
Streptococcus agalactiae (Group B Streptococcus (GBS)) is an important human pathogen, particularly of newborns. Emerging evidence for a relationship between genotype and virulence has accentuated the need for efficient and well-defined typing methods. The objective of this study was to develop a single nucleotide polymorphism (SNP) based method for assigning GBS isolates to multilocus sequence typing (MLST)-defined clonal complexes. It was found that a SNP set derived from the MLST database on the basis of maximization of Simpson's Index of Diversity provided poor resolution and did not define groups concordant with the population structure as defined by eBURST analysis of the MLST database. This was interpreted as being a consequence of low diversity and high frequency horizontal gene transfer. Accordingly, a different approach to SNP identification was developed. This entailed use of the "Not-N" bioinformatic algorithm that identifies SNPs diagnostic for groups of known sequence variants, together with an empirical process of SNP testing. This yielded a four member SNP set that divides GBS into 10 groups that are concordant with the population structure. A fifth SNP was identified that increased the sensitivity for the clinically significant clonal complex 17 to 100%. Kinetic PCR methods for the interrogation of these SNPs were developed, and used to genotype 116 well characterized isolates. A five SNP method for dividing GBS into biologically valid groups has been developed. These SNPs are ideal for high throughput surveillance activities, and for combination with more rapidly evolving loci when additional resolution is required.
Honsa, Erin; Fricke, Thomas; Stephens, Alex J; Ko, Danny; Kong, Fanrong; Gilbert, Gwendolyn L; Huygens, Flavia; Giffard, Philip M
2008-01-01
Background Streptococcus agalactiae (Group B Streptococcus (GBS)) is an important human pathogen, particularly of newborns. Emerging evidence for a relationship between genotype and virulence has accentuated the need for efficient and well-defined typing methods. The objective of this study was to develop a single nucleotide polymorphism (SNP) based method for assigning GBS isolates to multilocus sequence typing (MLST)-defined clonal complexes. Results It was found that a SNP set derived from the MLST database on the basis of maximisation of Simpson's Index of Diversity provided poor resolution and did not define groups concordant with the population structure as defined by eBURST analysis of the MLST database. This was interpreted as being a consequence of low diversity and high frequency horizontal gene transfer. Accordingly, a different approach to SNP identification was developed. This entailed use of the "Not-N" bioinformatic algorithm that identifies SNPs diagnostic for groups of known sequence variants, together with an empirical process of SNP testing. This yielded a four member SNP set that divides GBS into 10 groups that are concordant with the population structure. A fifth SNP was identified that increased the sensitivity for the clinically significant clonal complex 17 to 100%. Kinetic PCR methods for the interrogation of these SNPs were developed, and used to genotype 116 well characterized isolates. Conclusion A five SNP method for dividing GBS into biologically valid groups has been developed. These SNPs are ideal for high throughput surveillance activities, and for combination with more rapidly evolving loci when additional resolution is required. PMID:18710585
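The criterion of scoring a SNP set by Simpson's Index of Diversity can be made concrete with a short sketch. It assumes the form commonly used for typing schemes, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where N is the number of sequence variants and n_j the number falling in group j; the abstract does not state which form the authors used. The aligned sequences, SNP positions, and function names below are hypothetical.

```python
from collections import Counter


def simpsons_index_of_diversity(group_labels):
    """Probability that two variants drawn without replacement fall in
    different groups: 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n = len(group_labels)
    if n < 2:
        raise ValueError("at least two items are needed")
    counts = Counter(group_labels).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


def groups_from_snps(sequences, positions):
    """Assign each sequence to a group defined by its bases at the chosen SNPs."""
    return [tuple(seq[p] for p in positions) for seq in sequences]


# Hypothetical aligned sequence variants, for illustration only.
sequences = ["ACGT", "ACGA", "TCGA", "TCCA"]

# One SNP (position 0) splits the variants into two groups of two.
d_one = simpsons_index_of_diversity(groups_from_snps(sequences, [0]))
# Adding position 3 splits them into groups of sizes 1, 1 and 2.
d_two = simpsons_index_of_diversity(groups_from_snps(sequences, [0, 3]))

print(round(d_one, 4), round(d_two, 4))
```

Maximizing D rewards any split that separates more variants, whether or not the resulting groups follow the population structure, which is consistent with the abstract's report that this criterion alone gave groups discordant with eBURST.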
Fine mapping of copy number variations on two cattle genome assemblies using high density SNP array
USDA-ARS's Scientific Manuscript database
Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases...
Machiela, Mitchell J; Zhou, Weiyin; Karlins, Eric; Sampson, Joshua N; Freedman, Neal D; Yang, Qi; Hicks, Belynda; Dagnall, Casey; Hautman, Christopher; Jacobs, Kevin B; Abnet, Christian C; Aldrich, Melinda C; Amos, Christopher; Amundadottir, Laufey T; Arslan, Alan A; Beane-Freeman, Laura E; Berndt, Sonja I; Black, Amanda; Blot, William J; Bock, Cathryn H; Bracci, Paige M; Brinton, Louise A; Bueno-de-Mesquita, H Bas; Burdett, Laurie; Buring, Julie E; Butler, Mary A; Canzian, Federico; Carreón, Tania; Chaffee, Kari G; Chang, I-Shou; Chatterjee, Nilanjan; Chen, Chu; Chen, Constance; Chen, Kexin; Chung, Charles C; Cook, Linda S; Crous Bou, Marta; Cullen, Michael; Davis, Faith G; De Vivo, Immaculata; Ding, Ti; Doherty, Jennifer; Duell, Eric J; Epstein, Caroline G; Fan, Jin-Hu; Figueroa, Jonine D; Fraumeni, Joseph F; Friedenreich, Christine M; Fuchs, Charles S; Gallinger, Steven; Gao, Yu-Tang; Gapstur, Susan M; Garcia-Closas, Montserrat; Gaudet, Mia M; Gaziano, J Michael; Giles, Graham G; Gillanders, Elizabeth M; Giovannucci, Edward L; Goldin, Lynn; Goldstein, Alisa M; Haiman, Christopher A; Hallmans, Goran; Hankinson, Susan E; Harris, Curtis C; Henriksson, Roger; Holly, Elizabeth A; Hong, Yun-Chul; Hoover, Robert N; Hsiung, Chao A; Hu, Nan; Hu, Wei; Hunter, David J; Hutchinson, Amy; Jenab, Mazda; Johansen, Christoffer; Khaw, Kay-Tee; Kim, Hee Nam; Kim, Yeul Hong; Kim, Young Tae; Klein, Alison P; Klein, Robert; Koh, Woon-Puay; Kolonel, Laurence N; Kooperberg, Charles; Kraft, Peter; Krogh, Vittorio; Kurtz, Robert C; LaCroix, Andrea; Lan, Qing; Landi, Maria Teresa; Marchand, Loic Le; Li, Donghui; Liang, Xiaolin; Liao, Linda M; Lin, Dongxin; Liu, Jianjun; Lissowska, Jolanta; Lu, Lingeng; Magliocco, Anthony M; Malats, Nuria; Matsuo, Keitaro; McNeill, Lorna H; McWilliams, Robert R; Melin, Beatrice S; Mirabello, Lisa; Moore, Lee; Olson, Sara H; Orlow, Irene; Park, Jae Yong; Patiño-Garcia, Ana; Peplonska, Beata; Peters, Ulrike; Petersen, Gloria M; Pooler, Loreall; Prescott, 
Jennifer; Prokunina-Olsson, Ludmila; Purdue, Mark P; Qiao, You-Lin; Rajaraman, Preetha; Real, Francisco X; Riboli, Elio; Risch, Harvey A; Rodriguez-Santiago, Benjamin; Ruder, Avima M; Savage, Sharon A; Schumacher, Fredrick; Schwartz, Ann G; Schwartz, Kendra L; Seow, Adeline; Wendy Setiawan, Veronica; Severi, Gianluca; Shen, Hongbing; Sheng, Xin; Shin, Min-Ho; Shu, Xiao-Ou; Silverman, Debra T; Spitz, Margaret R; Stevens, Victoria L; Stolzenberg-Solomon, Rachael; Stram, Daniel; Tang, Ze-Zhong; Taylor, Philip R; Teras, Lauren R; Tobias, Geoffrey S; Van Den Berg, David; Visvanathan, Kala; Wacholder, Sholom; Wang, Jiu-Cun; Wang, Zhaoming; Wentzensen, Nicolas; Wheeler, William; White, Emily; Wiencke, John K; Wolpin, Brian M; Wong, Maria Pik; Wu, Chen; Wu, Tangchun; Wu, Xifeng; Wu, Yi-Long; Wunder, Jay S; Xia, Lucy; Yang, Hannah P; Yang, Pan-Chyr; Yu, Kai; Zanetti, Krista A; Zeleniuch-Jacquotte, Anne; Zheng, Wei; Zhou, Baosen; Ziegler, Regina G; Perez-Jurado, Luis A; Caporaso, Neil E; Rothman, Nathaniel; Tucker, Margaret; Dean, Michael C; Yeager, Meredith; Chanock, Stephen J
2016-06-13
To investigate large structural clonal mosaicism of chromosome X, we analysed the SNP microarray intensity data of 38,303 women from cancer genome-wide association studies (20,878 cases and 17,425 controls) and detected 124 mosaic X events >2 Mb in 97 (0.25%) women. Here we show rates for X-chromosome mosaicism are four times higher than mean autosomal rates; X mosaic events more often include the entire chromosome and participants with X events more likely harbour autosomal mosaic events. X mosaicism frequency increases with age (0.11% in 50-year olds; 0.45% in 75-year olds), as reported for Y and autosomes. Methylation array analyses of 33 women with X mosaicism indicate events preferentially involve the inactive X chromosome. Our results provide further evidence that the sex chromosomes undergo mosaic events more frequently than autosomes, which could have implications for understanding the underlying mechanisms of mosaic events and their possible contribution to risk for chronic diseases.
CGDSNPdb: a database resource for error-checked and imputed mouse SNPs.
Hutchins, Lucie N; Ding, Yueming; Szatkiewicz, Jin P; Von Smith, Randy; Yang, Hyuna; de Villena, Fernando Pardo-Manuel; Churchill, Gary A; Graber, Joel H
2010-07-06
The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the 'imputed genotype resource' in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600,000 SNPs and over 900,000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login. Database URL: http://cgd.jax.org/cgdsnpdb/
Harris, R. Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P.; Hong, Chibo; Downey, Sara L.; Johnson, Brett E.; Fouse, Shaun D.; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J.; Gu, Junchen; Echipare, Lorigail; O’Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B.; Bernstein, Bradley E.; Hawkins, R. David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q.; Haussler, David; Ecker, Joseph; Li, Wei; Farnham, Peggy J.; Waterland, Robert A.; Meissner, Alexander; Marra, Marco A.; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F.
2010-01-01
Sequencing-based DNA methylation profiling methods are comprehensive and, as accuracy and affordability improve, will increasingly supplant microarrays for genome-scale analyses. Here, four sequencing-based methodologies were applied to biological replicates of human embryonic stem cells to compare their CpG coverage genome-wide and in transposons, resolution, cost, concordance and its relationship with CpG density and genomic context. The two bisulfite methods reached concordance of 82% for CpG methylation levels and 99% for non-CpG cytosine methylation levels. Using binary methylation calls, two enrichment methods were 99% concordant, while regions assessed by all four methods were 97% concordant. To achieve comprehensive methylome coverage while reducing cost, an approach integrating two complementary methods was examined. The integrative methylome profile along with histone methylation, RNA, and SNP profiles derived from the sequence reads allowed genome-wide assessment of allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635
Diagnosis of Van den Ende-Gupta syndrome: Approach to the Marden-Walker-like spectrum of disorders.
Niederhoffer, Karen Y; Fahiminiya, Somayyeh; Eydoux, Patrice; Mawson, John; Nishimura, Gen; Jerome-Majewska, Loydie A; Patel, Millan S
2016-09-01
Marden-Walker syndrome is challenging to diagnose, as there is significant overlap with other multi-system congenital contracture syndromes including Beals congenital contractural arachnodactyly, D4ST1-Deficient Ehlers-Danlos syndrome (adducted thumb-clubfoot syndrome), Schwartz-Jampel syndrome, Freeman-Sheldon syndrome, Cerebro-oculo-facio-skeletal syndrome, and Van den Ende-Gupta syndrome. We discuss this differential diagnosis in the context of a boy from a consanguineous union with Van den Ende-Gupta syndrome, a diagnosis initially confused by the atypical presence of intellectual disability. SNP microarray and whole exome sequencing identified a homozygous frameshift mutation (p.L870V) in SCARF2 and predicted damaging mutations in several genes, most notably DGCR2 (p.P75L) and NCAM2 (p.S147G), both possible candidates for this child's intellectual disability. We review distinguishing features for each Marden-Walker-like syndrome and propose a clinical algorithm for diagnosis among this spectrum of disorders. © 2016 Wiley Periodicals, Inc.
Functional Characterization of Schizophrenia-Associated Variation in CACNA1C
Eckart, Nicole; Song, Qifeng; Yang, Rebecca; Wang, Ruihua; Zhu, Heng; McCallion, Andrew S.; Avramopoulos, Dimitrios
2016-01-01
Calcium channel subunits, including CACNA1C, have been associated with multiple psychiatric disorders. Specifically, genome wide association studies (GWAS) have repeatedly identified the single nucleotide polymorphism (SNP) rs1006737 in intron 3 of CACNA1C to be strongly associated with schizophrenia and bipolar disorder. Here, we show that rs1006737 marks a quantitative trait locus for CACNA1C transcript levels. We test 16 SNPs in high linkage disequilibrium with rs1006737 and find one, rs4765905, consistently showing allele-dependent regulatory function in reporter assays. We find allele-specific protein binding for 13 SNPs including rs4765905. Using protein microarrays, we identify several proteins binding ≥3 SNPs, but not control sequences, suggesting possible functional interactions and combinatorial haplotype effects. Finally, using circular chromatin conformation capture, we show interaction of the disease-associated region including the 16 SNPs with the CACNA1C promoter and other potential regulatory regions. Our results elucidate the pathogenic relevance of one of the best-supported risk loci for schizophrenia and bipolar disorder. PMID:27276213
Female chromosome X mosaicism is age-related and preferentially affects the inactivated X chromosome
Machiela, Mitchell J.; Zhou, Weiyin; Karlins, Eric; Sampson, Joshua N.; Freedman, Neal D.; Yang, Qi; Hicks, Belynda; Dagnall, Casey; Hautman, Christopher; Jacobs, Kevin B.; Abnet, Christian C.; Aldrich, Melinda C.; Amos, Christopher; Amundadottir, Laufey T.; Arslan, Alan A.; Beane-Freeman, Laura E.; Berndt, Sonja I.; Black, Amanda; Blot, William J.; Bock, Cathryn H.; Bracci, Paige M.; Brinton, Louise A.; Bueno-de-Mesquita, H Bas; Burdett, Laurie; Buring, Julie E.; Butler, Mary A.; Canzian, Federico; Carreón, Tania; Chaffee, Kari G.; Chang, I-Shou; Chatterjee, Nilanjan; Chen, Chu; Chen, Constance; Chen, Kexin; Chung, Charles C.; Cook, Linda S.; Crous Bou, Marta; Cullen, Michael; Davis, Faith G.; De Vivo, Immaculata; Ding, Ti; Doherty, Jennifer; Duell, Eric J.; Epstein, Caroline G.; Fan, Jin-Hu; Figueroa, Jonine D.; Fraumeni, Joseph F.; Friedenreich, Christine M.; Fuchs, Charles S.; Gallinger, Steven; Gao, Yu-Tang; Gapstur, Susan M.; Garcia-Closas, Montserrat; Gaudet, Mia M.; Gaziano, J. Michael; Giles, Graham G.; Gillanders, Elizabeth M.; Giovannucci, Edward L.; Goldin, Lynn; Goldstein, Alisa M.; Haiman, Christopher A.; Hallmans, Goran; Hankinson, Susan E.; Harris, Curtis C.; Henriksson, Roger; Holly, Elizabeth A.; Hong, Yun-Chul; Hoover, Robert N.; Hsiung, Chao A.; Hu, Nan; Hu, Wei; Hunter, David J.; Hutchinson, Amy; Jenab, Mazda; Johansen, Christoffer; Khaw, Kay-Tee; Kim, Hee Nam; Kim, Yeul Hong; Kim, Young Tae; Klein, Alison P.; Klein, Robert; Koh, Woon-Puay; Kolonel, Laurence N.; Kooperberg, Charles; Kraft, Peter; Krogh, Vittorio; Kurtz, Robert C.; LaCroix, Andrea; Lan, Qing; Landi, Maria Teresa; Marchand, Loic Le; Li, Donghui; Liang, Xiaolin; Liao, Linda M.; Lin, Dongxin; Liu, Jianjun; Lissowska, Jolanta; Lu, Lingeng; Magliocco, Anthony M.; Malats, Nuria; Matsuo, Keitaro; McNeill, Lorna H.; McWilliams, Robert R.; Melin, Beatrice S.; Mirabello, Lisa; Moore, Lee; Olson, Sara H.; Orlow, Irene; Park, Jae Yong; Patiño-Garcia, Ana; Peplonska, Beata; Peters, Ulrike; 
Petersen, Gloria M.; Pooler, Loreall; Prescott, Jennifer; Prokunina-Olsson, Ludmila; Purdue, Mark P.; Qiao, You-Lin; Rajaraman, Preetha; Real, Francisco X.; Riboli, Elio; Risch, Harvey A.; Rodriguez-Santiago, Benjamin; Ruder, Avima M.; Savage, Sharon A.; Schumacher, Fredrick; Schwartz, Ann G.; Schwartz, Kendra L.; Seow, Adeline; Wendy Setiawan, Veronica; Severi, Gianluca; Shen, Hongbing; Sheng, Xin; Shin, Min-Ho; Shu, Xiao-Ou; Silverman, Debra T.; Spitz, Margaret R.; Stevens, Victoria L.; Stolzenberg-Solomon, Rachael; Stram, Daniel; Tang, Ze-Zhong; Taylor, Philip R.; Teras, Lauren R.; Tobias, Geoffrey S.; Van Den Berg, David; Visvanathan, Kala; Wacholder, Sholom; Wang, Jiu-Cun; Wang, Zhaoming; Wentzensen, Nicolas; Wheeler, William; White, Emily; Wiencke, John K.; Wolpin, Brian M.; Wong, Maria Pik; Wu, Chen; Wu, Tangchun; Wu, Xifeng; Wu, Yi-Long; Wunder, Jay S.; Xia, Lucy; Yang, Hannah P.; Yang, Pan-Chyr; Yu, Kai; Zanetti, Krista A.; Zeleniuch-Jacquotte, Anne; Zheng, Wei; Zhou, Baosen; Ziegler, Regina G.; Perez-Jurado, Luis A.; Caporaso, Neil E.; Rothman, Nathaniel; Tucker, Margaret; Dean, Michael C.; Yeager, Meredith; Chanock, Stephen J.
2016-01-01
To investigate large structural clonal mosaicism of chromosome X, we analysed the SNP microarray intensity data of 38,303 women from cancer genome-wide association studies (20,878 cases and 17,425 controls) and detected 124 mosaic X events >2 Mb in 97 (0.25%) women. Here we show rates for X-chromosome mosaicism are four times higher than mean autosomal rates; X mosaic events more often include the entire chromosome and participants with X events more likely harbour autosomal mosaic events. X mosaicism frequency increases with age (0.11% in 50-year olds; 0.45% in 75-year olds), as reported for Y and autosomes. Methylation array analyses of 33 women with X mosaicism indicate events preferentially involve the inactive X chromosome. Our results provide further evidence that the sex chromosomes undergo mosaic events more frequently than autosomes, which could have implications for understanding the underlying mechanisms of mosaic events and their possible contribution to risk for chronic diseases. PMID:27291797
Canine sterile nodular panniculitis: a retrospective study of 39 dogs.
Contreary, Caitlin L; Outerbridge, Catherine A; Affolter, Verena K; Kass, Philip H; White, Stephen D
2015-12-01
Canine sterile nodular panniculitis (SNP) is an inflammatory disease of the panniculus that is typically managed with immunomodulatory or immunosuppressive treatments. It has been reported to be a cutaneous marker of an underlying systemic disease. The objectives were to assess the presence or absence of concurrent systemic diseases associated with canine SNP and to document breed predispositions. Thirty-nine dogs presented to a veterinary teaching hospital from 1990 to 2012 met the inclusion criteria. Inclusion in this retrospective study required a diagnosis of SNP via histopathological analysis and negative special stains for infectious organisms. Breed distributions of affected dogs were compared to all other dogs examined at this hospital during the study period. Correlations between the histological pattern of panniculitis and the histological presence of dermatitis, clinical presentation of lesions, dog breed and therapeutic outcomes were assessed. Australian shepherd dogs, Brittany spaniels, Dalmatians, Pomeranians and Chihuahuas were significantly over-represented, but correlations between inflammatory patterns of panniculitis and other histological and clinical factors were not identified. Based on the information available in medical records, 32 dogs (82.1%) had no concurrent systemic diseases identified. Four dogs had concurrent polyarthritis, which may be related to SNP through unknown mechanisms. This study identified several novel breed predilections for SNP; it failed to find any clear correlations with associated systemic diseases other than polyarthritis. The histological inflammatory pattern of SNP does not predict therapeutic outcome. © 2015 ESVD and ACVD.
Tong, B; Li, G P; Sasaki, S; Muramatsu, Y; Ohta, T; Kose, H; Yamada, T
2015-04-01
Growth performance, as well as marbling, is the main breeding objective in Japanese Black (JB) cattle, the major beef breed in Japan. The septin 7 (CDC10) gene, involved in cellular proliferation, is located within a genomic region of a quantitative trait locus for growth-related traits. In this study, we first showed that the expression levels of the CDC10 gene in the skeletal muscle were higher in JB steers with extremely high growth performance than in JB steers with extremely low growth, using real-time PCR. Further, a single nucleotide polymorphism (SNP), NC_007302.5:g.63264949G>C, was detected in the promoter region of the CDC10 gene and genotyped in three Japanese cattle breeds (known as 'Wagyu' in Japan) and the Brown Swiss dairy cattle breed. All four cattle populations showed a moderate genetic diversity at the SNP of the CDC10 gene. An association analysis indicated that the SNP was associated with growth-related traits in JB cattle. These findings suggest possible effects of the expression levels in the skeletal muscle and the SNP of the CDC10 gene on growth-related traits in JB cattle. The CDC10 SNP may be useful for effective marker-assisted selection to increase beef productivity in JB beef cattle. © 2015 Stichting International Foundation for Animal Genetics.
Xu, C; Yang, X; Wang, Y; Ding, N; Han, R; Sun, Y; Wang, Y
2017-07-01
Frequencies of two glucose transporter 1 (GLUT1) single-nucleotide polymorphisms (SNPs) (XbaI G>T and HaeIII T>C) were studied in patients with urothelial cell carcinoma of the bladder (UCC) and 204 normal persons. The expression of p53, Ki67 and GLUT1 was assayed by immunohistochemistry. The frequency of the TT genotype and T allele of the XbaI G>T SNP was decreased in the patients with UCC. The frequency of the CC genotype and C allele of the HaeIII T>C SNP was decreased in the patients with UCC. The GLUT1 XbaI genotype GG was more frequent in patients with higher tumor stage and higher tumor grade. In the XbaI G>T SNP, the GG genotype was significantly related to a higher Remmele immunoreactive score (IRS) of Ki67 and a higher IRS of GLUT1. In conclusion, the TT genotype of the XbaI G>T SNP and the CC genotype of the HaeIII T>C SNP may have a protective effect in the carcinogenesis process of UCC. In the XbaI G>T SNP, the GG genotype was positively related to tumor proliferation, glucose metabolism, tumor grade and stage. Therefore, the variant might become a possible proliferation-related prognostic factor for UCC.
Castro-Martínez, Anna Gabriela; Sánchez-Corona, José; Vázquez-Vargas, Adriana Patricia; García-Zapién, Alejandra Guadalupe; López-Quintero, Andres; Villalpando-Velazco, Héctor Javier; Flores-Martínez, Silvia Esperanza
2018-02-28
Gestational diabetes mellitus (GDM) is a metabolically complex disease with major genetic determinants. GDM has been associated with insulin resistance and dysfunction of pancreatic beta cells, so the GDM candidate genes are those that encode proteins modulating the function and secretion of insulin, such as that for calpain 10 (CAPN10). This study aimed to assess whether single nucleotide polymorphism (SNP)-43, SNP-44, SNP-63, and the indel-19 variant, and specific haplotypes of the CAPN10 gene were associated with gestational diabetes mellitus. We studied 116 patients with gestational diabetes mellitus and 83 women with normal glucose tolerance. Measurements of anthropometric and biochemical parameters were performed. SNP-43, SNP-44, and SNP-63 were identified by polymerase chain reaction (PCR)-restriction fragment length polymorphisms, while the indel-19 variant was detected by TaqMan qPCR assays. The allele, genotype, and haplotype frequencies of the four variants did not differ significantly between women with gestational diabetes mellitus and controls. However, in women with gestational diabetes mellitus, glucose levels were significantly higher in carriers of the 3R/3R genotype than in carriers of the 3R/2R genotype of the indel-19 variant (p = 0.006). In conclusion, the 3R/3R genotype of the indel-19 variant of the CAPN10 gene was associated with increased glucose levels in these Mexican women with gestational diabetes mellitus.
Fontanesi, L; Galimberti, G; Calò, D G; Fronza, R; Martelli, P L; Scotti, E; Colombo, M; Schiavo, G; Casadio, R; Buttazzoni, L; Russo, V
2012-08-01
Combining different approaches (resequencing of portions of 54 obesity candidate genes, literature mining for pig markers associated with fat deposition or related traits in 77 genes, and in silico mining of porcine expressed sequence tags and other sequences available in databases), we identified and analyzed 736 SNP within candidate genes to identify markers associated with back fat thickness (BFT) in Italian Large White sows. Animals were chosen using a selective genotyping approach according to their EBV for BFT (276 with most negative and 279 with most positive EBV) within a population of ≈ 12,000 pigs. Association analysis between the SNP and BFT has been carried out using the MAX test proposed for case-control studies. The designed assays were successful for 656 SNP: 370 were excluded (low call rate or minor allele frequency <5%), whereas the remaining 286 in 212 genes were taken for subsequent analyses, among which 64 showed a P(nominal) value <0.1. To deal with the multiple testing problem in a candidate gene approach, we applied the proportion of false positives (PFP) method. Thirty-eight SNP were significant (P(PFP) < 0.20). The most significant SNP was the IGF2 intron3-g.3072G>A polymorphism (P(nominal) < 1.0E-50). The second most significant SNP was the MC4R c.1426A>G polymorphism (P(nominal) = 8.0E-05). The third top SNP (P(nominal) = 6.2E-04) was the intronic TBC1D1 g.219G>A polymorphic site, in agreement with our previous results obtained in an independent study. The list of significant markers also included SNP in additional genes (ABHD16A, ABHD5, ACP2, ALMS1, APOA2, ATP1A2, CALR, COL14A1, CTSF, DARS, DECR1, ENPP1, ESR1, GH1, GHRL, GNMT, IKBKB, JAK3, MTTP, NFKBIA, NT5E, PLAT, PPARG, PPP2R5D, PRLR, RRAGD, RFC2, SDHD, SERPINF1, UBE2H, VCAM1, and WAT). Functional relationships between genes were obtained using the Ingenuity Pathway Analysis (IPA) Knowledge Base. 
The top scoring pathway included 19 genes with a P(nominal) < 0.1, 2 of which (IKBKB and NFKBIA) are involved in the hypothalamic IKKβ/NFκB program that could represent a key axis to affect fat deposition traits in pigs. These results represent a starting point to plan marker-assisted selection in Italian Large White nuclei for BFT. Because of similarities between humans and pigs, this study might also provide useful clues to investigate genetic factors affecting human obesity.
Richard, Arianne C; Lyons, Paul A; Peters, James E; Biasci, Daniele; Flint, Shaun M; Lee, James C; McKinney, Eoin F; Siegel, Richard M; Smith, Kenneth G C
2014-08-04
Although numerous investigations have compared gene expression microarray platforms, preprocessing methods and batch correction algorithms using constructed spike-in or dilution datasets, there remains a paucity of studies examining the properties of microarray data using diverse biological samples. Most microarray experiments seek to identify subtle differences between samples with variable background noise, a scenario poorly represented by constructed datasets. Thus, microarray users lack important information regarding the complexities introduced in real-world experimental settings. The recent development of a multiplexed, digital technology for nucleic acid measurement enables counting of individual RNA molecules without amplification and, for the first time, permits such a study. Using a set of human leukocyte subset RNA samples, we compared previously acquired microarray expression values with RNA molecule counts determined by the nCounter Analysis System (NanoString Technologies) in selected genes. We found that gene measurements across samples correlated well between the two platforms, particularly for high-variance genes, while genes deemed unexpressed by the nCounter generally had both low expression and low variance on the microarray. Confirming previous findings from spike-in and dilution datasets, this "gold-standard" comparison demonstrated signal compression that varied dramatically by expression level and, to a lesser extent, by dataset. Most importantly, examination of three different cell types revealed that noise levels differed across tissues. Microarray measurements generally correlate with relative RNA molecule counts within optimal ranges but suffer from expression-dependent accuracy bias and precision that varies across datasets. We urge microarray users to consider expression-level effects in signal interpretation and to evaluate noise properties in each dataset independently.
Weniger, Markus; Engelmann, Julia C; Schultz, Jörg
2007-01-01
Background Regulation of gene expression is relevant to many areas of biology and medicine, in the study of treatments, diseases, and developmental stages. Microarrays can be used to measure the expression level of thousands of mRNAs at the same time, allowing insight into or comparison of different cellular conditions. The data derived from microarray experiments are high-dimensional and often noisy, and interpretation of the results can be intricate. Although programs for the statistical analysis of microarray data exist, most of them lack an integration of analysis results and biological interpretation. Results We have developed GEPAT, Genome Expression Pathway Analysis Tool, offering an analysis of gene expression data under genomic, proteomic and metabolic context. We provide an integration of statistical methods for data import and data analysis together with a biological interpretation for subsets of probes or single probes on the chip. GEPAT imports various types of oligonucleotide and cDNA array data formats. Different normalization methods can be applied to the data; afterwards, data annotation is performed. After import, GEPAT offers various statistical data analysis methods, such as hierarchical, k-means and PCA clustering, a linear model based t-test or chromosomal profile comparison. The results of the analysis can be interpreted by enrichment of biological terms, pathway analysis or interaction networks. Different biological databases are included to provide various information for each probe on the chip. GEPAT does not impose a linear workflow, but allows the usage of any subset of probes and samples as a start for a new data analysis. GEPAT relies on established data analysis packages, offers a modular approach for an easy extension, and can be run on a computer grid to allow a large number of users. It is freely available under the LGPL open source license for academic and commercial users at . 
Conclusion GEPAT is a modular, scalable and professional-grade software integrating analysis and interpretation of microarray gene expression data. An installation available for academic users can be found at . PMID:17543125
Development of DNA Microarrays for Metabolic Pathway and Bioprocess Monitoring
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gregory Stephanopoulos
Transcriptional profiling experiments utilizing DNA microarrays to study the intracellular accumulation of PHB in Synechocystis have proved difficult, in large part because strains that show differences in PHB significant enough to justify global analysis of gene expression have not been isolated.
The SNPforID Assay as a Supplementary Method in Kinship and Trace Analysis
Schwark, Thorsten; Meyer, Patrick; Harder, Melanie; Modrow, Jan-Hendrick; von Wurmb-Schwark, Nicole
2012-01-01
Objective Short tandem repeat (STR) analysis using commercial multiplex PCR kits is the method of choice for kinship testing and trace analysis. However, under certain circumstances (deficiency testing, mutations, minute DNA amounts), STRs alone may not suffice. Methods We present a 50-plex single nucleotide polymorphism (SNP) assay based on the SNPs chosen by the SNPforID consortium as an additional method for paternity and for trace analysis. The new assay was applied to selected routine paternity and trace cases from our laboratory. Results and Conclusions Our investigation shows that the new SNP multiplex assay is a valuable method to supplement STR analysis, and is a powerful means to solve complicated genetic analyses. PMID:22851934
Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias
2015-01-01
Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
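The runtime extrapolation in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it counts the SNP triples in an exhaustive three-way scan and rescales the reported 5.8-year figure under the linear-scaling assumption the abstract mentions. The rack counts (4 for "Avoca", 96 for "Sequoia") are assumptions not stated in the abstract, and the function names are hypothetical.

```python
from math import comb

N_SNPS = 1_100_000      # SNP count quoted in the abstract
AVOCA_YEARS = 5.8       # runtime estimate quoted in the abstract
AVOCA_RACKS = 4         # assumption: size of the "Avoca" Blue Gene/Q
SEQUOIA_RACKS = 96      # assumption: size of the "Sequoia" Blue Gene/Q


def triple_count(n_snps):
    """Number of unordered SNP triples tested by an exhaustive three-way scan."""
    return comb(n_snps, 3)


def scaled_months(years, racks_from, racks_to):
    """Runtime in months on a larger machine, assuming perfectly linear scaling."""
    return years * 12 * racks_from / racks_to


print(f"{triple_count(N_SNPS):.3e} SNP triples to evaluate")
months = scaled_months(AVOCA_YEARS, AVOCA_RACKS, SEQUOIA_RACKS)
print(f"about {months:.1f} months under linear scaling")
```

Under these assumed rack counts the rescaled estimate is about 2.9 months, which is consistent with the "under 3 months" figure quoted above.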
Shekhar, M S; Gomathi, A; Gopikrishna, G; Ponniah, A G
2015-06-01
White spot syndrome virus (WSSV) continues to be the most devastating viral pathogen infecting penaeid shrimp the world over. The genome of WSSV has been deciphered and characterized from three geographical isolates and significant progress has been made in developing various molecular diagnostic methods to detect the virus. However, the information on host immune gene response to WSSV pathogenesis is limited. Microarray analysis was carried out as an approach to analyse the gene expression in black tiger shrimp Penaeus monodon in response to WSSV infection. Gill tissues collected from the WSSV infected shrimp at 6, 24, 48 h and moribund stage were analysed for differential gene expression. Shrimp cDNAs of 40,059 unique sequences were considered for designing the microarray chip. The Cy3-labeled cRNA derived from healthy and WSSV-infected shrimp was subjected to hybridization with all the DNA spots in the microarray which revealed 8,633 and 11,147 as up- and down-regulated genes respectively at different time intervals post infection. The altered expression of these numerous genes represented diverse functions such as immune response, osmoregulation, apoptosis, nucleic acid binding, energy and metabolism, signal transduction, stress response and molting. The changes in gene expression profiles observed by microarray analysis provides molecular insights and framework of genes which are up- and down-regulated at different time intervals during WSSV infection in shrimp. The microarray data was validated by Real Time analysis of four differentially expressed genes involved in apoptosis (translationally controlled tumor protein, inhibitor of apoptosis protein, ubiquitin conjugated enzyme E2 and caspase) for gene expression levels. The role of apoptosis related genes in WSSV infected shrimp is discussed herein.
Reuse of imputed data in microarray analysis increases imputation efficiency
Kim, Ki-Yeol; Kim, Byoung-Jin; Yi, Gwan-Su
2004-01-01
Background The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analysis require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the validity of imputed values in these methods had not been fully checked. Results We developed a new cluster-based imputation method called sequential K-nearest neighbor (SKNN) method. This imputes the missing values sequentially from the gene having least missing values, and uses the imputed values for the later imputation. Although it uses the imputed values, the efficiency of this new method is greatly improved in its accuracy and computational complexity over the conventional KNN-based method and other methods based on maximum likelihood estimation. The performance of SKNN was in particular higher than other imputation methods for the data with high missing rates and large number of experiments. Application of Expectation Maximization (EM) to the SKNN method improved the accuracy, but increased computational time proportional to the number of iterations. The Multiple Imputation (MI) method, which is well known but not applied previously to microarray data, showed a similarly high accuracy as the SKNN method, with slightly higher dependency on the types of data sets. Conclusions Sequential reuse of imputed data in KNN-based imputation greatly increases the efficiency of imputation. The SKNN method should be practically useful to save the data of some microarray experiments which have high amounts of missing entries. The SKNN method generates reliable imputed values which can be used for further cluster-based analysis of microarray data. PMID:15504240
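The sequential procedure described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `sknn_impute`, the use of Euclidean distance over the target gene's observed columns, and the unweighted mean of the k nearest neighbours are all assumptions made here, since the abstract does not specify the distance measure or the weighting.

```python
import math


def sknn_impute(data, k=2):
    """Sequential KNN imputation sketch.

    Rows are genes, columns are experiments, and None marks a missing value.
    Returns a new list of rows with every None filled in.
    """
    rows = [list(r) for r in data]
    # Genes with no missing values form the initial reference set.
    reference = [i for i, r in enumerate(rows) if None not in r]
    if not reference:
        raise ValueError("need at least one complete gene to start")
    # Process incomplete genes starting with the one having the fewest gaps.
    incomplete = sorted(
        (i for i, r in enumerate(rows) if None in r),
        key=lambda i: rows[i].count(None),
    )
    for i in incomplete:
        target = rows[i]
        observed = [j for j, v in enumerate(target) if v is not None]
        missing = [j for j, v in enumerate(target) if v is None]

        def distance(ref_index):
            # Assumed metric: Euclidean distance over the observed columns only.
            ref = rows[ref_index]
            return math.sqrt(sum((target[j] - ref[j]) ** 2 for j in observed))

        neighbours = sorted(reference, key=distance)[:k]
        for j in missing:
            target[j] = sum(rows[n][j] for n in neighbours) / len(neighbours)
        # Sequential reuse: the imputed gene becomes a reference for later genes.
        reference.append(i)
    return rows
```

The final `reference.append(i)` is the step that distinguishes this from conventional KNN imputation, where only originally complete genes would be eligible as neighbours.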
Atassi, M Zouhair; Dolimbek, Behzod Z; Steward, Lance E; Aoki, K Roger
2007-01-01
In studies from this laboratory, we localized the regions on the H chain of botulinum neurotoxin A (BoNT/A) that are recognized by anti-BoNT/A antibodies (Abs) and block the activity of the toxin in vivo. These Abs were obtained from cervical dystonia patients who had been treated with BoNT/A and had become unresponsive to the treatment, as well as blocking Abs raised in mouse, horse, and chicken. We also localized the regions involved in BoNT/A binding to mouse brain synaptosomes (snp). Comparison of spatial proximities in the three-dimensional structure of the Ab-binding regions and the snp binding showed that except for one, the Ab-binding regions either coincide or overlap with the snp regions. It should be fully expected that protective Abs when bound to the toxin at sites that coincide or overlap with snp binding would prevent the toxin from binding to nerve synapse and therefore block toxin entry into the neuron. Thus, analysis of the locations of the Ab-binding and the snp-binding regions provides a molecular rationale for the ability of protecting Abs to block BoNT/A action in vivo.
Michailidou, S; Tsangaris, G; Fthenakis, G C; Tzora, A; Skoufos, I; Karkabounas, S C; Banos, G; Argiriou, A; Arsenos, G
2018-06-01
In the present study, genome-wide genotyping was applied to characterize the genetic diversity and population structure of three autochthonous Greek breeds: Boutsko, Karagouniko and Chios. Dairy sheep are among the most significant livestock species in Greece numbering approximately 9 million animals which are characterized by large phenotypic variation and reared under various farming systems. A total of 96 animals were genotyped with the Illumina's OvineSNP50K microarray beadchip, to study the population structure of the breeds and develop a specialized panel of single-nucleotide polymorphisms (SNPs), which could distinguish one breed from the others. Quality control on the dataset resulted in 46,125 SNPs, which were used to evaluate the genetic structure of the breeds. Population structure was assessed through principal component analysis (PCA) and admixture analysis, whereas inbreeding was estimated based on runs of homozygosity (ROHs) coefficients, genomic relationship matrix inbreeding coefficients (F_GRM) and patterns of linkage disequilibrium (LD). Associations between SNPs and breeds were analyzed with different inheritance models, to identify SNPs that distinguish among the breeds. Results showed high levels of genetic heterogeneity in the three breeds. Genetic distances among breeds were modest, despite their different ancestries. Chios and Karagouniko breeds were more genetically related to each other compared to Boutsko. Analysis revealed 3802 candidate SNPs that can be used to identify two-breed crosses and purebred animals. The present study provides, for the first time, data on the genetic background of three Greek indigenous dairy sheep breeds as well as a specialized marker panel that can be applied for traceability purposes as well as targeted genetic improvement schemes and conservation programs.
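For readers unfamiliar with ROH-based inbreeding coefficients, one common definition is the fraction of the genome covered by runs of homozygosity. The sketch below shows that calculation only. The abstract does not state which ROH definition or detection settings the study used, so the function name, the segment representation and the genome length in the usage example are all hypothetical.

```python
def f_roh(roh_segments, genome_length_bp):
    """Fraction of the genome covered by runs of homozygosity.

    roh_segments: iterable of (start_bp, end_bp) pairs for one animal,
    assumed to be non-overlapping. genome_length_bp: length of the genome
    covered by the SNP panel (a hypothetical input here).
    """
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    covered = sum(end - start for start, end in roh_segments)
    return covered / genome_length_bp


# Usage with made-up segments: 3 Mb of ROH on a 100 Mb genome.
example = f_roh([(0, 1_000_000), (5_000_000, 7_000_000)], 100_000_000)
print(f"F_ROH = {example:.3f}")
```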
DOE Office of Scientific and Technical Information (OSTI.GOV)
Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric
2010-03-23
Background-Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results-A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35 percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions-This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
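The high-quality criterion quoted in this abstract (at least four sequences, with the minor allele present in at least two of them) can be expressed as a simple filter. The sketch below is illustrative rather than the consortium's pipeline: the function name and the input layout (a list of base calls from the ESTs covering one contig position) are hypothetical, and it uses the number of sequences at the position as a stand-in for the number of sequences in the contig.

```python
from collections import Counter


def is_high_quality_snp(bases, min_depth=4, min_minor=2):
    """Apply the depth and minor-allele thresholds to one candidate position.

    bases: base calls (e.g. "A", "G") from the ESTs covering the position.
    """
    if len(bases) < min_depth:
        return False
    counts = Counter(bases).most_common()
    if len(counts) < 2:
        return False  # every sequence agrees, so there is no SNP here
    # The second most frequent base is treated as the minor allele.
    return counts[1][1] >= min_minor


print(is_high_quality_snp(["A", "A", "G", "G"]))
print(is_high_quality_snp(["A", "A", "A", "G"]))
```

Requiring the minor allele in two or more sequences guards against counting a single sequencing error as a polymorphism, which is the usual motivation for thresholds of this kind.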