human genome array: Topics by Science.gov

Sample records for human genome array

A Portrait of Ribosomal DNA Contacts with Hi-C Reveals 5S and 45S rDNA Anchoring Points in the Folded Human Genome

PubMed Central

Yu, Shoukai; Lemos, Bernardo

2016-01-01

Ribosomal RNAs (rRNAs) account for >60% of all RNAs in eukaryotic cells and are encoded in the ribosomal DNA (rDNA) arrays. The rRNAs are produced from two sets of loci: the 5S rDNA array resides exclusively on human chromosome 1, whereas the 45S rDNA array resides on the short arm of five human acrocentric chromosomes. The 45S rDNA gives origin to the nucleolus, the nuclear organelle that is the site of ribosome biogenesis. Intriguingly, 5S and 45S rDNA arrays exhibit correlated copy number variation in lymphoblastoid cells (LCLs). Here we examined the genomic architecture and repeat content of the 5S and 45S rDNA arrays in multiple human genome assemblies (including PacBio MHAP assembly) and ascertained contacts between the rDNA arrays and the rest of the genome using Hi-C datasets from two human cell lines (erythroleukemia K562 and lymphoblastoid cells). Our analyses revealed that 5S and 45S arrays each have thousands of contacts in the folded genome, with rDNA-associated regions and genes dispersed across all chromosomes. The rDNA contact map displayed conserved and disparate features between two cell lines, and pointed to specific chromosomes, genomic regions, and genes with evidence of spatial proximity to the rDNA arrays; the data also showed a lack of direct physical interaction between the 5S and 45S rDNA arrays. Finally, the analysis identified an intriguing organization in the 5S array with Alu and 5S elements adjacent to one another and organized in opposite orientation along the array. Portraits of genome folding centered on the ribosomal DNA array could help understand the emergence of concerted variation, the control of 5S and 45S expression, as well as provide insights into an organelle that contributes to the spatial localization of human chromosomes during interphase. PMID:27797956
Genome-Wide Mapping of Copy Number Variation in Humans: Comparative Analysis of High Resolution Array Platforms

PubMed Central

Haraksingh, Rajini R.; Abyzov, Alexej; Gerstein, Mark; Urban, Alexander E.; Snyder, Michael

2011-01-01

Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing-based technologies. Numerous array-based platforms for CNV detection exist, utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV-focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV-focused arrays. Our results are important for cost-effective CNV detection and validation for both basic and clinical applications. PMID:22140474
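
Illustrative note: benchmarking a platform's CNV calls against a gold-standard set, as this record describes, can be sketched roughly as follows. The abstract does not state how a call was matched to a gold-standard CNV, so the 50% reciprocal-overlap rule, the function names, and the interval layout here are all assumptions made for illustration, not the authors' procedure.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; a hypothetical layout.
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def sensitivity(calls: List[Interval], gold: List[Interval],
                min_overlap: float = 0.5) -> float:
    """Fraction of gold-standard CNVs matched by at least one call.

    min_overlap=0.5 is an assumed matching threshold, not taken from the study.
    """
    if not gold:
        return 0.0
    detected = sum(
        1 for g in gold
        if any(reciprocal_overlap(g, c) >= min_overlap for c in calls)
    )
    return detected / len(gold)
```

A stricter or looser `min_overlap` changes the reported sensitivity, which is one reason comparisons across platforms need a single fixed matching rule.
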
A Portrait of Ribosomal DNA Contacts with Hi-C Reveals 5S and 45S rDNA Anchoring Points in the Folded Human Genome.

PubMed

Yu, Shoukai; Lemos, Bernardo

2016-12-31

Ribosomal RNAs (rRNAs) account for >60% of all RNAs in eukaryotic cells and are encoded in the ribosomal DNA (rDNA) arrays. The rRNAs are produced from two sets of loci: the 5S rDNA array resides exclusively on human chromosome 1, whereas the 45S rDNA array resides on the short arm of five human acrocentric chromosomes. The 45S rDNA gives origin to the nucleolus, the nuclear organelle that is the site of ribosome biogenesis. Intriguingly, 5S and 45S rDNA arrays exhibit correlated copy number variation in lymphoblastoid cells (LCLs). Here we examined the genomic architecture and repeat content of the 5S and 45S rDNA arrays in multiple human genome assemblies (including PacBio MHAP assembly) and ascertained contacts between the rDNA arrays and the rest of the genome using Hi-C datasets from two human cell lines (erythroleukemia K562 and lymphoblastoid cells). Our analyses revealed that 5S and 45S arrays each have thousands of contacts in the folded genome, with rDNA-associated regions and genes dispersed across all chromosomes. The rDNA contact map displayed conserved and disparate features between two cell lines, and pointed to specific chromosomes, genomic regions, and genes with evidence of spatial proximity to the rDNA arrays; the data also showed a lack of direct physical interaction between the 5S and 45S rDNA arrays. Finally, the analysis identified an intriguing organization in the 5S array with Alu and 5S elements adjacent to one another and organized in opposite orientation along the array. Portraits of genome folding centered on the ribosomal DNA array could help understand the emergence of concerted variation, the control of 5S and 45S expression, as well as provide insights into an organelle that contributes to the spatial localization of human chromosomes during interphase. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Glossary

MedlinePlus

... array, and oligo/SNP combination array. Related terms: comparative genomic hybridization ; copy number variant ; SNP array chromosome ... for example, the AB blood groups in humans comparative genomic hybridization Method in which two DNA samples ( ...
A Single Multiplex crRNA Array for FnCpf1-Mediated Human Genome Editing.

PubMed

Sun, Huihui; Li, Fanfan; Liu, Jie; Yang, Fayu; Zeng, Zhenhai; Lv, Xiujuan; Tu, Mengjun; Liu, Yeqing; Ge, Xianglian; Liu, Changbao; Zhao, Junzhao; Zhang, Zongduan; Qu, Jia; Song, Zongming; Gu, Feng

2018-06-15

Cpf1 has been harnessed as a tool for genome manipulation in various species because of its simplicity and high efficiency. Our recent study demonstrated that FnCpf1 could be utilized for human genome editing with notable advantages for target sequence selection due to the flexibility of the protospacer adjacent motif (PAM) sequence. Multiplex genome editing provides a powerful tool for targeting members of multigene families, dissecting gene networks, modeling multigenic disorders in vivo, and applying gene therapy. However, there are no reports at present that show FnCpf1-mediated multiplex genome editing via a single customized CRISPR RNA (crRNA) array. In the present study, we utilize a single customized crRNA array to simultaneously target multiple genes in human cells. In addition, we demonstrate that a single customized crRNA array can target multiple sites in one gene. Collectively, FnCpf1, a powerful genome-editing tool for multiple genomic targets, can be harnessed for effective manipulation of the human genome. Copyright © 2018 The American Society of Gene and Cell Therapy. Published by Elsevier Inc. All rights reserved.
Imputation-Based Genomic Coverage Assessments of Current Human Genotyping Arrays

PubMed Central

Nelson, Sarah C.; Doheny, Kimberly F.; Pugh, Elizabeth W.; Romm, Jane M.; Ling, Hua; Laurie, Cecelia A.; Browning, Sharon R.; Weir, Bruce S.; Laurie, Cathy C.

2013-01-01

Microarray single-nucleotide polymorphism genotyping, combined with imputation of untyped variants, has been widely adopted as an efficient means to interrogate variation across the human genome. “Genomic coverage” is the total proportion of genomic variation captured by an array, either by direct observation or through an indirect means such as linkage disequilibrium or imputation. We have performed imputation-based genomic coverage assessments of eight current genotyping arrays that assay from ~0.3 to ~5 million variants. Coverage was determined separately in each of the four continental ancestry groups in the 1000 Genomes Project phase 1 release. We used the subset of 1000 Genomes variants present on each array to impute the remaining variants and assessed coverage based on correlation between imputed and observed allelic dosages. More than 75% of common variants (minor allele frequency > 0.05) are covered by all arrays in all groups except for African ancestry, and up to ~90% in all ancestries for the highest density arrays. In contrast, less than 40% of less common variants (0.01 < minor allele frequency < 0.05) are covered by low density arrays in all ancestries and 50–80% in high density arrays, depending on ancestry. We also calculated genome-wide power to detect variant-trait association in a case-control design, across varying sample sizes, effect sizes, and minor allele frequency ranges, and compared these array-based power estimates with a hypothetical array that would type all variants in 1000 Genomes. These imputation-based genomic coverage and power analyses are intended as a practical guide to researchers planning genetic studies. PMID:23979933
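
Illustrative note: the coverage measure in this record, based on the correlation between imputed and observed allelic dosages, can be sketched as below. The abstract does not give the cut-off used to count a variant as covered, so the `r2_threshold=0.8` default, the function names, and the input layout are assumptions for illustration only.

```python
from typing import Dict, Sequence, Tuple


def dosage_r2(imputed: Sequence[float], observed: Sequence[float]) -> float:
    """Squared Pearson correlation between imputed and observed dosages."""
    n = len(imputed)
    mean_i = sum(imputed) / n
    mean_o = sum(observed) / n
    cov = sum((i - mean_i) * (o - mean_o) for i, o in zip(imputed, observed))
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_i == 0 or var_o == 0:
        # A constant dosage vector carries no information; treat as uncovered.
        return 0.0
    return cov * cov / (var_i * var_o)


def genomic_coverage(
    variants: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    r2_threshold: float = 0.8,  # assumed cut-off, not stated in the abstract
) -> float:
    """Fraction of variants whose dosage r^2 meets the threshold."""
    if not variants:
        return 0.0
    covered = sum(
        1 for imputed, observed in variants.values()
        if dosage_r2(imputed, observed) >= r2_threshold
    )
    return covered / len(variants)
```

In practice the calculation would be repeated per ancestry group and per minor-allele-frequency bin, which is how the record reports its results.
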
Centromere reference models for human chromosomes X and Y satellite arrays

PubMed Central

Miga, Karen H.; Newton, Yulia; Jain, Miten; Altemose, Nicolas; Willard, Huntington F.; Kent, W. James

2014-01-01

The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes. PMID:24501022
Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans.

PubMed

Haraksingh, Rajini R; Abyzov, Alexej; Urban, Alexander Eckehart

2017-04-24

High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools for genome analysis in the coming years. We systematically compared the genome-wide CNV detection power of all 17 available array designs from the Affymetrix, Agilent, and Illumina platforms by hybridizing the well-characterized genome of 1000 Genomes Project subject NA12878 to all arrays, and performing data analysis using both manufacturer-recommended and platform-independent software. We benchmarked the resulting CNV call sets from each array using a gold standard set of CNVs for this genome derived from 1000 Genomes Project whole genome sequencing data. The arrays tested comprise both SNP and aCGH platforms with varying designs and contain between ~0.5 and ~4.6 million probes. Across the arrays CNV detection varied widely in number of CNV calls (4-489), CNV size range (~40 bp to ~8 Mbp), and percentage of non-validated CNVs (0-86%). We discovered strikingly strong effects of specific array design principles on performance. For example, some SNP array designs with the largest numbers of probes and extensive exonic coverage produced a considerable number of CNV calls that could not be validated, compared to designs with probe numbers that are sometimes an order of magnitude smaller. This effect was only partially ameliorated using different analysis software and optimizing data analysis parameters. High-resolution microarrays will continue to be used as reliable, cost- and time-efficient tools for CNV analysis. However, different applications tolerate different limitations in CNV detection. Our study quantified how these arrays differ in total number and size range of detected CNVs as well as sensitivity, and determined how each array balances these attributes. This analysis will inform appropriate array selection for future CNV studies, and allow better assessment of the CNV-analytical power of both published and ongoing array-based genomics studies. Furthermore, our findings emphasize the importance of concurrent use of multiple analysis algorithms and independent experimental validation in array-based CNV detection studies.
Novel mouse model recapitulates genome and transcriptome alterations in human colorectal carcinomas.

PubMed

McNeil, Nicole E; Padilla-Nash, Hesed M; Buishand, Floryne O; Hue, Yue; Ried, Thomas

2017-03-01

Human colorectal carcinomas are defined by a nonrandom distribution of genomic imbalances that are characteristic for this disease. Often, these imbalances affect entire chromosomes. Understanding the role of these aneuploidies for carcinogenesis is of utmost importance. Currently, established transgenic mice do not recapitulate the pathognomonic genome aberration profile of human colorectal carcinomas. We have developed a novel model based on the spontaneous transformation of murine colon epithelial cells. During this process, cells progress through stages of pre-immortalization, immortalization and, finally, transformation, and result in tumors when injected into immunocompromised mice. We analyzed our model for genome and transcriptome alterations using ArrayCGH, spectral karyotyping (SKY), and array based gene expression profiling. ArrayCGH revealed a recurrent pattern of genomic imbalances. These results were confirmed by SKY. Comparing these imbalances with orthologous maps of human chromosomes revealed a remarkable overlap. We observed focal deletions of the tumor suppressor genes Trp53 and Cdkn2a/p16. High-level focal genomic amplification included the locus harboring the oncogene Mdm2, which was confirmed by FISH in the form of double minute chromosomes. Array-based global gene expression revealed distinct differences between the sequential steps of spontaneous transformation. Gene expression changes showed significant similarities with human colorectal carcinomas. Pathways most prominently affected included genes involved in chromosomal instability and in epithelial to mesenchymal transition. Our novel mouse model therefore recapitulates the most prominent genome and transcriptome alterations in human colorectal cancer, and might serve as a valuable tool for understanding the dynamic process of tumorigenesis, and for preclinical drug testing. © 2016 Wiley Periodicals, Inc.
A Rapid Method of Genomic Array Analysis of Scaffold/Matrix Attachment Regions (S/MARs) Identifies a 2.5-Mb Region of Enhanced Scaffold/Matrix Attachment at a Human Neocentromere

PubMed Central

Sumer, Huseyin; Craig, Jeffrey M.; Sibson, Mandy; Choo, K.H. Andy

2003-01-01

Human neocentromeres are fully functional centromeres that arise at previously noncentromeric regions of the genome. We have tested a rapid procedure of genomic array analysis of chromosome scaffold/matrix attachment regions (S/MARs), involving the isolation of S/MAR DNA and hybridization of this DNA to a genomic BAC/PAC array. Using this procedure, we have defined a 2.5-Mb domain of S/MAR-enriched chromatin that fully encompasses a previously mapped centromere protein-A (CENP-A)-associated domain at a human neocentromere. We have independently verified this procedure using a previously established fluorescence in situ hybridization method on salt-treated metaphase chromosomes. In silico sequence analysis of the S/MAR-enriched and surrounding regions has revealed no outstanding sequence-related predisposition. This study defines the S/MAR-enriched domain of a higher eukaryotic centromere and provides a method that has broad application for the mapping of S/MAR attachment sites over large genomic regions or throughout a genome. PMID:12840048
Novel applications of array comparative genomic hybridization in molecular diagnostics.

PubMed

Cheung, Sau W; Bi, Weimin

2018-05-31

In 2004, the implementation of array comparative genomic hybridization (array CGH) into clinical practice marked a new milestone for genetic diagnosis. Array CGH and single-nucleotide polymorphism (SNP) arrays enable genome-wide detection of copy number changes at high resolution, and therefore microarray has been recognized as the first-tier test for patients with intellectual disability or multiple congenital anomalies, and has also been applied prenatally for detection of clinically relevant copy number variations in the fetus. Area covered: In this review, the authors summarize the evolution of array CGH technology from their diagnostic laboratory, highlighting exonic SNP arrays developed in the past decade which detect small intragenic copy number changes as well as large DNA segments for the region of heterozygosity. The applications of array CGH to human diseases with different modes of inheritance with the emphasis on autosomal recessive disorders are discussed. Expert commentary: An exonic array is a powerful and most efficient clinical tool in detecting genome-wide small copy number variants in both dominant and recessive disorders. However, whole-genome sequencing may become the single integrated platform for detection of copy number changes, single-nucleotide changes as well as balanced chromosomal rearrangements in the near future.
SeeGH--a software tool for visualization of whole genome array comparative genomic hybridization data.

PubMed

Chi, Bryan; DeLeeuw, Ronald J; Coe, Bradley P; MacAulay, Calum; Lam, Wan L

2004-02-09

Array comparative genomic hybridization (CGH) is a technique which detects copy number differences in DNA segments. Complete sequencing of the human genome and the development of an array representing a tiling set of tens of thousands of DNA segments spanning the entire human genome have made high resolution copy number analysis throughout the genome possible. Since array CGH provides a signal ratio for each DNA segment, visualization requires the reassembly of individual data points into chromosome profiles. We have developed a visualization tool for displaying whole genome array CGH data in the context of chromosomal location. SeeGH is an application that translates spot signal ratio data from array CGH experiments to displays of high resolution chromosome profiles. Data is imported from a simple tab delimited text file obtained from standard microarray image analysis software. SeeGH processes the signal ratio data and graphically displays it in a conventional CGH karyotype diagram with the added features of magnification and DNA segment annotation. In this process, SeeGH imports the data into a database, calculates the average ratio and standard deviation for each replicate spot, and links them to chromosome regions for graphical display. Once the data is displayed, users have the option of hiding or flagging DNA segments based on user defined criteria, and of retrieving annotation information such as clone name, NCBI sequence accession number, ratio, base pair position on the chromosome, and standard deviation. SeeGH represents a novel software tool used to view and analyze array CGH data. The software gives users the ability to view the data in an overall genomic view as well as magnify specific chromosomal regions, facilitating the precise localization of genetic alterations. SeeGH is easily installed and runs on Microsoft Windows 2000 or later environments.
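
Illustrative note: the import step this record describes (read a tab-delimited file, then compute the average ratio and standard deviation across replicate spots) can be sketched as follows. This is not SeeGH's code. The column names `clone` and `ratio`, the function name, and the choice of sample standard deviation are assumptions, since the abstract does not specify the file layout.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Tuple


def summarize_replicates(tab_text: str) -> Dict[str, Tuple[float, float]]:
    """Map each clone to (mean ratio, standard deviation) over its replicate spots.

    Expects a header row with hypothetical columns 'clone' and 'ratio'.
    """
    ratios = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    for row in reader:
        ratios[row["clone"]].append(float(row["ratio"]))

    summary = {}
    for clone, values in ratios.items():
        # stdev needs at least two values; a single spot gets 0.0 here.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[clone] = (mean(values), spread)
    return summary
```

A viewer could then hide or flag clones whose standard deviation exceeds a user-defined limit, in the spirit of the filtering the record mentions.
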
arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

PubMed Central

Menten, Björn; Pattyn, Filip; De Preter, Katleen; Robbrecht, Piet; Michels, Evi; Buysse, Karen; Mortier, Geert; De Paepe, Anne; van Vooren, Steven; Vermeesch, Joris; Moreau, Yves; De Moor, Bart; Vermeulen, Stefan; Speleman, Frank; Vandesompele, Jo

2005-01-01

Background: The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH). One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results: We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment) supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion: ArrayCGHbase is a web based and platform independent arrayCGH data analysis tool that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at . PMID:15910681
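
Illustrative note: exporting copy number segments as BED lines for a genome browser, as mentioned in this record, might look like the sketch below. It is not arrayCGHbase's exporter. The input tuple layout, the assumption that input coordinates are 1-based and inclusive, and the function name are hypothetical. The only format fact relied on is that BED uses tab-separated chromosome, 0-based start, end, and an optional name field.

```python
from typing import Iterable, List, Tuple

# (chromosome, 1-based inclusive start, 1-based inclusive end, label); hypothetical layout.
Segment = Tuple[str, int, int, str]


def to_bed_lines(segments: Iterable[Segment]) -> List[str]:
    """Convert segments to four-column BED lines.

    BED starts are 0-based and ends are exclusive, so only the start shifts by one.
    """
    lines = []
    for chrom, start_1based, end_1based, label in segments:
        lines.append(f"{chrom}\t{start_1based - 1}\t{end_1based}\t{label}")
    return lines
```

Writing `"\n".join(to_bed_lines(segments))` to a file gives a track that can be uploaded as a custom annotation.
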
Concerted copy number variation balances ribosomal DNA dosage in human and mouse genomes

PubMed Central

Gibbons, John G.; Branco, Alan T.; Godinho, Susana A.; Yu, Shoukai; Lemos, Bernardo

2015-01-01

Tandemly repeated ribosomal DNA (rDNA) arrays are among the most evolutionarily dynamic loci of eukaryotic genomes. The loci code for essential cellular components, yet exhibit extensive copy number (CN) variation within and between species. CN might be partly determined by the requirement of dosage balance between the 5S and 45S rDNA arrays. The arrays are nonhomologous, physically unlinked in mammals, and encode functionally interdependent RNA components of the ribosome. Here we show that the 5S and 45S rDNA arrays exhibit concerted CN variation (cCNV). Despite 5S and 45S rDNA elements residing on different chromosomes and lacking sequence similarity, cCNV between these loci is strong, evolutionarily conserved in humans and mice, and manifested across individual genotypes in natural populations and pedigrees. Finally, we observe that bisphenol A induces rapid and parallel modulation of 5S and 45S rDNA CN. Our observations reveal a novel mode of genome variation, indicate that natural selection contributed to the evolution and conservation of cCNV, and support the hypothesis that 5S CN is partly determined by the requirement of dosage balance with the 45S rDNA array. We suggest that human disease variation might be traced to disrupted rDNA dosage balance in the genome. PMID:25583482
PRDM9 variation strongly influences recombination hot-spot activity and meiotic instability in humans

PubMed Central

Berg, Ingrid L.; Neumann, Rita; Lam, Kwan-Wood G.; Sarbajna, Shriparna; Odenthal-Hesse, Linda; May, Celia A.; Jeffreys, Alec J.

2011-01-01

PRDM9 has recently been identified as a likely trans-regulator of meiotic recombination hot spots in humans and mice [1-3]. The protein contains a zinc finger array that in humans can recognise a short sequence motif associated with hot spots [4], with binding to this motif possibly triggering hot-spot activity via chromatin remodelling [5]. We now show that variation in the zinc finger array in humans has a profound effect on sperm hot-spot activity, even at hot spots lacking the sequence motif. Very subtle changes within the array can create hot-spot non-activating and enhancing alleles, and even trigger the appearance of a new hot spot. PRDM9 thus appears to be the preeminent global regulator of hot spots in humans. Variation at this locus also influences aspects of genome instability, specifically a megabase-scale rearrangement underlying two genomic disorders [6] as well as minisatellite instability [7], implicating PRDM9 as a risk factor for some pathological genome rearrangements. PMID:20818382
α satellite DNA variation and function of the human centromere

PubMed Central

Sullivan, Lori L.; Chew, Kimberline

2017-01-01

Genomic variation is a source of functional diversity that is typically studied in genic and non-coding regulatory regions. However, the extent of variation within noncoding portions of the human genome, particularly highly repetitive regions, and the functional consequences are not well understood. Satellite DNA, including α satellite DNA found at human centromeres, comprises up to 10% of the genome, but is difficult to study because its repetitive nature hinders contiguous sequence assemblies. We recently described variation within α satellite DNA that affects centromere function. On human chromosome 17 (HSA17), we showed that size and sequence polymorphisms within primary array D17Z1 are associated with chromosome aneuploidy and defective centromere architecture. However, HSA17 can counteract this instability by assembling the centromere at a second, “backup” array lacking variation. Here, we discuss our findings in a broader context of human centromere assembly, and highlight areas of future study to uncover links between genomic and epigenetic features of human centromeres. PMID:28406740
The Role of Constitutional Copy Number Variants in Breast Cancer

PubMed Central

Walker, Logan C.; Wiggins, George A.R.; Pearson, John F.

2015-01-01

Constitutional copy number variants (CNVs) include inherited and de novo deviations from a diploid state at a defined genomic region. These variants contribute significantly to genetic variation and disease in humans, including breast cancer susceptibility. Identification of genetic risk factors for breast cancer in recent years has been dominated by the use of genome-wide technologies, such as single nucleotide polymorphism (SNP)-arrays, with a significant focus on single nucleotide variants. To date, these large datasets have been underutilised for generating genome-wide CNV profiles despite offering a massive resource for assessing the contribution of these structural variants to breast cancer risk. Technical challenges remain in determining the location and distribution of CNVs across the human genome due to the accuracy of computational prediction algorithms and resolution of the array data. Moreover, better methods are required for interpreting the functional effect of newly discovered CNVs. In this review, we explore current and future application of SNP array technology to assess rare and common CNVs in association with breast cancer risk in humans. PMID:27600231
Multiplex PCR-based DNA array for simultaneous detection of three human herpesviruses, EVB, CMV and KSHV.

PubMed

Fujimuro, Masahiro; Nakaso, Kazuhiro; Nakashima, Kenji; Sadanari, Hidetaka; Hisanori, Inoue; Teishikata, Yasuhiro; Hayward, S Diane; Yokosawa, Hideyoshi

2006-04-01

Human lymphotropic herpesviruses, Epstein-Barr virus (EBV), cytomegalovirus (CMV) and Kaposi's sarcoma-associated herpesvirus (KSHV) are responsible for a wide variety of human diseases. Due to an increase in diseased states associated with immunosuppression, more instances of co-morbid infections with these herpesviruses have resulted in viral reactivations that have caused numerous fatalities. Therefore, the development of a rapid and accurate method to detect these viruses in immunocompromised patients is vital for immediate treatment with antiviral prophylactic drugs. In this study, we developed a new multiplex PCR method coupled to DNA array hybridization, which can simultaneously detect all three human herpesviruses in a single cell sample. Multiplex PCR primers were designed to amplify specific regions of the EBV (EBER1), CMV (IE) and KSHV (LANA) viral genomes. Pre-clinical application of this method revealed that this approach is capable of detecting as few as 1 copy of the viral genomes for KSHV and CMV and 100 copies of the genome for EBV. Furthermore, this highly sensitive test showed no cross-reactivity among the three viruses and is capable of detecting both KSHV and EBV viral genomes simultaneously in lymphoblastoid cells that have been double infected with both viruses. Thus, this array-based approach serves as a rapid and reliable diagnostic tool for clinical applications.
Characterization of canine osteosarcoma by array comparative genomic hybridization and RT-qPCR: signatures of genomic imbalance in canine osteosarcoma parallel the human counterpart.

PubMed

Angstadt, Andrea Y; Motsinger-Reif, Alison; Thomas, Rachael; Kisseberth, William C; Guillermo Couto, C; Duval, Dawn L; Nielsen, Dahlia M; Modiano, Jaime F; Breen, Matthew

2011-11-01

Osteosarcoma (OS) is the most commonly diagnosed malignant bone tumor in humans and dogs, characterized in both species by extremely complex karyotypes exhibiting high frequencies of genomic imbalance. Evaluation of genomic signatures in human OS using array comparative genomic hybridization (aCGH) has assisted in uncovering genetic mechanisms that result in disease phenotype. Previous low-resolution (10-20 Mb) aCGH analysis of canine OS identified a wide range of recurrent DNA copy number aberrations, indicating extensive genomic instability. In this study, we profiled 123 canine OS tumors by 1 Mb-resolution aCGH to generate a dataset for direct comparison with current data for human OS, concluding that several high frequency aberrations in canine and human OS are orthologous. To ensure complete coverage of gene annotation, we identified the human refseq genes that map to these orthologous aberrant dog regions and found several candidate genes warranting evaluation for OS involvement. Specifically, subsequent FISH and qRT-PCR analysis of RUNX2, TUSC3, and PTEN indicated that expression levels correlated with genomic copy number status, showcasing RUNX2 as an OS associated gene and TUSC3 as a possible tumor suppressor candidate. Together these data demonstrate the ability of genomic comparative oncology to identify genetic aberrations which may be important for OS progression. Large scale screening of genomic imbalance in canine OS further validates the use of the dog as a suitable model for human cancers, supporting the idea that dysregulation discovered in canine cancers will provide an avenue for complementary study in human counterparts. Copyright © 2011 Wiley-Liss, Inc.
Array-Based Comparative Genomic Hybridization for the Genomewide Detection of Submicroscopic Chromosomal Abnormalities

PubMed Central

Vissers, Lisenka E. L. M. ; de Vries, Bert B. A. ; Osoegawa, Kazutoyo ; Janssen, Irene M. ; Feuth, Ton ; Choy, Chik On ; Straatman, Huub ; van der Vliet, Walter ; Huys, Erik H. L. P. G. ; van Rijk, Anke ; Smeets, Dominique ; van Ravenswaaij-Arts, Conny M. A. ; Knoers, Nine V. ; van der Burgt, Ineke ; de Jong, Pieter J. ; Brunner, Han G. ; van Kessel, Ad Geurts ; Schoenmakers, Eric F. P. M. ; Veltman, Joris A.

2003-01-01

Microdeletions and microduplications, not visible by routine chromosome analysis, are a major cause of human malformation and mental retardation. Novel high-resolution, whole-genome technologies can improve the diagnostic detection rate of these small chromosomal abnormalities. Array-based comparative genomic hybridization allows such a high-resolution screening by hybridizing differentially labeled test and reference DNAs to arrays consisting of thousands of genomic clones. In this study, we tested the diagnostic capacity of this technology using ∼3,500 fluorescent in situ hybridization–verified clones selected to cover the genome with an average of 1 clone per megabase (Mb). The sensitivity and specificity of the technology were tested in normal-versus-normal control experiments and through the screening of patients with known microdeletion syndromes. Subsequently, a series of 20 cytogenetically normal patients with mental retardation and dysmorphisms suggestive of a chromosomal abnormality were analyzed. In this series, three microdeletions and two microduplications were identified and validated. Two of these genomic changes were identified also in one of the parents, indicating that these are large-scale genomic polymorphisms. Deletions and duplications as small as 1 Mb could be reliably detected by our approach. The percentage of false-positive results was reduced to a minimum by use of a dye-swap-replicate analysis, all but eliminating the need for laborious validation experiments and facilitating implementation in a routine diagnostic setting. This high-resolution assay will facilitate the identification of novel genes involved in human mental retardation and/or malformation syndromes and will provide insight into the flexibility and plasticity of the human genome. PMID:14628292
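
The dye-swap-replicate idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the 0.3 log2 threshold are hypothetical, and the only assumption is that each clone yields one log2(Cy5/Cy3) ratio per hybridization, with the test DNA labeled Cy5 in the forward experiment and Cy3 in the swapped one. A real copy number change should then flip sign between the two experiments, whereas a dye-specific artifact should not.

```python
import math

def log2_ratio(cy5_intensity, cy3_intensity):
    """log2(Cy5/Cy3) for one clone in one hybridization."""
    return math.log2(cy5_intensity / cy3_intensity)

def call_clone(forward_log2, swap_log2, threshold=0.3):
    """Call a clone from a dye-swap pair.

    forward_log2: log2(Cy5/Cy3) with the test DNA labeled Cy5.
    swap_log2:    log2(Cy5/Cy3) with the test DNA labeled Cy3.
    threshold:    hypothetical cutoff, chosen only for illustration.
    """
    if forward_log2 >= threshold and swap_log2 <= -threshold:
        return "gain"
    if forward_log2 <= -threshold and swap_log2 >= threshold:
        return "loss"
    # Ratios that do not mirror each other are treated as unchanged.
    return "normal"

# Idealized single-copy duplication: 3 test copies versus 2 reference copies.
print(call_clone(log2_ratio(3, 2), log2_ratio(2, 3)))
# Idealized single-copy deletion: 1 test copy versus 2 reference copies.
print(call_clone(log2_ratio(1, 2), log2_ratio(2, 1)))
```

Requiring the mirrored signal in both hybridizations is what suppresses false positives: a clone that appears elevated in only one labeling orientation is reported as normal.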

Evaluation of the efficacy of constitutional array-based comparative genomic hybridization in the diagnosis of aneuploidy using genomic and amplified DNA.

PubMed

Tan, Niap H; Palmer, Rodger; Wang, Rubin

2010-02-01

Array-based comparative genomic hybridization (array CGH) is a new molecular technique that has the potential to revolutionize cytogenetics. However, use of high resolution array CGH in the clinical setting is plagued by the problem of widespread copy number variations (CNV) in the human genome. Constitutional microarray, containing only clones that interrogate regions of known constitutional syndromes, may circumvent the dilemma of detecting CNV of unknown clinical significance. The present study investigated the efficacy of constitutional microarray in the diagnosis of trisomy. Test samples included genomic DNA from trisomic cell lines, amplification products of 50 ng of genomic DNA and whole genome amplification products of single cells. DNA amplification was achieved by means of multiple displacement amplification (MDA) over 16 h. The trisomic and sex chromosome copy number imbalances in the genomic DNA were correctly identified by the constitutional microarrays. However, there was a failure to detect the trisomy in the amplification products of 50 ng of genomic DNA and whole genome amplification products of single cells. Using carefully selected clones, Spectral Genomics constitutional microarray was able to detect the chromosomal copy number imbalances in genomic DNA without the confounding effects of CNV. The diagnostic failure in amplified DNA samples could be attributed to the amplification process. The MDA duration of 16 h generated an excessive amount of bias, and shortening the duration might minimize the problem.
Genome analysis of Daldinia eschscholtzii strains UM 1400 and UM 1020, wood-decaying fungi isolated from human hosts

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chan, Chai Ling; Yew, Su Mei; Ngeow, Yun Fong

Background: Daldinia eschscholtzii is a wood-inhabiting fungus that causes wood decay under certain conditions. It has a broad host range and produces a large repertoire of potentially bioactive compounds. However, there is no extensive genome analysis on this fungal species. Results: Two fungal isolates (UM 1400 and UM 1020) from human specimens were identified as Daldinia eschscholtzii by morphological features and ITS-based phylogenetic analysis. Both genomes were similar in size with 10,822 predicted genes in UM 1400 (35.8 Mb) and 11,120 predicted genes in UM 1020 (35.5 Mb). A total of 751 gene families were shared among both UM isolates, including gene families associated with fungus-host interactions. In the CAZyme comparative analysis, both genomes were found to contain arrays of CAZymes related to plant cell wall degradation. Genes encoding secreted peptidases involved in the degradation of structural proteins in the plant cell wall were found in the genomes. In addition, arrays of secondary metabolite backbone genes were identified in both genomes, indicating their potential to produce bioactive secondary metabolites. Both genomes also contained an abundance of genes encoding signaling components, with three proposed MAPK cascades involved in cell wall integrity, osmoregulation, and mating/filamentation. Besides genomic evidence for degrading capability, both isolates also harbored an array of genes encoding stress response proteins that are potentially significant for adaptation to living in hostile environments. Conclusions: Our genomic studies provide further information for the biological understanding of D. eschscholtzii and suggest that these wood-decaying fungi are also equipped for adaptation to adverse environments in the human host.
Genome analysis of Daldinia eschscholtzii strains UM 1400 and UM 1020, wood-decaying fungi isolated from human hosts

DOE PAGES

Chan, Chai Ling; Yew, Su Mei; Ngeow, Yun Fong; ...

2015-11-18

Background: Daldinia eschscholtzii is a wood-inhabiting fungus that causes wood decay under certain conditions. It has a broad host range and produces a large repertoire of potentially bioactive compounds. However, there is no extensive genome analysis on this fungal species. Results: Two fungal isolates (UM 1400 and UM 1020) from human specimens were identified as Daldinia eschscholtzii by morphological features and ITS-based phylogenetic analysis. Both genomes were similar in size with 10,822 predicted genes in UM 1400 (35.8 Mb) and 11,120 predicted genes in UM 1020 (35.5 Mb). A total of 751 gene families were shared among both UM isolates, including gene families associated with fungus-host interactions. In the CAZyme comparative analysis, both genomes were found to contain arrays of CAZymes related to plant cell wall degradation. Genes encoding secreted peptidases involved in the degradation of structural proteins in the plant cell wall were found in the genomes. In addition, arrays of secondary metabolite backbone genes were identified in both genomes, indicating their potential to produce bioactive secondary metabolites. Both genomes also contained an abundance of genes encoding signaling components, with three proposed MAPK cascades involved in cell wall integrity, osmoregulation, and mating/filamentation. Besides genomic evidence for degrading capability, both isolates also harbored an array of genes encoding stress response proteins that are potentially significant for adaptation to living in hostile environments. Conclusions: Our genomic studies provide further information for the biological understanding of D. eschscholtzii and suggest that these wood-decaying fungi are also equipped for adaptation to adverse environments in the human host.
The Past, Present, and Future of Human Centromere Genomics

PubMed Central

Aldrup-MacDonald, Megan E.; Sullivan, Beth A.

2014-01-01

The centromere is the chromosomal locus essential for chromosome inheritance and genome stability. Human centromeres are located at repetitive alpha satellite DNA arrays that compose approximately 5% of the genome. Contiguous alpha satellite DNA sequence is absent from the assembled reference genome, limiting current understanding of centromere organization and function. Here, we review the progress in centromere genomics spanning the discovery of the sequence to its molecular characterization and the work done during the Human Genome Project era to elucidate alpha satellite structure and sequence variation. We discuss exciting recent advances in alpha satellite sequence assembly that have provided important insight into the abundance and complex organization of this sequence on human chromosomes. In light of these new findings, we offer perspectives for future studies of human centromere assembly and function. PMID:24683489
Array comparative genomic hybridization and computational genome annotation in constitutional cytogenetics: suggesting candidate genes for novel submicroscopic chromosomal imbalance syndromes.

PubMed

Van Vooren, Steven; Coessens, Bert; De Moor, Bart; Moreau, Yves; Vermeesch, Joris R

2007-09-01

Genome-wide array comparative genomic hybridization screening is uncovering pathogenic submicroscopic chromosomal imbalances in patients with developmental disorders. In those patients, imbalances appear now to be scattered across the whole genome, and most patients carry different chromosomal anomalies. Screening patients with developmental disorders can be considered a forward functional genome screen. The imbalances pinpoint the location of genes that are involved in human development. Because most imbalances encompass regions harboring multiple genes, the challenge is to (1) identify those genes responsible for the specific phenotype and (2) disentangle the role of the different genes located in an imbalanced region. In this review, we discuss novel tools and relevant databases that have recently been developed to aid this gene discovery process. Identification of the functional relevance of genes will not only deepen our understanding of human development but will, in addition, aid in the data interpretation and improve genetic counseling.
PRDM9 variation strongly influences recombination hot-spot activity and meiotic instability in humans.

PubMed

Berg, Ingrid L; Neumann, Rita; Lam, Kwan-Wood G; Sarbajna, Shriparna; Odenthal-Hesse, Linda; May, Celia A; Jeffreys, Alec J

2010-10-01

PRDM9 has recently been identified as a likely trans regulator of meiotic recombination hot spots in humans and mice. PRDM9 contains a zinc finger array that, in humans, can recognize a short sequence motif associated with hot spots, with binding to this motif possibly triggering hot-spot activity via chromatin remodeling. We now report that human genetic variation at the PRDM9 locus has a strong effect on sperm hot-spot activity, even at hot spots lacking the sequence motif. Subtle changes within the zinc finger array can create hot-spot nonactivating or enhancing variants and can even trigger the appearance of a new hot spot, suggesting that PRDM9 is a major global regulator of hot spots in humans. Variation at the PRDM9 locus also influences aspects of genome instability (specifically, a megabase-scale rearrangement underlying two genomic disorders as well as minisatellite instability), implicating PRDM9 as a risk factor for some pathological genome rearrangements.
Application of Nexus copy number software for CNV detection and analysis.

PubMed

Darvishi, Katayoon

2010-04-01

Among human structural genomic variation, copy number variants (CNVs) are the most frequently known component, comprised of gains/losses of DNA segments that are generally 1 kb in length or longer. Array-based comparative genomic hybridization (aCGH) has emerged as a powerful tool for detecting genomic copy number variants (CNVs). With the rapid increase in the density of array technology and with the adaptation of new high-throughput technology, a reliable and computationally scalable method for accurate mapping of recurring DNA copy number aberrations has become a main focus in research. Here we introduce Nexus Copy Number software, a platform-independent tool, to analyze the output files of all types of commercial and custom-made comparative genomic hybridization (CGH) and single-nucleotide polymorphism (SNP) arrays, such as those manufactured by Affymetrix, Agilent Technologies, Illumina, and Roche NimbleGen. It also supports data generated by various array image-analysis software tools such as GenePix, ImaGene, and BlueFuse. (c) 2010 by John Wiley & Sons, Inc.
ChIP-chip.

PubMed

Kim, Tae Hoon; Dekker, Job

2018-05-01

ChIP-chip can be used to analyze protein-DNA interactions in a region-wide and genome-wide manner. DNA microarrays contain PCR products or oligonucleotide probes that are designed to represent genomic sequences. Identification of genomic sites that interact with a specific protein is based on competitive hybridization of the ChIP-enriched DNA and the input DNA to DNA microarrays. The ChIP-chip protocol can be divided into two main sections: Amplification of ChIP DNA and hybridization of ChIP DNA to arrays. A large amount of DNA is required to hybridize to DNA arrays, and hybridization to a set of multiple commercial arrays that represent the entire human genome requires two rounds of PCR amplifications. The relative hybridization intensity of ChIP DNA and that of the input DNA is used to determine whether the probe sequence is a potential site of protein-DNA interaction. Resolution of actual genomic sites bound by the protein is dependent on the size of the chromatin and on the genomic distance between the probes on the array. As with expression profiling using gene chips, ChIP-chip experiments require multiple replicates for reliable statistical measure of protein-DNA interactions. © 2018 Cold Spring Harbor Laboratory Press.
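
The relative-intensity comparison described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the protocol's analysis software: the function names, the probe names, and the log2 cutoff of 1.0 are hypothetical, and each probe is assumed to have one ChIP intensity and one input intensity per replicate.

```python
import math
from statistics import mean

def probe_enrichment(chip_intensities, input_intensities):
    """Mean log2(ChIP/input) across replicates for a single probe."""
    if len(chip_intensities) != len(input_intensities):
        raise ValueError("each replicate needs a ChIP and an input intensity")
    return mean(
        math.log2(chip / inp)
        for chip, inp in zip(chip_intensities, input_intensities)
    )

def candidate_sites(probes, min_log2=1.0):
    """Return probe names whose mean enrichment meets a hypothetical cutoff.

    probes maps a probe name to (ChIP intensities, input intensities).
    """
    return [
        name
        for name, (chip, inp) in probes.items()
        if probe_enrichment(chip, inp) >= min_log2
    ]

# Made-up intensities for three replicates of two probes.
example = {
    "probe_a": ([800, 900, 1000], [200, 200, 250]),
    "probe_b": ([210, 190, 200], [200, 200, 200]),
}
print(candidate_sites(example))
```

Averaging across replicates reflects the abstract's point that several replicates are needed before a probe is treated as a potential binding site. A real analysis would normally add normalization and a statistical test rather than a fixed cutoff.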
Genomic variation within alpha satellite DNA influences centromere location on human chromosomes with metastable epialleles

PubMed Central

Aldrup-MacDonald, Megan E.; Kuo, Molly E.; Sullivan, Lori L.; Chew, Kimberline

2016-01-01

Alpha satellite is a tandemly organized type of repetitive DNA that comprises 5% of the genome and is found at all human centromeres. A defined number of 171-bp monomers are organized into chromosome-specific higher-order repeats (HORs) that are reiterated thousands of times. At least half of all human chromosomes have two or more distinct HOR alpha satellite arrays within their centromere regions. We previously showed that the two alpha satellite arrays of Homo sapiens Chromosome 17 (HSA17), D17Z1 and D17Z1-B, behave as centromeric epialleles, that is, the centromere, defined by chromatin containing the centromeric histone variant CENPA and recruitment of other centromere proteins, can form at either D17Z1 or D17Z1-B. Some individuals in the human population are functional heterozygotes in that D17Z1 is the active centromere on one homolog and D17Z1-B is active on the other. In this study, we aimed to understand the molecular basis for how centromere location is determined on HSA17. Specifically, we focused on D17Z1 genomic variation as a driver of epiallele formation. We found that D17Z1 arrays that are predominantly composed of HOR size and sequence variants were functionally less competent. They either recruited decreased amounts of the centromere-specific histone variant CENPA and the HSA17 was mitotically unstable, or alternatively, the centromere was assembled at D17Z1-B and the HSA17 was stable. Our study demonstrates that genomic variation within highly repetitive, noncoding DNA of human centromere regions has a pronounced impact on genome stability and basic chromosomal function. PMID:27510565
HuMiChip: Development of a Functional Gene Array for the Study of Human Microbiomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tu, Q.; Deng, Ye; Lin, Lu

Microbiomes play very important roles in terms of nutrition, health and disease by interacting with their hosts. Based on sequence data currently available in public domains, we have developed a functional gene array to monitor both organismal and functional gene profiles of normal microbiota in human and mouse hosts, and such an array is called human and mouse microbiota array, HMM-Chip. First, seed sequences were identified from KEGG databases, and used to construct a seed database (seedDB) containing 136 gene families in 19 metabolic pathways closely related to human and mouse microbiomes. Second, a mother database (motherDB) was constructed with 81 genomes of bacterial strains with 54 from gut and 27 from oral environments, and 16 metagenomes, and used for selection of genes and probe design. Gene prediction was performed by Glimmer3 for bacterial genomes, and by the Metagene program for metagenomes. In total, 228,240 and 801,599 genes were identified for bacterial genomes and metagenomes, respectively. Then the motherDB was searched against the seedDB using the HMMer program, and gene sequences in the motherDB that were highly homologous with seed sequences in the seedDB were used for probe design by the CommOligo software. Different degrees of specific probes, including gene-specific, inclusive and exclusive group-specific probes were selected. All candidate probes were checked against the motherDB and NCBI databases for specificity. Finally, 7,763 probes covering 91.2% (12,601 out of 13,814) of the HMMer-confirmed sequences from 75 bacterial genomes and 16 metagenomes were selected. This developed HMM-Chip is able to detect the diversity and abundance of functional genes, the gene expression of microbial communities, and potentially, the interactions of microorganisms and their hosts.
Array-based assay detects genome-wide 5-mC and 5-hmC in the brains of humans, non-human primates, and mice.

PubMed

Chopra, Pankaj; Papale, Ligia A; White, Andrew T J; Hatch, Andrea; Brown, Ryan M; Garthwaite, Mark A; Roseboom, Patrick H; Golos, Thaddeus G; Warren, Stephen T; Alisch, Reid S

2014-02-13

Methylation on the fifth position of cytosine (5-mC) is an essential epigenetic mark that is linked to both normal neurodevelopment and neurological diseases. The recent identification of another modified form of cytosine, 5-hydroxymethylcytosine (5-hmC), in both stem cells and post-mitotic neurons, raises new questions as to the role of this base in mediating epigenetic effects. Genomic studies of these marks using model systems are limited, particularly with array-based tools, because the standard method of detecting DNA methylation cannot distinguish between 5-mC and 5-hmC and most methods have been developed to only survey the human genome. We show that non-human data generated using the optimization of a widely used human DNA methylation array, designed only to detect 5-mC, reproducibly distinguishes tissue types within and between chimpanzee, rhesus, and mouse, with correlations near the human DNA level (R² > 0.99). Genome-wide methylation analysis, using this approach, reveals 6,102 differentially methylated loci between rhesus placental and fetal tissues with pathways analysis significantly overrepresented for developmental processes. Restricting the analysis to oncogenes and tumor suppressor genes finds 76 differentially methylated loci, suggesting that rhesus placental tissue carries a cancer epigenetic signature. Similarly, adapting the assay to detect 5-hmC finds highly reproducible 5-hmC levels within human, rhesus, and mouse brain tissue that is species-specific with a hierarchical abundance among the three species (human > rhesus > mouse). Annotation of 5-hmC with respect to gene structure reveals a significant prevalence in the 3'UTR and an association with chromatin-related ontological terms, suggesting an epigenetic feedback loop mechanism for 5-hmC. Together, these data show that this array-based methylation assay is generalizable to all mammals for the detection of both 5-mC and 5-hmC, greatly improving the utility of mammalian model systems to study the role of epigenetics in human health, disease, and evolution.
Genome-to-genome analysis highlights the effect of the human innate and adaptive immune systems on the hepatitis C virus.

PubMed

Ansari, M Azim; Pedergnana, Vincent; L C Ip, Camilla; Magri, Andrea; Von Delft, Annette; Bonsall, David; Chaturvedi, Nimisha; Bartha, Istvan; Smith, David; Nicholson, George; McVean, Gilean; Trebes, Amy; Piazza, Paolo; Fellay, Jacques; Cooke, Graham; Foster, Graham R; Hudson, Emma; McLauchlan, John; Simmonds, Peter; Bowden, Rory; Klenerman, Paul; Barnes, Eleanor; Spencer, Chris C A

2017-05-01

Outcomes of hepatitis C virus (HCV) infection and treatment depend on viral and host genetic factors. Here we use human genome-wide genotyping arrays and new whole-genome HCV viral sequencing technologies to perform a systematic genome-to-genome study of 542 individuals who were chronically infected with HCV, predominantly genotype 3. We show that both alleles of genes encoding human leukocyte antigen molecules and genes encoding components of the interferon lambda innate immune system drive viral polymorphism. Additionally, we show that IFNL4 genotypes determine HCV viral load through a mechanism dependent on a specific amino acid residue in the HCV NS5A protein. These findings highlight the interplay between the innate immune system and the viral genome in HCV control.
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2013-10-01

Microarray intensities were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference. This...additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal of these studies is to expand our number of genomic profiles (DNA and...mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset with which to identify key candidate oncogenes, tumor suppressor genes
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2012-10-01

Microarray intensities were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference...identify candidate drug targets of CPC. Task 1: Generation of additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal...of these studies is to expand our number of genomic profiles (DNA and mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2011-10-01

were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference. This analysis highlights...Task 1: Generation of additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal of these studies is to expand our...number of genomic profiles (DNA and mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset with which to identify key candidate
Golden Gate Assembly of CRISPR gRNA expression array for simultaneously targeting multiple genes.

PubMed

Vad-Nielsen, Johan; Lin, Lin; Bolund, Lars; Nielsen, Anders Lade; Luo, Yonglun

2016-11-01

The engineered CRISPR/Cas9 technology has developed as the most efficient and broadly used genome editing tool. However, simultaneously targeting multiple genes (or genomic loci) in the same individual cells using CRISPR/Cas9 remains a technical challenge. In this article, we have developed a Golden Gate Assembly method for the generation of CRISPR gRNA expression arrays, thus enabling simultaneous gene targeting. Using this method, the generation of a CRISPR gRNA expression array can be accomplished in 2 weeks, and the array can contain up to 30 gRNA expression cassettes. We demonstrated in the study that simultaneous targeting of 10 genomic loci or simultaneous inhibition of multiple endogenous genes could be achieved using the multiplexed gRNA expression array vector in human cells. The complete set of plasmids is available through the non-profit plasmid repository Addgene.
Linkage disequilibrium and signatures of positive selection around LINE-1 retrotransposons in the human genome.

PubMed

Kuhn, Alexandre; Ong, Yao Min; Cheng, Ching-Yu; Wong, Tien Yin; Quake, Stephen R; Burkholder, William F

2014-06-03

Insertions of the human-specific subfamily of LINE-1 (L1) retrotransposon are highly polymorphic across individuals and can critically influence the human transcriptome. We hypothesized that L1 insertions could represent genetic variants determining important human phenotypic traits, and performed an integrated analysis of L1 elements and single nucleotide polymorphisms (SNPs) in several human populations. We found that a large fraction of L1s were in high linkage disequilibrium with their surrounding genomic regions and that they were well tagged by SNPs. However, L1 variants were only partially captured by SNPs on standard SNP arrays, so that their potential phenotypic impact would be frequently missed by SNP array-based genome-wide association studies. We next identified potential phenotypic effects of L1s by looking for signatures of natural selection linked to L1 insertions; significant extended haplotype homozygosity was detected around several L1 insertions. This finding suggests that some of these L1 insertions may have been the target of recent positive selection.
TOXICOGENOMICS AND HUMAN DISEASE RISK ASSESSMENT

EPA Science Inventory

Toxicogenomics and Human Disease Risk Assessment.

Complete sequencing of human and other genomes, availability of large-scale gene expression arrays with ever-increasing numbers of genes displayed, and steady improvements in protein expression technology can hav...
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

PubMed

Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

2017-02-01

Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Learning about human population history from ancient and modern genomes.

PubMed

Stoneking, Mark; Krause, Johannes

2011-08-18

Genome-wide data, both from SNP arrays and from complete genome sequencing, are becoming increasingly abundant and are now even available from extinct hominins. These data are providing new insights into population history; in particular, when combined with model-based analytical approaches, genome-wide data allow direct testing of hypotheses about population history. For example, genome-wide data from both contemporary populations and extinct hominins strongly support a single dispersal of modern humans from Africa, followed by two archaic admixture events: one with Neanderthals somewhere outside Africa and a second with Denisovans that (so far) has only been detected in New Guinea. These new developments promise to reveal new stories about human population history, without having to resort to storytelling.

Whole genome sequences are required to fully resolve the linkage disequilibrium structure of human populations.

PubMed

Pengelly, Reuben J; Tapper, William; Gibson, Jane; Knut, Marcin; Tearle, Rick; Collins, Andrew; Ennis, Sarah

2015-09-03

An understanding of linkage disequilibrium (LD) structures in the human genome underpins much of medical genetics and provides a basis for disease gene mapping and investigating biological mechanisms such as recombination and selection. Whole genome sequencing (WGS) provides the opportunity to determine LD structures at maximal resolution. We compare LD maps constructed from WGS data with LD maps produced from the array-based HapMap dataset, for representative European and African populations. WGS provides up to 5.7-fold greater SNP density than array-based data and achieves much greater resolution of LD structure, allowing for identification of up to 2.8-fold more regions of intense recombination. The absence of ascertainment bias in variant genotyping improves the population representativeness of the WGS maps, and highlights the extent of uncaptured variation using array genotyping methodologies. The complete capture of LD patterns using WGS allows for higher genome-wide association study (GWAS) power compared to array-based GWAS, with WGS also allowing for the analysis of rare variation. The impact of marker ascertainment issues in arrays has been greatest for Sub-Saharan African populations where larger sample sizes and substantially higher marker densities are required to fully resolve the LD structure. WGS provides the best possible resource for LD mapping due to the maximal marker density and lack of ascertainment bias. WGS LD maps provide a rich resource for medical and population genetics studies. The increasing availability of WGS data for large populations will allow for improved research utilising LD, such as GWAS and recombination biology studies.
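
For readers unfamiliar with how pairwise LD is quantified, the following minimal sketch computes the standard coefficient D and the r² statistic for two biallelic SNPs from phased haplotypes. It is not the LD-map method used in the study above: the function name and the 0/1 allele coding are assumptions made for illustration.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased haplotype, with the
    allele at each SNP coded 0 or 1 (a hypothetical encoding).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d, d * d / denom

# Alleles always travel together: complete LD.
linked = [(1, 1)] * 5 + [(0, 0)] * 5
# Every allele combination equally common: no LD.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_stats(linked))
print(ld_stats(unlinked))
```

Because r² can only be computed for variants that were actually genotyped, a denser marker set such as WGS gives more SNP pairs from which to estimate LD structure, which is the comparison the abstract draws against array-based data.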
GENE EXPRESSION PATTERNS ASSOCIATED WITH INFERTILITY IN HUMAN AND RODENT MODELS

EPA Science Inventory

Modern genomic technologies such as DNA arrays provide the means to investigate molecular interactions at an unprecedented level, and arrays have been used to carry out gene expression profiling as a means of identifying candidate genes involved in molecular mechanisms underlying...
Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.

PubMed

Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan

2016-01-01

Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation. Copyright © 2016 by the Genetics Society of America.
Methylation array data can simultaneously identify individuals and convey protected health information: an unrecognized ethical concern.

PubMed

Philibert, Robert A; Terry, Nicolas; Erwin, Cheryl; Philibert, Winter J; Beach, Steven Rh; Brody, Gene H

2014-01-01

Genome-wide methylation arrays are increasingly used tools in studies of complex medical disorders. Because of their expense and potential utility to the scientific community, current federal policy dictates that data from these arrays, like those from genome-wide genotyping arrays, be deposited in publicly available databases. Unlike the genotyping information, access to the expression data is not restricted. An underlying supposition in the current nonrestricted access to methylation data is the belief that protected health and personal identifying information cannot be simultaneously extracted from these arrays. In this communication, we analyze methylation data from the Illumina HumanMethylation450 array and show that genotype at 1,069 highly informative loci, and both alcohol and smoking consumption information, can be derived from the array data. We conclude that both potentially personally identifying information and substance-use histories can be simultaneously derived from methylation array data. Because access to genetic information about a database subject or one of their relatives is critical to the de-identification process, this risk of de-identification is limited at the current time. We propose that access to genome-wide methylation data be restricted to institutionally approved investigators who accede to data use agreements prohibiting re-identification.
Prenatal Diagnosis of DNA Copy Number Variations by Genomic Single-Nucleotide Polymorphism Array in Fetuses with Congenital Heart Defects.

PubMed

Tang, Shaohua; Lv, Jiaojiao; Chen, Xiangnan; Bai, Lili; Li, Huanzheng; Chen, Chong; Wang, Ping; Xu, Xueqin; Lu, Jianxin

2016-01-01

To evaluate the usefulness of single-nucleotide polymorphism (SNP) array for prenatal genetic diagnosis of congenital heart defect (CHD), we used this approach to detect clinically significant copy number variants (CNVs) in fetuses with CHDs. A HumanCytoSNP-12 array was used to analyze genomic samples obtained from 39 fetuses that exhibited cardiovascular abnormalities on ultrasound and had a normal karyotype. The relationship between CNVs and CHDs was identified by using genotype-phenotype comparisons and searching of chromosomal databases. All clinically significant CNVs were confirmed by real-time PCR. CNVs were detected in 38/39 (97.4%) fetuses: variants of unknown significance were detected in 2/39 (5.1%), and clinically significant CNVs were identified in 7/39 (17.9%). In 3 of the 7 fetuses with clinically significant CNVs, 3 rare and previously undescribed CNVs were detected, and these CNVs encompassed the CHD candidate genes FLNA (Xq28 dup), BCOR (Xp11.4 dup), and RBL2 (16q12.2 del). Compared with conventional cytogenetic genomics, SNP array analysis provides significantly improved detection of submicroscopic genomic aberrations in pregnancies with CHDs. Based on these results, we propose that genomic SNP array is an effective method that could be used in prenatal diagnostic testing to assist genetic counseling for pregnancies with CHDs. © 2015 S. Karger AG, Basel.
Identification of novel non-coding small RNAs from Streptococcus pneumoniae TIGR4 using high-resolution genome tiling arrays

PubMed Central

2010-01-01

Background The identification of non-coding transcripts in human, mouse, and Escherichia coli has revealed their widespread occurrence and functional importance in both eukaryotic and prokaryotic life. In prokaryotes, studies have shown that non-coding transcripts participate in a broad range of cellular functions like gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Streptococcus pneumoniae (pneumococcus), an obligate human respiratory pathogen responsible for significant worldwide morbidity and mortality. Tiling microarrays enable genome wide mRNA profiling as well as identification of novel transcripts at a high-resolution. Results Here, we describe a high-resolution transcription map of the S. pneumoniae clinical isolate TIGR4 using genomic tiling arrays. Our results indicate that approximately 66% of the genome is expressed under our experimental conditions. We identified a total of 50 non-coding small RNAs (sRNAs) from the intergenic regions, of which 36 had no predicted function. Half of the identified sRNA sequences were found to be unique to S. pneumoniae genome. We identified eight overrepresented sequence motifs among sRNA sequences that correspond to sRNAs in different functional categories. Tiling arrays also identified approximately 202 operon structures in the genome. Conclusions In summary, the pneumococcal operon structures and novel sRNAs identified in this study enhance our understanding of the complexity and extent of the pneumococcal 'expressed' genome. Furthermore, the results of this study open up new avenues of research for understanding the complex RNA regulatory network governing S. pneumoniae physiology and virulence. PMID:20525227
Genomic Hypomethylation in the Human Germline Associates with Selective Structural Mutability in the Human Genome

PubMed Central

Li, Jian; Harris, R. Alan; Cheung, Sau Wai; Coarfa, Cristian; Jeong, Mira; Goodell, Margaret A.; White, Lisa D.; Patel, Ankita; Kang, Sung-Hae; Shaw, Chad; Chinault, A. Craig; Gambin, Tomasz; Gambin, Anna; Lupski, James R.; Milosavljevic, Aleksandar

2012-01-01

The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ∼1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR–mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease. PMID:22615578
Isolation of human simple repeat loci by hybridization selection.

PubMed

Armour, J A; Neumann, R; Gobert, S; Jeffreys, A J

1994-04-01

We have isolated short tandem repeat arrays from the human genome, using a rapid method involving filter hybridization to enrich for tri- or tetranucleotide tandem repeats. About 30% of clones from the enriched library cross-hybridize with probes containing trimeric or tetrameric tandem arrays, facilitating the rapid isolation of large numbers of clones. In an initial analysis of 54 clones, 46 different tandem arrays were identified. Analysis of these tandem repeat loci by PCR showed that 24 were polymorphic in length; substantially higher levels of polymorphism were displayed by the tetrameric repeat loci isolated than by the trimeric repeats. Primary mapping of these loci by linkage analysis showed that they derive from 17 chromosomes, including the X chromosome. We anticipate the use of this strategy for the efficient isolation of tandem repeats from other sources of genomic DNA, including DNA from flow-sorted chromosomes, and from other species.
A comprehensive transcript index of the human genome generated using microarrays and computational approaches

PubMed Central

Schadt, Eric E; Edwards, Stephen W; GuhaThakurta, Debraj; Holder, Dan; Ying, Lisa; Svetnik, Vladimir; Leonardson, Amy; Hart, Kyle W; Russell, Archie; Li, Guoya; Cavet, Guy; Castle, John; McDonagh, Paul; Kan, Zhengyan; Chen, Ronghua; Kasarskis, Andrew; Margarint, Mihai; Caceres, Ramon M; Johnson, Jason M; Armour, Christopher D; Garrett-Engele, Philip W; Tsinoremas, Nicholas F; Shoemaker, Daniel D

2004-01-01

Background Computational and microarray-based experimental approaches were used to generate a comprehensive transcript index for the human genome. Oligonucleotide probes designed from approximately 50,000 known and predicted transcript sequences from the human genome were used to survey transcription from a diverse set of 60 tissues and cell lines using ink-jet microarrays. Further, expression activity over at least six conditions was more generally assessed using genomic tiling arrays consisting of probes tiled through a repeat-masked version of the genomic sequence making up chromosomes 20 and 22. Results The combination of microarray data with extensive genome annotations resulted in a set of 28,456 experimentally supported transcripts. This set of high-confidence transcripts represents the first experimentally driven annotation of the human genome. In addition, the results from genomic tiling suggest that a large amount of transcription exists outside of annotated regions of the genome and serves as an example of how this activity could be measured on a genome-wide scale. Conclusions These data represent one of the most comprehensive assessments of transcriptional activity in the human genome and provide an atlas of human gene expression over a unique set of gene predictions. Before the annotation of the human genome is considered complete, however, the previously unannotated transcriptional activity throughout the genome must be fully characterized. PMID:15461792
Surface invasive cleavage assay on a maskless light-directed diamond DNA microarray for genome-wide human SNP mapping.

PubMed

Nie, Bei; Yang, Min; Fu, Weiling; Liang, Zhiqing

2015-07-07

The surface invasive cleavage assay, because of its innate accuracy and ability for self-signal amplification, provides a potential route for the mapping of hundreds of thousands of human SNP sites. However, its performance on a high density DNA array has not yet been established, due to the unusual "hairpin" probe design on the microarray and the lack of chemical stability of commercially available substrates. Here we present an applicable method to implement a nanocrystalline diamond thin film as an alternative substrate for fabricating an addressable DNA array using maskless light-directed photochemistry, producing the most chemically stable and biocompatible system for genetic analysis and enzymatic reactions. The surface invasive cleavage reaction, followed by degenerated primer ligation and post-rolling circle amplification is consecutively performed on the addressable diamond DNA array, accurately mapping SNP sites from PCR-amplified human genomic target DNA. Furthermore, a specially-designed DNA array containing dual probes in the same pixel is fabricated by following a reverse light-directed DNA synthesis protocol. This essentially enables us to decipher thousands of SNP alleles in a single-pot reaction by the simple addition of enzyme, target and reaction buffers.
Ribosomal DNA copy number amplification and loss in human cancers is linked to tumor genetic context, nucleolus activity, and proliferation

PubMed Central

2017-01-01

Ribosomal RNAs (rRNAs) are transcribed from two multicopy DNA arrays: the 5S ribosomal DNA (rDNA) array residing in a single human autosome and the 45S rDNA array residing in five human autosomes. The arrays are among the most variable segments of the genome, exhibit concerted copy number variation (cCNV), encode essential components of the ribosome, and modulate global gene expression. Here we combined whole genome data from >700 tumors and paired normal tissues to provide a portrait of rDNA variation in human tissues and cancers of diverse mutational signatures, including stomach and lung adenocarcinomas, ovarian cancers, and others of the TCGA panel. We show that cancers undergo coupled 5S rDNA array expansion and 45S rDNA loss that is accompanied by increased estimates of proliferation rate and nucleolar activity. These somatic changes in rDNA CN occur in a background of over 10-fold naturally occurring rDNA CN variation across individuals and cCNV of 5S-45S arrays in some but not all tissues. Analysis of genetic context revealed associations between cancer rDNA CN amplification or loss and the presence of specific somatic alterations, including somatic SNPs and copy number gain/losses in protein coding genes across the cancer genome. For instance, somatic inactivation of the tumor suppressor gene TP53 emerged with a strong association with coupled 5S expansion / 45S loss in several cancers. Our results uncover frequent and contrasting changes in the 5S and 45S rDNA along rapidly proliferating cell lineages with high nucleolar activity. We suggest that 5S rDNA amplification facilitates increased proliferation, nucleolar activity, and ribosomal synthesis in cancer, whereas 45S rDNA loss emerges as a byproduct of transcription-replication conflict in rapidly replicating tumor cells. The observations raise the prospects of using the rDNA arrays as re-emerging targets for the design of novel strategies in cancer therapy. PMID:28880866
HIA: a genome mapper using hybrid index-based sequence alignment.

PubMed

Choi, Jongpill; Park, Kiejung; Cho, Seong Beom; Chung, Myungguen

2015-01-01

A number of alignment tools have been developed to align sequencing reads to the human reference genome. The scale of information from next-generation sequencing (NGS) experiments, however, is increasing rapidly. Recent studies based on NGS technology have routinely produced exome or whole-genome sequences from several hundreds or thousands of samples. To accommodate the increasing need to analyze very large NGS data sets, it is necessary to develop faster, more sensitive and accurate mapping tools. HIA uses two indices, a hash table index and a suffix array index. The hash table performs direct lookup of a q-gram, and the suffix array performs very fast lookup of variable-length strings by exploiting binary search. We observed that combining the hash table and the suffix array (a hybrid index) is much faster than the suffix array method alone for finding a substring in the reference sequence. Here, we define a matching region (MR) as a longest common substring between a reference and a read, and a candidate alignment region (CAR) as a list of MRs that are close to each other. The hybrid index is used to find CARs between a reference and a read. We found that aligning only the unmatched regions in the CAR is much faster than aligning the whole CAR. In benchmark analysis, HIA outperformed the other aligners in mapping speed, without significant loss of mapping accuracy. Our experiments show that the hybrid of hash table and suffix array is useful in terms of speed for mapping NGS reads to the human reference genome sequence. In conclusion, our tool is appropriate for aligning massive data sets generated by NGS.
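The hybrid index idea in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions, not HIA's actual implementation: the suffix array is built naively, the hash table maps each q-gram to its interval of ranks in the suffix array, and the function names (`build_hybrid_index`, `find`) are hypothetical. The hash lookup narrows the search to one bucket, and binary search inside that bucket handles the variable-length remainder of the pattern.

```python
def build_hybrid_index(text, q):
    """Return (suffix_array, buckets); buckets maps q-gram -> [lo, hi) rank interval."""
    # Naive suffix array construction; adequate only for a toy reference.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    buckets = {}
    for rank, pos in enumerate(sa):
        gram = text[pos:pos + q]
        if len(gram) < q:
            continue  # suffix shorter than q has no q-gram
        lo, _ = buckets.get(gram, (rank, rank))
        buckets[gram] = (lo, rank + 1)  # suffixes sharing a prefix are contiguous
    return sa, buckets


def find(text, sa, buckets, q, pattern):
    """Return sorted start positions of pattern in text (len(pattern) >= q)."""
    if len(pattern) < q:
        raise ValueError("pattern must be at least q characters long")
    lo, hi = buckets.get(pattern[:q], (0, 0))  # direct q-gram lookup
    m = len(pattern)

    def prefix(rank):
        return text[sa[rank]:sa[rank] + m]

    # Lower bound: first rank in the bucket whose m-prefix is >= pattern.
    left, right = lo, hi
    while left < right:
        mid = (left + right) // 2
        if prefix(mid) < pattern:
            left = mid + 1
        else:
            right = mid
    start = left
    # Upper bound: first rank in the bucket whose m-prefix is > pattern.
    right = hi
    while left < right:
        mid = (left + right) // 2
        if prefix(mid) <= pattern:
            left = mid + 1
        else:
            right = mid
    return sorted(sa[start:left])


reference = "ACGTACGTTACG"
sa, buckets = build_hybrid_index(reference, q=3)
print(find(reference, sa, buckets, 3, "ACG"))
print(find(reference, sa, buckets, 3, "ACGTT"))
```

In this sketch the binary search runs only over the ranks in one q-gram bucket rather than over the whole suffix array, which is the kind of saving the abstract attributes to the hybrid index.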
Development and application of a novel genome-wide SNP array reveals domestication history in soybean

PubMed Central

Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

2016-01-01

Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean. PMID:26856884
Oligonucleotide Arrays vs. Metaphase-Comparative Genomic Hybridisation and BAC Arrays for Single-Cell Analysis: First Applications to Preimplantation Genetic Diagnosis for Robertsonian Translocation Carriers

PubMed Central

Ramos, Laia; del Rey, Javier; Daina, Gemma; García-Aragonés, Manel; Armengol, Lluís; Fernandez-Encinas, Alba; Parriego, Mònica; Boada, Montserrat; Martinez-Passarell, Olga; Martorell, Maria Rosa; Casagran, Oriol; Benet, Jordi; Navarro, Joaquima

2014-01-01

Comprehensive chromosome analysis techniques such as metaphase-Comparative Genomic Hybridisation (CGH) and array-CGH are available for single-cell analysis. However, while metaphase-CGH and BAC array-CGH have been widely used for Preimplantation Genetic Diagnosis, oligonucleotide array-CGH has not been used in an extensive way. A comparison between oligonucleotide array-CGH and metaphase-CGH has been performed analysing 15 single fibroblasts from aneuploid cell-lines and 18 single blastomeres from human cleavage-stage embryos. Afterwards, oligonucleotide array-CGH and BAC array-CGH were also compared analysing 16 single blastomeres from human cleavage-stage embryos. All three comprehensive analysis techniques provided broadly similar cytogenetic profiles; however, non-identical profiles appeared when extensive aneuploidies were present in a cell. Both array techniques provided an optimised analysis procedure and a higher resolution than metaphase-CGH. Moreover, oligonucleotide array-CGH was able to define extra segmental imbalances in 14.7% of the blastomeres and it better determined the specific unbalanced chromosome regions due to a higher resolution of the technique (≈20 kb). Applicability of oligonucleotide array-CGH for Preimplantation Genetic Diagnosis has been demonstrated in two cases of Robertsonian translocation carriers 45,XY,der(13;14)(q10;q10). Transfer of euploid embryos was performed in both cases and pregnancy was achieved by one of the couples. This is the first time that an oligonucleotide array-CGH approach has been successfully applied to Preimplantation Genetic Diagnosis for balanced chromosome rearrangement carriers. PMID:25415307
Characterization of Human Cancer Cell Lines by Reverse-phase Protein Arrays

Cancer.gov

Cancer cell lines are major model systems for mechanistic investigation and drug development. However, protein expression data linked to high-quality DNA, RNA, and drug-screening data have not been available across a large number of cancer cell lines. Using reverse-phase protein arrays, we measured expression levels of ∼230 key cancer-related proteins in >650 independent cell lines, many of which have publicly available genomic, transcriptomic, and drug-screening data.
Exploiting sequence similarity to validate the sensitivity of SNP arrays in detecting fine-scaled copy number variations.

PubMed

Wong, Gerard; Leckie, Christopher; Gorringe, Kylie L; Haviv, Izhak; Campbell, Ian G; Kowalczyk, Adam

2010-04-15

High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost effective platforms for the detection of copy number variation (CNV). To ensure accuracy in probe synthesis and to minimize production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements, and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow CNVs reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed. We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilizing sequence similarity results, we identified a collection of fine-scaled putative CNVs between gender from autosomal probesets whose sequence matches various loci on the sex chromosomes. To detect these variations, we utilized our statistical approach, Detecting REcurrent Copy number change using rank-order Statistics (DRECS), and showed that its performance was superior and more stable than the t-test in detecting CNVs. Through the application of DRECS on the HapMap population datasets with multi-matching probesets filtered, we identified biologically relevant SNPs in aberrant regions across populations with known association to physical traits, such as height, covered by the span of a single probe. This provided empirical confirmation of the existence of naturally occurring narrow CNVs as well as the sensitivity of the Affymetrix SNP array technology in detecting them. The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html.
Unexplored therapeutic opportunities in the human genome.

PubMed

Oprea, Tudor I; Bologa, Cristian G; Brunak, Søren; Campbell, Allen; Gan, Gregory N; Gaulton, Anna; Gomez, Shawn M; Guha, Rajarshi; Hersey, Anne; Holmes, Jayme; Jadhav, Ajit; Jensen, Lars Juhl; Johnson, Gary L; Karlson, Anneli; Leach, Andrew R; Ma'ayan, Avi; Malovannaya, Anna; Mani, Subramani; Mathias, Stephen L; McManus, Michael T; Meehan, Terrence F; von Mering, Christian; Muthas, Daniel; Nguyen, Dac-Trung; Overington, John P; Papadatos, George; Qin, Jun; Reich, Christian; Roth, Bryan L; Schürer, Stephan C; Simeonov, Anton; Sklar, Larry A; Southall, Noel; Tomita, Susumu; Tudose, Ilinca; Ursu, Oleg; Vidovic, Dušica; Waller, Anna; Westergaard, David; Yang, Jeremy J; Zahoránszky-Köhalmi, Gergely

2018-05-01

A large proportion of biomedical research and the development of therapeutics is focused on a small fraction of the human genome. In a strategic effort to map the knowledge gaps around proteins encoded by the human genome and to promote the exploration of currently understudied, but potentially druggable, proteins, the US National Institutes of Health launched the Illuminating the Druggable Genome (IDG) initiative in 2014. In this article, we discuss how the systematic collection and processing of a wide array of genomic, proteomic, chemical and disease-related resource data by the IDG Knowledge Management Center have enabled the development of evidence-based criteria for tracking the target development level (TDL) of human proteins, which indicates a substantial knowledge deficit for approximately one out of three proteins in the human proteome. We then present spotlights on the TDL categories as well as key drug target classes, including G protein-coupled receptors, protein kinases and ion channels, which illustrate the nature of the unexplored opportunities for biomedical research and therapeutic development.
Genome-wide comparative diversity uncovers multiple targets of selection for improvement in hexaploid wheat landrace and cultivars

USDA-ARS's Scientific Manuscript database

Domesticated crops have experienced strong human-driven selection aimed at the development of improved varieties adapted to local conditions. To detect regions of the wheat genome subject to selection during improvement, we developed a high-throughput array to interrogate 9,000 gene-associated DNA m...

Detection of pathogenic copy number variants in children with idiopathic intellectual disability using 500 K SNP array genomic hybridization

PubMed Central

2009-01-01

Background Array genomic hybridization is being used clinically to detect pathogenic copy number variants in children with intellectual disability and other birth defects. However, there is no agreement regarding the kind of array, the distribution of probes across the genome, or the resolution that is most appropriate for clinical use. Results We performed 500 K Affymetrix GeneChip® array genomic hybridization in 100 idiopathic intellectual disability trios, each comprised of a child with intellectual disability of unknown cause and both unaffected parents. We found pathogenic genomic imbalance in 16 of these 100 individuals with idiopathic intellectual disability. In comparison, we had found pathogenic genomic imbalance in 11 of 100 children with idiopathic intellectual disability in a previous cohort who had been studied by 100 K GeneChip® array genomic hybridization. Among 54 intellectual disability trios selected from the previous cohort who were re-tested with 500 K GeneChip® array genomic hybridization, we identified all 10 previously-detected pathogenic genomic alterations and at least one additional pathogenic copy number variant that had not been detected with 100 K GeneChip® array genomic hybridization. Many benign copy number variants, including one that was de novo, were also detected with 500 K array genomic hybridization, but it was possible to distinguish the benign and pathogenic copy number variants with confidence in all but 3 (1.9%) of the 154 intellectual disability trios studied. Conclusion Affymetrix GeneChip® 500 K array genomic hybridization detected pathogenic genomic imbalance in 10 of 10 patients with idiopathic developmental disability in whom 100 K GeneChip® array genomic hybridization had found genomic imbalance, 1 of 44 patients in whom 100 K GeneChip® array genomic hybridization had found no abnormality, and 16 of 100 patients who had not previously been tested. Effective clinical interpretation of these studies requires considerable skill and experience. PMID:19917086
The Diversity of REcent and Ancient huMan (DREAM): A New Microarray for Genetic Anthropology and Genealogy, Forensics, and Personalized Medicine

PubMed Central

Yusuf, Leeban; Anderson, Ainan I J; Pirooznia, Mehdi; Arnellos, Dimitrios; Vilshansky, Gregory; Ercal, Gunes; Lu, Yontao; Webster, Teresa; Baird, Michael L; Esposito, Umberto

2017-01-01

The human population displays wide variety in demographic history, ancestry, content of DNA derived from hominins or ancient populations, adaptation, traits, copy number variation, drug response, and more. These polymorphisms are of broad interest to population geneticists, forensics investigators, and medical professionals. Historically, much of that knowledge was gained from population survey projects. Although many commercial arrays exist for genome-wide single-nucleotide polymorphism genotyping, their design specifications are limited and they do not allow a full exploration of biodiversity. We thereby aimed to design the Diversity of REcent and Ancient huMan (DREAM), an all-inclusive microarray that would allow both identification of known associations and exploration of standing questions in genetic anthropology, forensics, and personalized medicine. DREAM includes probes to interrogate ancestry informative markers obtained from over 450 human populations, over 200 ancient genomes, and 10 archaic hominins. DREAM can identify 94% and 61% of all known Y and mitochondrial haplogroups, respectively, and was vetted to avoid interrogation of clinically relevant markers. To demonstrate its capabilities, we compared its FST distributions with those of the 1000 Genomes Project and commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, DREAM’s autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. DREAM performances are further illustrated in biogeographical, identical by descent, and copy number variation analyses. In summary, with approximately 800,000 markers spanning nearly 2,000 genes, DREAM is a useful tool for genetic anthropology, forensic, and personalized medicine studies. PMID:29165562
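The FST comparison described in this abstract can be illustrated with a small calculation. The sketch below assumes Wright's definition for a single biallelic marker, FST = (HT - HS) / HT, with equal-sized subpopulations. The function name and the example frequencies are hypothetical and are not taken from the DREAM study, which does not state its estimator here.

```python
def fst_biallelic(freqs):
    """Wright's FST for one biallelic marker.

    freqs: frequency of the same allele in each subpopulation.
    Assumption: all subpopulations have equal size.
    """
    if not freqs:
        raise ValueError("need at least one subpopulation")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity in the pooled (total) population.
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # marker is monomorphic overall, no differentiation
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies in two subpopulations.
    print(fst_biallelic([0.2, 0.8]))
```

A marker with identical frequencies in every subpopulation gives 0, and a marker fixed for different alleles gives 1, so a higher mean FST across an array indicates markers that separate subpopulations more strongly.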
Identification of a unique library of complex, but ordered, arrays of repetitive elements in the human genome and implication of their potential involvement in pathobiology.

PubMed

Lee, Kang-Hoon; Lee, Young-Kwan; Kwon, Deug-Nam; Chiu, Sophia; Chew, Victoria; Rah, Hyungchul; Kujawski, Gregory; Melhem, Ramzi; Hsu, Karen; Chung, Cecilia; Greenhalgh, David G; Cho, Kiho

2011-06-01

Approximately 2% of the human genome is reported to be occupied by genes. Various forms of repetitive elements (REs), both characterized and uncharacterized, are presumed to make up the vast majority of the rest of the genomes of human and other species. In conjunction with a comprehensive annotation of genes, information regarding components of genome biology, such as gene polymorphisms, non-coding RNAs, and certain REs, is found in human genome databases. However, the genome-wide profile of unique RE arrangements formed by different groups of REs has not been fully characterized yet. In this study, the entire human genome was subjected to an unbiased RE survey to establish a whole-genome profile of REs and their arrangements. Due to the limitation in query size within the bl2seq alignment program (National Center for Biotechnology Information [NCBI]) utilized for the RE survey, the entire NCBI reference human genome was fragmented into 6206 units of 0.5M nucleotides. A number of RE arrangements with varying complexities and patterns were identified throughout the genome. Each chromosome had unique profiles of RE arrangements and density, and high levels of RE density were measured near the centromere regions. Subsequently, 175 complex RE arrangements, which were selected throughout the genome, were subjected to a comparison analysis using five different human genome sequences. Interestingly, three of the five human genome databases shared exactly the same arrangement patterns and sequences for all 175 RE arrangement regions (a total of 12,765,625 nucleotides). The findings from this study demonstrate that a substantial fraction of REs in the human genome are clustered into various forms of ordered structures. Further investigations are needed to examine whether some of these ordered RE arrangements contribute to human pathobiology as a functional genome unit. Copyright © 2011 Elsevier Inc. All rights reserved.
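The fragmentation step in this abstract (6206 units of 0.5M nucleotides, which covers at most about 3.1 billion nucleotides) can be sketched as fixed-size windowing. This is a minimal illustration that assumes each chromosome is cut independently into half-open windows. The function names and toy chromosome lengths are hypothetical, and the authors' actual procedure is not described in the abstract.

```python
def fragment_windows(length, unit=500_000):
    """Yield half-open (start, end) windows covering positions 0..length."""
    if unit <= 0:
        raise ValueError("unit must be positive")
    for start in range(0, length, unit):
        # The last window of a chromosome may be shorter than `unit`.
        yield start, min(start + unit, length)


def count_units(chrom_lengths, unit=500_000):
    """Total number of windows when every chromosome is cut separately."""
    return sum(
        len(list(fragment_windows(n, unit))) for n in chrom_lengths.values()
    )


if __name__ == "__main__":
    # Toy lengths, not real chromosome sizes.
    toy = {"chrA": 1_200_000, "chrB": 500_000}
    print(count_units(toy))
```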
DNA isolation protocol effects on nuclear DNA analysis by microarrays, droplet digital PCR, and whole genome sequencing, and on mitochondrial DNA copy number estimation.

PubMed

Nacheva, Elizabeth; Mokretar, Katya; Soenmez, Aynur; Pittman, Alan M; Grace, Colin; Valli, Roberto; Ejaz, Ayesha; Vattathil, Selina; Maserati, Emanuela; Houlden, Henry; Taanman, Jan-Willem; Schapira, Anthony H; Proukakis, Christos

2017-01-01

Potential bias introduced during DNA isolation is inadequately explored, although it could have significant impact on downstream analysis. To investigate this in human brain, we isolated DNA from cerebellum and frontal cortex using spin columns under different conditions, and salting-out. We first analysed DNA using array CGH, which revealed a striking wave pattern suggesting primarily GC-rich cerebellar losses, even against matched frontal cortex DNA, with a similar pattern on a SNP array. The aCGH changes varied with the isolation protocol. Droplet digital PCR of two genes also showed protocol-dependent losses. Whole genome sequencing showed GC-dependent variation in coverage with spin column isolation from cerebellum. We also extracted and sequenced DNA from substantia nigra using salting-out and phenol / chloroform. The mtDNA copy number, assessed by reads mapping to the mitochondrial genome, was higher in substantia nigra when using phenol / chloroform. We thus provide evidence for significant method-dependent bias in DNA isolation from human brain, as reported in rat tissues. This may contribute to array "waves", and could affect copy number determination, particularly if mosaicism is being sought, and sequencing coverage. Variations in isolation protocol may also affect apparent mtDNA abundance.
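The abstract states that mtDNA copy number was assessed from reads mapping to the mitochondrial genome but does not give the formula. A commonly used estimate, shown here as an assumption rather than the authors' method, scales the ratio of mitochondrial to autosomal coverage by the nuclear ploidy. The function names and input values are hypothetical.

```python
# Length of the revised Cambridge Reference Sequence (rCRS) in base pairs.
MT_GENOME_LENGTH = 16_569


def mean_coverage(mapped_reads, read_length, region_length):
    """Approximate mean depth: total mapped bases divided by region size."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return mapped_reads * read_length / region_length


def mtdna_copy_number(mt_coverage, autosomal_coverage, ploidy=2):
    """Estimated mtDNA copies per cell, assuming a diploid nuclear genome."""
    if autosomal_coverage <= 0:
        raise ValueError("autosomal_coverage must be positive")
    return ploidy * mt_coverage / autosomal_coverage


if __name__ == "__main__":
    # Hypothetical depths: 3000x on mtDNA, 30x on autosomes.
    print(mtdna_copy_number(3000.0, 30.0))
```

Because the estimate is a ratio of depths, any isolation method that changes the relative recovery of mitochondrial and nuclear DNA shifts the result, which is consistent with the protocol dependence the abstract reports.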
DNA isolation protocol effects on nuclear DNA analysis by microarrays, droplet digital PCR, and whole genome sequencing, and on mitochondrial DNA copy number estimation

PubMed Central

Nacheva, Elizabeth; Mokretar, Katya; Soenmez, Aynur; Pittman, Alan M.; Grace, Colin; Valli, Roberto; Ejaz, Ayesha; Vattathil, Selina; Maserati, Emanuela; Houlden, Henry; Taanman, Jan-Willem; Schapira, Anthony H.

2017-01-01

Potential bias introduced during DNA isolation is inadequately explored, although it could have significant impact on downstream analysis. To investigate this in human brain, we isolated DNA from cerebellum and frontal cortex using spin columns under different conditions, and salting-out. We first analysed DNA using array CGH, which revealed a striking wave pattern suggesting primarily GC-rich cerebellar losses, even against matched frontal cortex DNA, with a similar pattern on a SNP array. The aCGH changes varied with the isolation protocol. Droplet digital PCR of two genes also showed protocol-dependent losses. Whole genome sequencing showed GC-dependent variation in coverage with spin column isolation from cerebellum. We also extracted and sequenced DNA from substantia nigra using salting-out and phenol / chloroform. The mtDNA copy number, assessed by reads mapping to the mitochondrial genome, was higher in substantia nigra when using phenol / chloroform. We thus provide evidence for significant method-dependent bias in DNA isolation from human brain, as reported in rat tissues. This may contribute to array “waves”, and could affect copy number determination, particularly if mosaicism is being sought, and sequencing coverage. Variations in isolation protocol may also affect apparent mtDNA abundance. PMID:28683077
A Mismatch EndoNuclease Array-Based Methodology (MENA) for Identifying Known SNPs or Novel Point Mutations.

PubMed

Comeron, Josep M; Reed, Jordan; Christie, Matthew; Jacobs, Julia S; Dierdorff, Jason; Eberl, Daniel F; Manak, J Robert

2016-04-05

Accurate and rapid identification or confirmation of single nucleotide polymorphisms (SNPs), point mutations and other human genomic variation facilitates understanding the genetic basis of disease. We have developed a new methodology (called MENA (Mismatch EndoNuclease Array)) pairing DNA mismatch endonuclease enzymology with tiling microarray hybridization in order to genotype both known point mutations (such as SNPs) as well as identify previously undiscovered point mutations and small indels. We show that our assay can rapidly genotype known SNPs in a human genomic DNA sample with 99% accuracy, in addition to identifying novel point mutations and small indels with a false discovery rate as low as 10%. Our technology provides a platform for a variety of applications, including: (1) genotyping known SNPs as well as confirming newly discovered SNPs from whole genome sequencing analyses; (2) identifying novel point mutations and indels in any genomic region from any organism for which genome sequence information is available; and (3) screening panels of genes associated with particular diseases and disorders in patient samples to identify causative mutations. As a proof of principle for using MENA to discover novel mutations, we report identification of a novel allele of the beethoven (btv) gene in Drosophila, which encodes a ciliary cytoplasmic dynein motor protein important for auditory mechanosensation.
Integrated analysis of copy number alteration and RNA expression profiles of cancer using a high-resolution whole-genome oligonucleotide array.

PubMed

Jung, Seung-Hyun; Shin, Seung-Hun; Yim, Seon-Hee; Choi, Hye-Sun; Lee, Sug-Hyung; Chung, Yeun-Jun

2009-07-31

Recently, microarray-based comparative genomic hybridization (array-CGH) has emerged as a very efficient technology with higher resolution for the genome-wide identification of copy number alterations (CNA). Although CNAs are thought to affect gene expression, there is no platform currently available for the integrated CNA-expression analysis. To achieve high-resolution copy number analysis integrated with expression profiles, we established a human 30k oligoarray-based genome-wide copy number analysis system and explored the applicability of this system for integrated genome and transcriptome analysis using the MDA-MB-231 cell line. We compared the CNAs detected by the oligoarray with those detected by the 3k BAC array for validation. The oligoarray identified the single copy difference more accurately and sensitively than the BAC array. Seventeen CNAs detected by both platforms in MDA-MB-231 such as gains of 5p15.33-13.1, 8q11.22-8q21.13, 17p11.2, and losses of 1p32.3, 8p23.3-8p11.21, and 9p21 were consistently identified in previous studies on breast cancer. There were 122 other small CNAs (mean size 1.79 Mb) that were detected by oligoarray only, not by BAC-array. We performed genomic qPCR targeting 7 CNA regions, detected by oligoarray only, and one non-CNA region to validate the oligoarray CNA detection. All qPCR results were consistent with the oligoarray-CGH results. When we explored the possibility of combined interpretation of both DNA copy number and RNA expression profiles, mean DNA copy number and RNA expression levels showed a significant correlation. In conclusion, this 30k oligoarray-CGH system can be a reasonable choice for analyzing whole genome CNAs and RNA expression profiles at a lower cost.
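The "single copy difference" that array-CGH has to resolve can be made concrete with expected log2 ratios. The sketch below assumes a pure diploid reference and no normal-cell contamination. The calling thresholds are hypothetical illustration values and are not those used in the study.

```python
import math


def expected_log2_ratio(test_copies, reference_copies=2):
    """Ideal array-CGH log2 ratio for a region with `test_copies` copies."""
    if test_copies <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive")
    return math.log2(test_copies / reference_copies)


def call_segment(log2_ratio, gain=0.3, loss=-0.3):
    """Label a segment using simple, hypothetical thresholds."""
    if log2_ratio >= gain:
        return "gain"
    if log2_ratio <= loss:
        return "loss"
    return "neutral"


if __name__ == "__main__":
    for copies in (1, 2, 3):
        ratio = expected_log2_ratio(copies)
        print(copies, round(ratio, 3), call_segment(ratio))
```

A single-copy gain gives a ratio of about 0.585 while a single-copy loss gives -1, so gains are the harder case and benefit most from lower probe noise.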
A Complex 6p25 Rearrangement in a Child With Multiple Epiphyseal Dysplasia

PubMed Central

Bedoyan, Jirair K.; Lesperance, Marci M.; Ackley, Todd; Iyer, Ramaswamy K.; Innis, Jeffrey W.; Misra, Vinod K.

2015-01-01

Genomic rearrangements are increasingly recognized as important contributors to human disease. Here we report on an 11½-year-old child with myopia, Duane retraction syndrome, bilateral mixed hearing loss, skeletal anomalies including multiple epiphyseal dysplasia, and global developmental delay, and a complex 6p25 genomic rearrangement. We have employed oligonucleotide-based comparative genomic hybridization arrays (aCGH) of different resolutions (44 and 244K) as well as a 1 M single nucleotide polymorphism (SNP) array to analyze this complex rearrangement. Our analyses reveal a complex rearrangement involving a ~2.21 Mb interstitial deletion, a ~240 kb terminal deletion, and a 70–80 kb region in between these two deletions that shows maintenance of genomic copy number. The interstitial deletion contains eight known genes, including three Forkhead box containing (FOX) transcription factors (FOXQ1, FOXF2, and FOXC1). The region maintaining genomic copy number partly overlaps the dual specificity protein phosphatase 22 (DUSP22) gene. Array analyses suggest a homozygous loss of genomic material at the 5′ end of DUSP22, which was corroborated using TaqMan® copy number analysis. It is possible that this homozygous genomic loss may render both copies of DUSP22 or its products non-functional. Our analysis suggests a rearrangement mechanism distinct from a previously reported replication-based error-prone mechanism without template switching for a specific 6p25 rearrangement with a 1.22 Mb interstitial deletion. Our study demonstrates the utility and limitations of using oligonucleotide-based aCGH and SNP array technologies of increasing resolutions in order to identify complex DNA rearrangements and gene disruptions. PMID:21204225
Analysis of Chinese women with primary ovarian insufficiency by high resolution array-comparative genomic hybridization.

PubMed

Liao, Can; Fu, Fang; Yang, Xin; Sun, Yi-Min; Li, Dong-Zhi

2011-06-01

Primary ovarian insufficiency (POI) is defined as a primary ovarian defect characterized by absent menarche (primary amenorrhea) or premature depletion of ovarian follicles before the age of 40 years. The etiology of primary ovarian insufficiency in human female patients is still unclear. The purpose of this study is to investigate the potential genetic causes in primary amenorrhea patients by high resolution array based comparative genomic hybridization (array-CGH) analysis. Following the standard karyotyping analysis, genomic DNA from whole blood of 15 primary amenorrhea patients and 15 normal control women was hybridized with Affymetrix cytogenetic 2.7M arrays following the standard protocol. Copy number variations identified by array-CGH were confirmed by real time polymerase chain reaction. All the 30 samples were negative by conventional karyotyping analysis. Microdeletions on chromosome 17q21.31-q21.32 with approximately 1.3 Mb were identified in four patients by high resolution array-CGH analysis. This included the female reproductive secretory pathway related factor N-ethylmaleimide-sensitive factor (NSF) gene. The results of the present study suggest that there may be critical regions regulating primary ovarian insufficiency in women with a 17q21.31-q21.32 microdeletion. This effect might be due to the loss of function of the NSF gene/genes within the deleted region or to effects on contiguous genes.
Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms.

PubMed

Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro

2010-04-27

To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7 x the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process.
We also found several genomic regions decreasing genetic diversity which might be caused by a recent human selection in rice breeding. The definition of pedigree haplotypes by means of genome-wide SNPs will facilitate next-generation breeding of rice and other crops.
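The sequencing depth quoted in this abstract (5.89 Gb, equivalent to 15.7 x the rice genome) implies a reference size of roughly 375 Mb. A minimal sketch of that arithmetic follows. The function names are hypothetical and the calculation ignores unmapped or duplicate reads.

```python
def fold_coverage(total_bases, genome_size):
    """Average depth: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def implied_genome_size(total_bases, coverage):
    """Genome size implied by a stated yield and fold coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_bases / coverage


if __name__ == "__main__":
    size = implied_genome_size(5.89e9, 15.7)
    print(f"implied genome size: {size / 1e6:.1f} Mb")
```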
Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates

PubMed Central

Yuan, Bo; Liu, Pengfei; Gupta, Aditya; Beck, Christine R.; Tejomurtula, Anusha; Campbell, Ian M.; Gambin, Tomasz; Simmons, Alexandra D.; Withers, Marjorie A.; Harris, R. Alan; Rogers, Jeffrey; Schwartz, David C.; Lupski, James R.

2015-01-01

Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. PMID:26641089
An optimized library for reference-based deconvolution of whole-blood biospecimens assayed using the Illumina HumanMethylationEPIC BeadArray.

PubMed

Salas, Lucas A; Koestler, Devin C; Butler, Rondi A; Hansen, Helen M; Wiencke, John K; Kelsey, Karl T; Christensen, Brock C

2018-05-29

Genome-wide methylation arrays are powerful tools for assessing cell composition of complex mixtures. We compare three approaches to select reference libraries for deconvoluting neutrophil, monocyte, B-lymphocyte, natural killer, and CD4+ and CD8+ T-cell fractions based on blood-derived DNA methylation signatures assayed using the Illumina HumanMethylationEPIC array. The IDOL algorithm identifies a library of 450 CpGs, resulting in an average R² = 99.2 across cell types when applied to EPIC methylation data collected on artificial mixtures constructed from the above cell types. Of the 450 CpGs, 69% are unique to EPIC. This library has the potential to reduce unintended technical differences across array platforms.
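The R² figure in this abstract summarizes how well estimated cell fractions match the known fractions in artificial mixtures. The abstract does not say how R² was computed, so the sketch below assumes the ordinary coefficient of determination for one cell type across several mixtures. The function name and example fractions are hypothetical.

```python
def r_squared(true_vals, estimated_vals):
    """Coefficient of determination between known and estimated fractions."""
    if len(true_vals) != len(estimated_vals) or not true_vals:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(true_vals) / len(true_vals)
    # Residual sum of squares: error of the estimates.
    ss_res = sum((t - e) ** 2 for t, e in zip(true_vals, estimated_vals))
    # Total sum of squares: spread of the known fractions.
    ss_tot = sum((t - mean_true) ** 2 for t in true_vals)
    if ss_tot == 0:
        raise ValueError("true values have no variance")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical neutrophil fractions in three artificial mixtures.
    known = [0.0, 0.5, 1.0]
    estimated = [0.1, 0.5, 0.9]
    print(r_squared(known, estimated))
```

Averaging this value over the six cell types and expressing it as a percentage would give a single summary comparable in form to the reported figure.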
Human mitochondrial DNA: roles of inherited and somatic mutations

PubMed Central

Schon, Eric A.; DiMauro, Salvatore; Hirano, Michio

2014-01-01

Mutations in the human mitochondrial genome are known to cause an array of diverse disorders, most of which are maternally inherited, and all of which are associated with defects in oxidative energy metabolism. It is now emerging that somatic mutations in mitochondrial DNA (mtDNA) are also linked to other complex traits, including neurodegenerative diseases, ageing and cancer. Here we discuss insights into the roles of mtDNA mutations in a wide variety of diseases, highlighting the interesting genetic characteristics of the mitochondrial genome and challenges in studying its contribution to pathogenesis. PMID:23154810
dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

PubMed

Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

2016-01-01

The recent advancement of the next generation sequencing technology has enabled the fast and low-cost detection of all genetic variants spreading across the entire human genome, making the application of whole-genome sequencing a trend in the study of disease-causing genetic variants. Nevertheless, there is still no repository that collects predictions of functionally damaging effects of human genetic variants, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. © The Author(s) 2016. Published by Oxford University Press.
The Diversity of REcent and Ancient huMan (DREAM): A New Microarray for Genetic Anthropology and Genealogy, Forensics, and Personalized Medicine.

PubMed

Elhaik, Eran; Yusuf, Leeban; Anderson, Ainan I J; Pirooznia, Mehdi; Arnellos, Dimitrios; Vilshansky, Gregory; Ercal, Gunes; Lu, Yontao; Webster, Teresa; Baird, Michael L; Esposito, Umberto

2017-12-01

The human population displays wide variety in demographic history, ancestry, content of DNA derived from hominins or ancient populations, adaptation, traits, copy number variation, drug response, and more. These polymorphisms are of broad interest to population geneticists, forensics investigators, and medical professionals. Historically, much of that knowledge was gained from population survey projects. Although many commercial arrays exist for genome-wide single-nucleotide polymorphism genotyping, their design specifications are limited and they do not allow a full exploration of biodiversity. We thereby aimed to design the Diversity of REcent and Ancient huMan (DREAM)-an all-inclusive microarray that would allow both identification of known associations and exploration of standing questions in genetic anthropology, forensics, and personalized medicine. DREAM includes probes to interrogate ancestry informative markers obtained from over 450 human populations, over 200 ancient genomes, and 10 archaic hominins. DREAM can identify 94% and 61% of all known Y and mitochondrial haplogroups, respectively, and was vetted to avoid interrogation of clinically relevant markers. To demonstrate its capabilities, we compared its FST distributions with those of the 1000 Genomes Project and commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, DREAM's autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. DREAM performances are further illustrated in biogeographical, identical by descent, and copy number variation analyses. In summary, with approximately 800,000 markers spanning nearly 2,000 genes, DREAM is a useful tool for genetic anthropology, forensic, and personalized medicine studies. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias

PubMed Central

2012-01-01

Background High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information - as much as 50% in a recent genome-wide association study in dogs. Results We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO) probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno) that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains. 
Conclusion The problems of ascertainment bias and missing information due to genotyping errors are widely recognized as limiting factors in genetic studies. We have conducted the first formal analysis of the effect of novel variants on genotyping arrays, and we have shown that these variants account for a large portion of miscalled and uncalled genotypes. Genetic studies will benefit from substantial improvements in the accuracy of their results by incorporating VINOs in their analyses. PMID:22260749
Diversity arrays technology: a generic genome profiling technology on open platforms.

PubMed

Kilian, Andrzej; Wenzl, Peter; Huttner, Eric; Carling, Jason; Xia, Ling; Blois, Hélène; Caig, Vanessa; Heller-Uszynska, Katarzyna; Jaccoud, Damian; Hopper, Colleen; Aschenbrenner-Kilian, Malgorzata; Evers, Margaret; Peng, Kaiman; Cayla, Cyril; Hok, Puthick; Uszynski, Grzegorz

2012-01-01

In the last 20 years, we have observed an exponential growth of the DNA sequence data and a similar increase in the volume of DNA polymorphism data generated by numerous molecular marker technologies. Most of the investment, and therefore progress, concentrated on the human genome and genomes of selected model species. Diversity Arrays Technology (DArT), developed over a decade ago, was among the first "democratizing" genotyping technologies, as its performance was primarily driven by the level of DNA sequence variation in the species rather than by the level of financial investment. DArT also proved more robust to genome size and ploidy-level differences among approximately 60 organisms for which DArT was developed to date compared to other high-throughput genotyping technologies. The success of DArT in a number of organisms, including a wide range of "orphan crops," can be attributed to the simplicity of underlying concepts: DArT combines genome complexity reduction methods enriching for genic regions with a highly parallel assay readout on a number of "open-access" microarray platforms. The quantitative nature of the assay enabled a number of applications in which allelic frequencies can be estimated from DArT arrays. A typical DArT assay tests tens of thousands of genomic loci for polymorphism, with the final number of markers reported (hundreds to thousands) reflecting the level of DNA sequence variation in the tested loci. Detailed DArT methods, protocols, and a range of their application examples as well as DArT's evolution path are presented.
A new age in functional genomics using CRISPR/Cas9 in arrayed library screening.

PubMed

Agrotis, Alexander; Ketteler, Robin

2015-01-01

CRISPR technology has rapidly changed the face of biological research, such that precise genome editing has now become routine for many labs within several years of its initial development. What makes CRISPR/Cas9 so revolutionary is the ability to target a protein (Cas9) to an exact genomic locus, through designing a specific short complementary nucleotide sequence, that together with a common scaffold sequence, constitute the guide RNA bridging the protein and the DNA. Wild-type Cas9 cleaves both DNA strands at its target sequence, but this protein can also be modified to exert many other functions. For instance, by attaching an activation domain to catalytically inactive Cas9 and targeting a promoter region, it is possible to stimulate the expression of a specific endogenous gene. In principle, any genomic region can be targeted, and recent efforts have successfully generated pooled guide RNA libraries for coding and regulatory regions of human, mouse and Drosophila genomes with high coverage, thus facilitating functional phenotypic screening. In this review, we will highlight recent developments in the area of CRISPR-based functional genomics and discuss potential future directions, with a special focus on mammalian cell systems and arrayed library screening.
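The targeting step described in this abstract, a short complementary sequence that directs Cas9 to a locus, can be illustrated by scanning a sequence for candidate target sites. The sketch below relies on assumptions not stated in the abstract: the commonly used Streptococcus pyogenes Cas9, a 20-nucleotide protospacer, and an NGG PAM immediately 3' of it. Only the forward strand is scanned, and the function name and example sequence are hypothetical.

```python
import re


def find_protospacers(seq, spacer_len=20):
    """Return (position, protospacer, PAM) for each forward-strand NGG site."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence containing one candidate site.
    example = "A" * 20 + "TGG"
    for pos, spacer, pam in find_protospacers(example):
        print(pos, spacer, pam)
```

A real library design would also scan the reverse complement and score each candidate for off-target matches, which this sketch leaves out.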
Restriction Site Tiling Analysis: accurate discovery and quantitative genotyping of genome-wide polymorphisms using nucleotide arrays

PubMed Central

2010-01-01

High-throughput genotype data can be used to identify genes important for local adaptation in wild populations, phenotypes in lab stocks, or disease-related traits in human medicine. Here we advance microarray-based genotyping for population genomics with Restriction Site Tiling Analysis. The approach simultaneously discovers polymorphisms and provides quantitative genotype data at 10,000s of loci. It is highly accurate and free from ascertainment bias. We apply the approach to uncover genomic differentiation in the purple sea urchin. PMID:20403197
Synthetic Genetic Arrays: Automation of Yeast Genetics.

PubMed

Kuzmin, Elena; Costanzo, Michael; Andrews, Brenda; Boone, Charles

2016-04-01

Genome-sequencing efforts have led to great strides in the annotation of protein-coding genes and other genomic elements. The current challenge is to understand the functional role of each gene and how genes work together to modulate cellular processes. Genetic interactions define phenotypic relationships between genes and reveal the functional organization of a cell. Synthetic genetic array (SGA) methodology automates yeast genetics and enables large-scale and systematic mapping of genetic interaction networks in the budding yeast, Saccharomyces cerevisiae. SGA facilitates construction of an output array of double mutants from an input array of single mutants through a series of replica pinning steps. Subsequent analysis of genetic interactions from SGA-derived mutants relies on accurate quantification of colony size, which serves as a proxy for fitness. Since its development, SGA has given rise to a variety of other experimental approaches for functional profiling of the yeast genome and has been applied in a multitude of other contexts, such as genome-wide screens for synthetic dosage lethality and integration with high-content screening for systematic assessment of morphology defects. SGA-like strategies can also be implemented similarly in a number of other cell types and organisms, including Schizosaccharomyces pombe, Escherichia coli, Caenorhabditis elegans, and human cancer cell lines. The genetic networks emerging from these studies not only generate functional wiring diagrams but may also play a key role in our understanding of the complex relationship between genotype and phenotype. © 2016 Cold Spring Harbor Laboratory Press.
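The abstract above describes colony size as a proxy for fitness but does not give a scoring formula. As a minimal sketch, the Python below assumes one commonly used definition, a multiplicative model in which the interaction score is the observed double-mutant fitness minus the product of the single-mutant fitnesses. The function names and the colony sizes are hypothetical and are not taken from the SGA protocol.

```python
def relative_fitness(colony_size: float, wild_type_size: float) -> float:
    """Colony size normalised to a wild-type control (assumed fitness proxy)."""
    if wild_type_size <= 0:
        raise ValueError("wild-type colony size must be positive")
    return colony_size / wild_type_size


def interaction_score(f_a: float, f_b: float, f_ab: float) -> float:
    """Multiplicative-model score: observed minus expected double-mutant fitness.

    Negative values suggest a synthetic sick/lethal interaction,
    positive values a suppressing or masking interaction.
    """
    return f_ab - f_a * f_b


# Invented colony sizes (arbitrary pixel areas), for illustration only
WT = 1000.0
f_a = relative_fitness(800.0, WT)    # single mutant a
f_b = relative_fitness(900.0, WT)    # single mutant b
f_ab = relative_fitness(400.0, WT)   # double mutant ab
epsilon = interaction_score(f_a, f_b, f_ab)
```

Here the expected double-mutant fitness is 0.8 × 0.9 = 0.72, so an observed value of 0.4 gives a clearly negative score. Other null models (additive, minimum) exist and would change the sign threshold.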

Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.

PubMed

Delaneau, Olivier; Marchini, Jonathan

2014-06-13

A major use of the 1000 Genomes Project (1000 GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000 GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.
Genome-wide single-nucleotide polymorphism arrays demonstrate high fidelity of multiple displacement-based whole-genome amplification.

PubMed

Tzvetkov, Mladen V; Becker, Christian; Kulle, Bettina; Nürnberg, Peter; Brockmöller, Jürgen; Wojnowski, Leszek

2005-02-01

Whole-genome DNA amplification by multiple displacement (MD-WGA) is a promising tool to obtain sufficient DNA amounts from samples of limited quantity. Using Affymetrix' GeneChip Human Mapping 10K Arrays, we investigated the accuracy and allele amplification bias in DNA samples subjected to MD-WGA. We observed an excellent concordance (99.95%) between single-nucleotide polymorphisms (SNPs) called both in the nonamplified and the corresponding amplified DNA. This concordance was only 0.01% lower than the intra-assay reproducibility of the genotyping technique used. However, MD-WGA failed to amplify an estimated 7% of polymorphic loci. Due to the algorithm used to call genotypes, this was detected only for heterozygous loci. We achieved a 4.3-fold reduction of noncalled SNPs by combining the results from two independent MD-WGA reactions. This indicated that inter-reaction variations rather than specific chromosomal loci reduced the efficiency of MD-WGA. Consistently, we detected no regions of reduced amplification, with the exception of several SNPs located near chromosomal ends. Altogether, despite a substantial loss of polymorphic sites, MD-WGA appears to be the current method of choice to amplify genomic DNA for array-based SNP analyses. The number of nonamplified loci can be substantially reduced by amplifying each DNA sample in duplicate.
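The concordance and no-call figures reported above can be illustrated with a short Python sketch. It compares genotype calls from a nonamplified and an amplified sample, then merges two replicate reactions so that a SNP is lost only if both reactions fail. The `None` no-call encoding, the merge policy and the toy genotypes are assumptions for illustration, not the authors' genotype-calling pipeline.

```python
from typing import List, Optional, Sequence

Call = Optional[str]  # e.g. "AA", "AB", "BB"; None marks a no-call (assumed encoding)


def concordance(ref: Sequence[Call], test: Sequence[Call]) -> float:
    """Fraction of identical genotypes among SNPs called in both samples."""
    both = [(r, t) for r, t in zip(ref, test) if r is not None and t is not None]
    if not both:
        raise ValueError("no SNPs were called in both samples")
    return sum(r == t for r, t in both) / len(both)


def merge_replicates(rep1: Sequence[Call], rep2: Sequence[Call]) -> List[Call]:
    """Keep a call if either replicate produced one.

    Conflicting calls become None (a conservative, assumed policy).
    """
    merged: List[Call] = []
    for a, b in zip(rep1, rep2):
        if a is None:
            merged.append(b)
        elif b is None or a == b:
            merged.append(a)
        else:
            merged.append(None)
    return merged


def no_call_rate(calls: Sequence[Call]) -> float:
    """Fraction of SNPs without a genotype call."""
    return sum(c is None for c in calls) / len(calls)


# Toy genotypes, invented for illustration only
native = ["AA", "AB", "BB", "AB", "AA", "BB"]
wga_1 = ["AA", None, "BB", "AB", "AA", None]
wga_2 = ["AA", "AB", "BB", None, "AA", None]
merged = merge_replicates(wga_1, wga_2)
```

In this toy case each single reaction misses two of six SNPs, while the merged set misses only the one SNP that failed in both reactions.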
DOE Office of Scientific and Technical Information (OSTI.GOV)

Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.

Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Computationally expanding infinium HumanMethylation450 BeadChip array data to reveal distinct DNA methylation patterns of rheumatoid arthritis

PubMed Central

Li, Chengzhe; Ai, Rizi; Wang, Mengchi; Firestein, Gary S.; Wang, Wei

2016-01-01

Motivation: DNA methylation signatures in rheumatoid arthritis (RA) have been identified in fibroblast-like synoviocytes (FLS) with the Illumina HumanMethylation450 array. Since <2% of CpG sites are covered by the Illumina 450K array and whole genome bisulfite sequencing is still too expensive for many samples, computationally predicting DNA methylation levels based on 450K data would be valuable to discover more RA-related genes. Results: We developed a computational model that is trained on 14 tissues with both whole genome bisulfite sequencing and 450K array data. This model integrates information derived from the similarity of local methylation patterns between tissues, the methylation information of flanking CpG sites and the methylation tendency of flanking DNA sequences. The predicted and measured methylation values were highly correlated, with a Pearson correlation coefficient of 0.9 in leave-one-tissue-out cross-validations. Importantly, the majority (76%) of the top 10% differentially methylated loci among the 14 tissues were correctly detected using the predicted methylation values. Applying this model to 450K data of RA, osteoarthritis and normal FLS, we successfully expanded the coverage of CpG sites 18.5-fold, so that it accounts for about 30% of all the CpGs in the human genome. By integrative omics study, we identified genes and pathways tightly related to RA pathogenesis, among which 12 genes were supported by three lines of evidence, including 6 genes already known to perform specific roles in RA and 6 genes as new potential therapeutic targets. Availability and implementation: The source code, required data for prediction, and demo data for testing are freely available at: http://wanglab.ucsd.edu/star/LR450K/. Contact: wei-wang@ucsd.edu or gfirestein@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26883487
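The agreement between predicted and measured methylation is summarised above by a Pearson correlation coefficient. The sketch below shows how that statistic is computed for one held-out tissue. The function name and the methylation values (beta values between 0 and 1) are invented for illustration and do not come from the LR450K code.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sd_x * sd_y)


# Hypothetical per-CpG methylation levels for one held-out tissue
measured = [0.10, 0.35, 0.80, 0.95, 0.50]
predicted = [0.15, 0.30, 0.75, 0.90, 0.60]
r = pearson_r(measured, predicted)
```

In a leave-one-tissue-out scheme, this calculation would be repeated once per tissue, each time training on the remaining tissues.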
Genomic Approaches to Zebrafish Cancer

PubMed Central

2017-01-01

The zebrafish has emerged as an important model for studying cancer biology. Identification of DNA, RNA and chromatin abnormalities can give profound insight into the mechanisms of tumorigenesis, and there are many techniques for analyzing the genomes of these tumors. Here, I present an overview of the available technologies for analyzing tumor genomes in the zebrafish, including array-based methods as well as next-generation sequencing technologies. I also discuss the ways in which zebrafish tumor genomes can be compared to human genomes using cross-species oncogenomics, which act to filter genomic noise and ultimately uncover central drivers of malignancy. Finally, I discuss downstream analytic tools, including network analysis, that can help to organize the alterations into coherent biological frameworks that can then be investigated further. PMID:27165352
Amplification and chromosomal dispersion of human endogenous retroviral sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Steele, P.E.; Martin, M.A.; Rabson, A.B.

1986-09-01

Endogenous retroviral sequences have undergone amplification events involving both viral and flanking cellular sequences. The authors cloned members of an amplified family of full-length endogenous retroviral sequences. Genomic blotting, employing a flanking cellular DNA probe derived from a member of this family, revealed a similar array of reactive bands in both humans and chimpanzees, indicating that an amplification event involving retroviral and associated cellular DNA sequences occurred before the evolutionary separation of these two primates. Southern analyses of restricted somatic cell hybrid DNA preparations suggested that endogenous retroviral segments are widely dispersed in the human genome and that amplification and dispersion events may be linked.
Genomic profiling of plasma cell disorders in a clinical setting: integration of microarray and FISH, after CD138 selection of bone marrow

PubMed Central

Berry, Nadine Kaye; Bain, Nicole L; Enjeti, Anoop K; Rowlings, Philip

2014-01-01

Aim: To evaluate the role of whole genome comparative genomic hybridisation microarray (array-CGH) in detecting genomic imbalances as compared to conventional karyotype (GTG-analysis) or myeloma specific fluorescence in situ hybridisation (FISH) panel in a diagnostic setting for plasma cell dyscrasia (PCD). Methods: A myeloma-specific interphase FISH (i-FISH) panel was carried out on CD138 PC-enriched bone marrow (BM) from 20 patients having BM biopsies for evaluation of PCD. Whole genome array-CGH was performed on reference (control) and neoplastic (test patient) genomic DNA extracted from CD138 PC-enriched BM and analysed. Results: Comparison of techniques demonstrated a much higher detection rate of genomic imbalances using array-CGH. Genomic imbalances were detected in 1, 19 and 20 patients using GTG-analysis, i-FISH and array-CGH, respectively. Genomic rearrangements were detected in one patient using GTG-analysis and seven patients using i-FISH, while none were detected using array-CGH. I-FISH was the most sensitive method for detecting gene rearrangements and GTG-analysis was the least sensitive method overall. All copy number aberrations observed in GTG-analysis were detected using array-CGH and i-FISH. Conclusions: We show that array-CGH performed on CD138-enriched PCs significantly improves the detection of clinically relevant and possibly novel genomic abnormalities in PCD, and thus could be considered as a standard diagnostic technique in combination with IGH rearrangement i-FISH. PMID:23969274
Analysis of sensitivity and rapid hybridization of a multiplexed Microbial Detection Microarray

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thissen, James B.; McLoughlin, Kevin; Gardner, Shea

Microarrays have proven to be useful in rapid detection of many viruses and bacteria. Pathogen detection microarrays have been used to diagnose viral and bacterial infections in clinical samples and to evaluate the safety of biological drug materials. A multiplexed version of the Lawrence Livermore Microbial Detection Array (LLMDA) was developed and evaluated with minimum detectable concentrations for pure unamplified DNA viruses, along with mixtures of viral and bacterial DNA subjected to different whole genome amplification protocols. In addition, the performance of the array was tested when hybridization time was reduced from 17 h to 1 h. The LLMDA was able to detect unamplified vaccinia virus DNA at a concentration of 14 fM, or 100,000 genome copies in 12 μL of sample. With amplification, positive identification was made with only 100 genome copies of input material. When tested against human stool samples from patients with acute gastroenteritis, the microarray detected common gastroenteritis viral and bacterial infections such as rotavirus and E. coli. Accurate detection was found but with a 4-fold drop in sensitivity for a 1 h compared to a 17 h hybridization. The array detected 2 ng (equivalent concentration of 15.6 fM) of labeled DNA from a virus with 1 h hybridization without any amplification, and was able to identify the components of a mixture of viruses and bacteria at species and in some cases strain level resolution. Sensitivity improved by three orders of magnitude with random whole genome amplification prior to hybridization; for instance, the array detected a DNA virus with only 20 fg or 100 genome copies as input. This multiplexed microarray is an efficient tool to analyze clinical and environmental samples for the presence of multiple viral and bacterial pathogens rapidly.
Analysis of sensitivity and rapid hybridization of a multiplexed Microbial Detection Microarray

DOE PAGES

Thissen, James B.; McLoughlin, Kevin; Gardner, Shea; ...

2014-06-01

Microarrays have proven to be useful in rapid detection of many viruses and bacteria. Pathogen detection microarrays have been used to diagnose viral and bacterial infections in clinical samples and to evaluate the safety of biological drug materials. A multiplexed version of the Lawrence Livermore Microbial Detection Array (LLMDA) was developed and evaluated with minimum detectable concentrations for pure unamplified DNA viruses, along with mixtures of viral and bacterial DNA subjected to different whole genome amplification protocols. In addition, the performance of the array was tested when hybridization time was reduced from 17 h to 1 h. The LLMDA was able to detect unamplified vaccinia virus DNA at a concentration of 14 fM, or 100,000 genome copies in 12 μL of sample. With amplification, positive identification was made with only 100 genome copies of input material. When tested against human stool samples from patients with acute gastroenteritis, the microarray detected common gastroenteritis viral and bacterial infections such as rotavirus and E. coli. Accurate detection was found but with a 4-fold drop in sensitivity for a 1 h compared to a 17 h hybridization. The array detected 2 ng (equivalent concentration of 15.6 fM) of labeled DNA from a virus with 1 h hybridization without any amplification, and was able to identify the components of a mixture of viruses and bacteria at species and in some cases strain level resolution. Sensitivity improved by three orders of magnitude with random whole genome amplification prior to hybridization; for instance, the array detected a DNA virus with only 20 fg or 100 genome copies as input. This multiplexed microarray is an efficient tool to analyze clinical and environmental samples for the presence of multiple viral and bacterial pathogens rapidly.
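The equivalence stated above between 100,000 genome copies in 12 μL and roughly 14 fM can be checked with a short unit conversion. The sketch below is only a worked arithmetic example; the function name is hypothetical, and it assumes each genome copy counts as one molecule.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def copies_to_femtomolar(copies: float, volume_ul: float) -> float:
    """Convert a number of molecules in a volume (microlitres) to femtomolar."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    moles = copies / AVOGADRO
    litres = volume_ul * 1e-6
    return moles / litres * 1e15  # mol/L -> fmol/L


conc_fm = copies_to_femtomolar(100_000, 12.0)
print(f"{conc_fm:.1f} fM")  # prints "13.8 fM", consistent with the ~14 fM quoted
```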
Genome wide analysis in a discordant monozygotic twin with caudal appendage and multiple congenital anomalies.

PubMed

Cogulu, O; Pariltay, E; Koroglu, O A; Aykut, A; Ozyurek, R; Levent, E; Kultursay, N; Ozkinay, F

2013-01-01

Caudal appendage is a rare dysmorphic feature whose etiologic mechanisms are not well understood. Here we report monozygotic (MZ) twin brothers who are discordant for caudal appendage and multiple congenital anomalies. The twins were the product of a monochorionic-diamniotic pregnancy of 33 weeks' gestation. On admission, the proband had micrognathia, beaked nose, hypospadias, caudal appendage and juxtaductal aortic coarctation. At birth, he was small for gestational age, and transient hypothyroidism was detected in the newborn period. Karyotype analysis showed 46,XY. Monozygosity was shown by 15 microsatellite markers plus amelogenin (AmpFlSTR Identifiler PCR Amplification Kit, Applied Biosystems). Genome-wide copy number analysis of the twins by DNA-DNA hybridization of whole genomic DNA (NimbleGen Human CGH 385K WG-T v2.0 array) showed a significant difference at two neighboring probes (Log2 ratio: 0.72088) located on chromosome 3p12.3. Further analysis with a high-resolution chromosome 3 array (Roche NimbleGen Human HG18 CHR3 FT Median Probe Spacing 475 bp) and quantitative PCR did not confirm the deletion.
Segmental Duplications and Copy-Number Variation in the Human Genome

PubMed Central

Sharp, Andrew J. ; Locke, Devin P. ; McGrath, Sean D. ; Cheng, Ze ; Bailey, Jeffrey A. ; Vallente, Rhea U. ; Pertz, Lisa M. ; Clark, Royden A. ; Schwartz, Stuart ; Segraves, Rick ; Oseroff, Vanessa V. ; Albertson, Donna G. ; Pinkel, Daniel ; Eichler, Evan E.

2005-01-01

The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders. PMID:15918152
X-chromosome tiling path array detection of copy number variants in patients with chromosome X-linked mental retardation

PubMed Central

Madrigal, I; Rodríguez-Revenga, L; Armengol, L; González, E; Rodriguez, B; Badenas, C; Sánchez, A; Martínez, F; Guitart, M; Fernández, I; Arranz, JA; Tejada, MI; Pérez-Jurado, LA; Estivill, X; Milà, M

2007-01-01

Background: Approximately 5–10% of cases of mental retardation in males are due to copy number variations (CNV) on the X chromosome. Novel technologies, such as array comparative genomic hybridization (aCGH), may help to uncover cryptic rearrangements in X-linked mental retardation (XLMR) patients. We have constructed an X-chromosome tiling path array using bacterial artificial chromosomes (BACs) and validated it using samples with cytogenetically defined copy number changes. We have studied 54 patients with idiopathic mental retardation and 20 control subjects. Results: Known genomic aberrations were reliably detected on the array and eight novel submicroscopic imbalances, likely causative for the mental retardation (MR) phenotype, were detected. Putatively pathogenic rearrangements included three deletions and five duplications (ranging from 82 kb to 1 Mb), all but two affecting genes previously known to be responsible for XLMR. Additionally, we describe different CNV regions with significantly different frequencies in XLMR and control subjects (44% vs. 20%). Conclusion: This tiling path array of the human X chromosome has proven successful for the detection and characterization of known rearrangements and novel CNVs in XLMR patients. PMID:18047645
A Novel High-Throughput Method for Molecular Detection of Human Pathogenic Viruses Using a Nanofluidic Real-Time PCR System

PubMed Central

Coudray-Meunier, Coralie; Fraisse, Audrey; Martin-Latil, Sandra; Delannoy, Sabine; Fach, Patrick; Perelle, Sylvie

2016-01-01

Human enteric viruses are recognized as the main causes of food- and waterborne diseases worldwide. Sensitive and quantitative detection of human enteric viruses is typically achieved through quantitative RT-PCR (RT-qPCR). A nanofluidic real-time PCR system was used to develop novel high-throughput methods for qualitative molecular detection (RT-qPCR array) and quantification of human pathogenic viruses by digital RT-PCR (RT-dPCR). The performance of high-throughput PCR methods was investigated for detecting 19 human pathogenic viruses and two main process controls used in food virology. The conventional real-time PCR system was compared to the RT-dPCR and RT-qPCR array. Based on the number of genome copies calculated by spectrophotometry, sensitivity was found to be slightly better with RT-qPCR than with RT-dPCR for 14 viruses, by 0.3 to 1.6 log10. Conversely, sensitivity was better with RT-dPCR than with RT-qPCR for seven viruses, by 0.10 to 1.40 log10. Interestingly, the number of genome copies determined by RT-dPCR was always 1 to 2 log10 lower than the expected copy number calculated by RT-qPCR standard curve. The sensitivity of the RT-qPCR and RT-qPCR array assays was found to be similar for two viruses, and better with RT-qPCR than with RT-qPCR array for eighteen viruses, by 0.7 to 3.0 log10. Conversely, sensitivity was only 0.30 log10 better with the RT-qPCR array than with conventional RT-qPCR assays for norovirus GIV detection. Finally, the RT-qPCR array and RT-dPCR assays were successfully used together to screen clinical samples and quantify pathogenic viruses. Additionally, this method made it possible to identify co-infection in clinical samples. In conclusion, given the rapidity and potential for large numbers of viral targets, this nanofluidic RT-qPCR assay should have a major impact on human pathogenic virus surveillance and outbreak investigations and is likely to be of benefit to public health. PMID:26824897
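The sensitivity differences above are expressed in log10 units, which can be hard to read at a glance. The small Python sketch below converts a log10 difference into a linear fold difference. The function name is hypothetical, and the three inputs are simply the smallest and largest differences quoted in the abstract.

```python
def log10_diff_to_fold(delta_log10: float) -> float:
    """Convert a difference in log10 units to a linear fold difference."""
    return 10.0 ** delta_log10


for delta in (0.3, 1.6, 3.0):
    print(f"{delta:.1f} log10 is about {log10_diff_to_fold(delta):.1f}-fold")
```

A 0.3 log10 difference is therefore about a 2-fold difference in detectable genome copies, whereas 3.0 log10 corresponds to 1000-fold.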
Targeting Histone Abnormality in Triple Negative Breast Cancer

DTIC Science & Technology

2016-08-01

Davidson NE. The Health Consequences of Smoking—50 Years of Progress. A Report of the Surgeon General. US Department of Health and Human Service...LSD1 proteins in human primary breast tumor specimens. By using in vitro and in vivo models, we identified that sulforaphane (SFN), a natural bioactive...characterized genes in the human genome. Raw intensity data were normalized by the Robust Multi-array Average (RMA). Student’s t- tests were
Genome-scale approaches to the epigenetics of common human disease

PubMed Central

2011-01-01

Traditionally, the pathology of human disease has been focused on microscopic examination of affected tissues, chemical and biochemical analysis of biopsy samples, other available samples of convenience, such as blood, and noninvasive or invasive imaging of varying complexity, in order to classify disease and illuminate its mechanistic basis. The molecular age has complemented this armamentarium with gene expression arrays and selective analysis of individual genes. However, we are entering a new era of epigenomic profiling, i.e., genome-scale analysis of cell-heritable nonsequence genetic change, such as DNA methylation. The epigenome offers access to stable measurements of cellular state and to biobanked material for large-scale epidemiological studies. Some of these genome-scale technologies are beginning to be applied to create the new field of epigenetic epidemiology. PMID:19844740
Target genes discovery through copy number alteration analysis in human hepatocellular carcinoma.

PubMed

Gu, De-Leung; Chen, Yen-Hsieh; Shih, Jou-Ho; Lin, Chi-Hung; Jou, Yuh-Shan; Chen, Chian-Feng

2013-12-21

High-throughput short-read sequencing of exomes and whole cancer genomes in multiple human hepatocellular carcinoma (HCC) cohorts confirmed previously identified frequently mutated somatic genes, such as TP53, CTNNB1 and AXIN1, and identified several novel genes with moderate mutation frequencies, including ARID1A, ARID2, MLL, MLL2, MLL3, MLL4, IRF2, ATM, CDKN2A, FGF19, PIK3CA, RPS6KA3, JAK1, KEAP1, NFE2L2, C16orf62, LEPR, RAC2, and IL6ST. Functional classification of these mutated genes suggested that alterations in pathways participating in chromatin remodeling, Wnt/β-catenin signaling, JAK/STAT signaling, and oxidative stress play critical roles in HCC tumorigenesis. Nevertheless, because there are few druggable genes used in HCC therapy, the identification of new therapeutic targets through integrated genomic approaches remains an important task. Because a large amount of HCC genomic data genotyped by high density single nucleotide polymorphism arrays is deposited in the public domain, copy number alteration (CNA) analyses of these arrays is a cost-effective way to reveal target genes through profiling of recurrent and overlapping amplicons, homozygous deletions and potentially unbalanced chromosomal translocations accumulated during HCC progression. Moreover, integration of CNAs with other high-throughput genomic data, such as aberrantly coding transcriptomes and non-coding gene expression in human HCC tissues and rodent HCC models, provides lines of evidence that can be used to facilitate the identification of novel HCC target genes with the potential of improving the survival of HCC patients.
Variant calling in low-coverage whole genome sequencing of a Native American population sample.

PubMed

Bizon, Chris; Spiegel, Michael; Chasse, Scott A; Gizer, Ian R; Li, Yun; Malc, Ewa P; Mieczkowski, Piotr A; Sailsbery, Josh K; Wang, Xiaoshu; Ehlers, Cindy L; Wilhelmsen, Kirk C

2014-01-30

The reduction in the cost of sequencing a human genome has led to the use of genotype sampling strategies in order to impute and infer the presence of sequence variants that can then be tested for associations with traits of interest. Low-coverage Whole Genome Sequencing (WGS) is a sampling strategy that overcomes some of the deficiencies seen in fixed content SNP array studies. Linkage-disequilibrium (LD) aware variant callers, such as the program Thunder, may provide a calling rate and accuracy that make a low-coverage sequencing strategy viable. We examined the performance of an LD-aware variant calling strategy in a population of 708 low-coverage whole genome sequences from a community sample of Native Americans. We assessed variant calling through a comparison of the sequencing results to genotypes measured in 641 of the same subjects using a fixed content first generation exome array. The comparison was made using two variant calling routines, the GATK Unified Genotyper program and the LD-aware variant caller Thunder. Thunder was found to improve concordance in a coverage dependent fashion, while correctly calling nearly all of the common variants as well as a high percentage of the rare variants present in the sample. Low-coverage WGS is a strategy that appears to collect genetic information intermediate in scope between fixed content genotyping arrays and deep-coverage WGS. Our data suggest that low-coverage WGS is a viable strategy with a greater chance of discovering novel variants and associations than fixed content arrays for large sample association analyses.
Experimental annotation of the human genome using microarray technology.

PubMed

Shoemaker, D D; Schadt, E E; Armour, C D; He, Y D; Garrett-Engele, P; McDonagh, P D; Loerch, P M; Leonardson, A; Lum, P Y; Cavet, G; Wu, L F; Altschuler, S J; Edwards, S; King, J; Tsang, J S; Schimmack, G; Schelter, J M; Koch, J; Ziman, M; Marton, M J; Li, B; Cundiff, P; Ward, T; Castle, J; Krolewski, M; Meyer, M R; Mao, M; Burchard, J; Kidd, M J; Dai, H; Phillips, J W; Linsley, P S; Stoughton, R; Scherer, S; Boguski, M S

2001-02-15

The most important product of the sequencing of a genome is a complete, accurate catalogue of genes and their products, primarily messenger RNA transcripts and their cognate proteins. Such a catalogue cannot be constructed by computational annotation alone; it requires experimental validation on a genome scale. Using 'exon' and 'tiling' arrays fabricated by ink-jet oligonucleotide synthesis, we devised an experimental approach to validate and refine computational gene predictions and define full-length transcripts on the basis of co-regulated expression of their exons. These methods can provide more accurate gene numbers and allow the detection of mRNA splice variants and identification of the tissue- and disease-specific conditions under which genes are expressed. We apply our technique to chromosome 22q under 69 experimental condition pairs, and to the entire human genome under two experimental conditions. We discuss implications for more comprehensive, consistent and reliable genome annotation, more efficient, full-length complementary DNA cloning strategies and application to complex diseases.
Exome sequencing of a multigenerational human pedigree.

PubMed

Hedges, Dale J; Hedges, Dale; Burges, Dan; Powell, Eric; Almonte, Cherylyn; Huang, Jia; Young, Stuart; Boese, Benjamin; Schmidt, Mike; Pericak-Vance, Margaret A; Martin, Eden; Zhang, Xinmin; Harkins, Timothy T; Züchner, Stephan

2009-12-14

Over the next few years, the efficient use of next-generation sequencing (NGS) in human genetics research will depend heavily upon effective mechanisms for the selective enrichment of genomic regions of interest. Recently, comprehensive exome capture arrays have become available for targeting approximately 33 Mb or approximately 180,000 coding exons across the human genome. Selective genomic enrichment of the human exome offers an attractive option for new experimental designs aiming to quickly identify potential disease-associated genetic variants, especially in family-based studies. We have evaluated a 2.1 M feature human exome capture array on eight individuals from a three-generation family pedigree. We were able to cover up to 98% of the targeted bases at a long-read sequence read depth of ≥3, 86% at a read depth of ≥10, and over 50% of all targets were covered with ≥20 reads. We identified up to 14,284 SNPs and small indels per individual exome, with up to 1,679 of these representing putative novel polymorphisms. Applying the conservative genotype calling approach HCDiff, the average rate of detection of a variant allele based on Illumina 1M BeadChip genotypes was 95.2% at ≥10x sequence. Further, we propose an advantageous genotype calling strategy for low-coverage targets that empirically determines cut-off thresholds at a given coverage depth based on existing genotype data. Application of this method was able to detect >99% of SNPs covered ≥8x. Our results offer guidance for "real-world" applications in human genetics and provide further evidence that microarray-based exome capture is an efficient and reliable method to enrich for chromosomal regions of interest in next-generation sequencing experiments.
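The coverage figures quoted above are fractions of targeted bases at or above a depth threshold. A minimal sketch of that calculation follows; the per-base depth list is made-up illustrative data and the function name is hypothetical, not taken from the study.

```python
def fraction_covered(depths, min_depth):
    """Fraction of targeted bases whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must contain at least one targeted base")
    return sum(1 for d in depths if d >= min_depth) / len(depths)


# Hypothetical per-base read depths across a tiny target region.
depths = [0, 2, 3, 5, 10, 12, 20, 25, 8, 30]
for threshold in (3, 10, 20):
    print(threshold, fraction_covered(depths, threshold))
```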

Comparative genomic hybridization.

PubMed

Pinkel, Daniel; Albertson, Donna G

2005-01-01

Altering DNA copy number is one of the many ways that gene expression and function may be modified. Some variations are found among normal individuals (14, 35, 103), others occur in the course of normal processes in some species (33), and still others participate in causing various disease states. For example, many defects in human development are due to gains and losses of chromosomes and chromosomal segments that occur prior to or shortly after fertilization, whereas DNA dosage alterations that occur in somatic cells are frequent contributors to cancer. Detecting these aberrations, and interpreting them within the context of broader knowledge, facilitates identification of critical genes and pathways involved in biological processes and diseases, and provides clinically relevant information. Over the past several years array comparative genomic hybridization (array CGH) has demonstrated its value for analyzing DNA copy number variations. In this review we discuss the state of the art of array CGH and its applications in medical genetics and cancer, emphasizing general concepts rather than specific results.
Characterization of genetic variability of Venezuelan equine encephalitis viruses

DOE PAGES

Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.; ...

2016-04-07

Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Role of PELP1 in EGFR-ER Signaling Crosstalk in Ovarian Cancer Cells

DTIC Science & Technology

2007-04-01

known about PELP1 role in ovarian cancer progression. Analysis of human genome databases and SAGE data suggested deregulation of PELP1 expression in ...Tulane University, New Orleans, LA Introduction PELP1 down regulation reduces tumorigenic potential in vivo PELP1 expression is deregulated in human ...decreases the tumorigenic potential of OVCAR3 cancer cells in nude mice model IHC studies using human ovarian cancer tissue array (n=123) showed that PELP1
Human minisatellite alleles detectable only after PCR amplification.

PubMed

Armour, J A; Crosier, M; Jeffreys, A J

1992-01-01

We present evidence that a proportion of alleles at two human minisatellite loci is undetected by standard Southern blot hybridization. In each case the missing allele(s) can be identified after PCR amplification and correspond to tandem arrays too short to detect by hybridization. At one locus, there is only one undetected allele (population frequency 0.3), which contains just three repeat units. At the second locus, there are at least five undetected alleles (total population frequency 0.9) containing 60-120 repeats; they are not detected because these tandem repeats give very poor signals when used as a probe in standard Southern blot hybridization, and also cross-hybridize with other sequences in the genome. Under these circumstances only signals from the longest tandemly repeated alleles are detectable above the nonspecific background. The structures of these loci have been compared in human and primate DNA, and at one locus the short human allele containing three repeat units is shown to be an intermediate state in the expansion of a monomeric precursor allele in primates to high copy number in the longer human arrays. We discuss the implications of such loci for studies of human populations, minisatellite isolation by cloning, and the evolution of highly variable tandem arrays.
Identification of susceptibility genes and genetic modifiers of human diseases

NASA Astrophysics Data System (ADS)

Abel, Kenneth; Kammerer, Stefan; Hoyal, Carolyn; Reneland, Rikard; Marnellos, George; Nelson, Matthew R.; Braun, Andreas

2005-03-01

The completion of the human genome sequence enables the discovery of genes involved in common human disorders. The successful identification of these genes is dependent on the availability of informative sample sets, validated marker panels, a high-throughput scoring technology, and a strategy for combining these resources. We have developed a universal platform technology based on mass spectrometry (MassARRAY) for analyzing nucleic acids with high precision and accuracy. To fuel this technology, we generated more than 100,000 validated assays for single nucleotide polymorphisms (SNPs) covering virtually all known and predicted human genes. We also established a large DNA sample bank comprising more than 50,000 consented healthy and diseased individuals. This combination of reagents and technology allows the execution of large-scale genome-wide association studies. Taking advantage of MassARRAY's capability for quantitative analysis of nucleic acids, we estimate allele frequencies in sample pools containing large numbers of individual DNAs. Comparing pools as a first-pass "filtering" step offers a tremendous advantage in throughput and cost over individual genotyping. We employed this approach in numerous genome-wide, hypothesis-free searches to identify genes associated with common complex diseases, such as breast cancer, osteoporosis, and osteoarthritis, and genes involved in quantitative traits like high-density lipoprotein cholesterol (HDL-c) levels and central fat. Access to additional well-characterized patient samples through collaborations allows us to conduct replication studies that validate true disease genes. These discoveries will expand our understanding of genetic disease predisposition, and our ability for early diagnosis and determination of specific disease subtype or progression stage.
The strategy, organization, and progress of the HUPO Human Proteome Project.

PubMed

Omenn, Gilbert S

2014-04-04

The Human Proteome Project is a major, comprehensive initiative of the Human Proteome Organization. This global collaborative effort aims to identify and characterize at least one protein product and many PTM, SAP, and splice variant isoforms from the 20,300 human protein-coding genes. The deliverables are an extensive parts list and an array of technology platforms, reagents, spectral libraries, and linked knowledge bases that advance the field and facilitate the use of proteomics by a much wider community of life scientists. Such enablement will help address the Grand Challenge of using proteomics to bridge major gaps between evidence of genomic variation and diverse phenotypes. The HUPO Human Proteome Project (HPP) has made an outstanding launch, including a special issue of the Journal of Proteome Research on the Chromosome-centric HPP with a total of 48 articles. This article is part of a Special Issue: Can Proteomics Fill the Gap Between Genomics and Phenotypes? © 2013.
Syntenic conservation of HSP70 genes in cattle and humans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grosz, M.D.; Womack, J.E.; Skow, L.C.

1992-12-01

A phage library of bovine genomic DNA was screened for hybridization with a human HSP70 cDNA probe, and 21 positive plaques were identified and isolated. Restriction mapping and blot hybridization analysis of DNA from the recombinant plaques demonstrated that the cloned DNAs were derived from three different regions of the bovine genome. One region contains two tandemly arrayed HSP70 sequences, designated HSP70-1 and HSP70-2, separated by approximately 8 kb of DNA. Single HSP70 sequences, designated HSP70-3 and HSP70-4, were found in two other genomic regions. Locus-specific probes of unique flanking sequences from representative HSP70 clones were hybridized to restriction endonuclease-digested DNA from bovine-hamster and bovine-mouse somatic cell hybrid panels to determine the chromosomal location of the HSP70 sequences. The probe for the tandemly arrayed HSP70-1 and HSP70-2 sequences mapped to bovine chromosome 23, syntenic with glyoxalase 1, 21-steroid hydroxylase, and major histocompatibility class I loci. HSP70-3 sequences mapped to bovine chromosome 10, syntenic with nucleoside phosphorylase and murine osteosarcoma viral oncogene (v-fos), and HSP70-4 mapped to bovine syntenic group U6, syntenic with amylase 1 and phosphoglucomutase 1. On the basis of these data, the authors propose that bovine HSP70-1,2 are homologous to human HSPA1 and HSPA1L on chromosome 6p21.3, bovine HSP70-3 is the homolog of an unnamed human HSP70 gene on chromosome 14q22-q24, and bovine HSP70-4 is homologous to one of the human HSPA-6,-7 genes on chromosome 1. 34 refs., 2 figs., 1 tab.
The Sequencing Bead Array (SBA), a Next-Generation Digital Suspension Array

PubMed Central

Akhras, Michael S.; Pettersson, Erik; Diamond, Lisa; Unemo, Magnus; Okamoto, Jennifer; Davis, Ronald W.; Pourmand, Nader

2013-01-01

Here we describe the novel Sequencing Bead Array (SBA), a complete assay for molecular diagnostics and typing applications. SBA is a digital suspension array using Next-Generation Sequencing (NGS), to replace conventional optical readout platforms. The technology allows for reducing the number of instruments required in a laboratory setting, where the same NGS instrument could be employed from whole-genome and targeted sequencing to SBA broad-range biomarker detection and genotyping. As proof-of-concept, a model assay was designed that could distinguish ten Human Papillomavirus (HPV) genotypes associated with cervical cancer progression. SBA was used to genotype 20 cervical tumor samples and, when compared with amplicon pyrosequencing, was able to detect two additional co-infections due to increased sensitivity. We also introduce in-house software Sphix, enabling easy accessibility and interpretation of results. The technology offers a multi-parallel, rapid, robust, and scalable system that is readily adaptable for a multitude of microarray diagnostic and typing applications, e.g. genetic signatures, single nucleotide polymorphisms (SNPs), structural variations, and immunoassays. SBA has the potential to dramatically change the way we perform probe-based applications, and allow for a smooth transition towards the technology offered by genomic sequencing. PMID:24116138
The GenoChip: A New Tool for Genetic Anthropology

PubMed Central

Elhaik, Eran; Greenspan, Elliott; Staats, Sean; Krahn, Thomas; Tyler-Smith, Chris; Xue, Yali; Tofanelli, Sergio; Francalacci, Paolo; Cucca, Francesco; Pagani, Luca; Jin, Li; Li, Hui; Schurr, Theodore G.; Greenspan, Bennett; Spencer Wells, R.

2013-01-01

The Genographic Project is an international effort aimed at charting human migratory history. The project is nonprofit and nonmedical, and, through its Legacy Fund, supports locally led efforts to preserve indigenous and traditional cultures. Although the first phase of the project was focused on uniparentally inherited markers on the Y-chromosome and mitochondrial DNA (mtDNA), the current phase focuses on markers from across the entire genome to obtain a more complete understanding of human genetic variation. Although many commercial arrays exist for genome-wide single-nucleotide polymorphism (SNP) genotyping, they were designed for medical genetic studies and contain medically related markers that are inappropriate for global population genetic studies. GenoChip, the Genographic Project’s new genotyping array, was designed to resolve these issues and enable higher resolution research into outstanding questions in genetic anthropology. The GenoChip includes ancestry informative markers obtained for over 450 human populations, an ancient human (Saqqaq), and two archaic hominins (Neanderthal and Denisovan) and was designed to identify all known Y-chromosome and mtDNA haplogroups. The chip was carefully vetted to avoid inclusion of medically relevant markers. To demonstrate its capabilities, we compared the FST distributions of GenoChip SNPs to those of two commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, the GenoChip autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. The chip performances are illustrated in a principal component analysis for 14 worldwide populations. In summary, the GenoChip is a dedicated genotyping platform for genetic anthropology. 
With an unprecedented number of approximately 12,000 Y-chromosomal and approximately 3,300 mtDNA SNPs and over 130,000 autosomal and X-chromosomal SNPs without any known health, medical, or phenotypic relevance, the GenoChip is a useful tool for genetic anthropology and population genetics. PMID:23666864
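The FST comparison above measures how strongly allele frequencies differ between subpopulations. The abstract does not say which estimator the authors used; the sketch below assumes the simplest textbook form for one biallelic SNP with equally weighted subpopulations, FST = (HT - HS) / HT, and the function name is hypothetical.

```python
def wright_fst(subpop_freqs):
    """Wright's FST for one biallelic SNP.

    subpop_freqs holds the frequency of one allele in each subpopulation.
    Subpopulations are weighted equally (an assumption of this sketch).
    """
    if not subpop_freqs:
        raise ValueError("need at least one subpopulation")
    n = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / n
    h_total = 2 * p_bar * (1 - p_bar)                       # expected heterozygosity, pooled
    h_sub = sum(2 * p * (1 - p) for p in subpop_freqs) / n  # mean within-subpopulation value
    if h_total == 0:
        return 0.0  # monomorphic SNP carries no differentiation signal
    return (h_total - h_sub) / h_total


print(wright_fst([0.1, 0.9]))  # strongly differentiated SNP
print(wright_fst([0.5, 0.5]))  # identical frequencies
```

A SNP panel enriched for markers like the first example would shift the FST distribution toward higher values, which is the property the abstract attributes to the GenoChip.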
The GenoChip: a new tool for genetic anthropology.

PubMed

Elhaik, Eran; Greenspan, Elliott; Staats, Sean; Krahn, Thomas; Tyler-Smith, Chris; Xue, Yali; Tofanelli, Sergio; Francalacci, Paolo; Cucca, Francesco; Pagani, Luca; Jin, Li; Li, Hui; Schurr, Theodore G; Greenspan, Bennett; Spencer Wells, R

2013-01-01

The Genographic Project is an international effort aimed at charting human migratory history. The project is nonprofit and nonmedical, and, through its Legacy Fund, supports locally led efforts to preserve indigenous and traditional cultures. Although the first phase of the project was focused on uniparentally inherited markers on the Y-chromosome and mitochondrial DNA (mtDNA), the current phase focuses on markers from across the entire genome to obtain a more complete understanding of human genetic variation. Although many commercial arrays exist for genome-wide single-nucleotide polymorphism (SNP) genotyping, they were designed for medical genetic studies and contain medically related markers that are inappropriate for global population genetic studies. GenoChip, the Genographic Project's new genotyping array, was designed to resolve these issues and enable higher resolution research into outstanding questions in genetic anthropology. The GenoChip includes ancestry informative markers obtained for over 450 human populations, an ancient human (Saqqaq), and two archaic hominins (Neanderthal and Denisovan) and was designed to identify all known Y-chromosome and mtDNA haplogroups. The chip was carefully vetted to avoid inclusion of medically relevant markers. To demonstrate its capabilities, we compared the FST distributions of GenoChip SNPs to those of two commercial arrays. Although all arrays yielded similarly shaped (inverse J) FST distributions, the GenoChip autosomal and X-chromosomal distributions had the highest mean FST, attesting to its ability to discern subpopulations. The chip performances are illustrated in a principal component analysis for 14 worldwide populations. In summary, the GenoChip is a dedicated genotyping platform for genetic anthropology. 
With an unprecedented number of approximately 12,000 Y-chromosomal and approximately 3,300 mtDNA SNPs and over 130,000 autosomal and X-chromosomal SNPs without any known health, medical, or phenotypic relevance, the GenoChip is a useful tool for genetic anthropology and population genetics.
Integrating the genomic architecture of human nucleolar organizer regions with the biophysical properties of nucleoli.

PubMed

Mangan, Hazel; Gailín, Michael Ó; McStay, Brian

2017-12-01

Nucleoli are the sites of ribosome biogenesis and the largest membraneless subnuclear structures. They are intimately linked with growth and proliferation control and function as sensors of cellular stress. Nucleoli form around arrays of ribosomal gene (rDNA) repeats also called nucleolar organizer regions (NORs). In humans, NORs are located on the short arms of all five human acrocentric chromosomes. Multiple NORs contribute to the formation of large heterochromatin-surrounded nucleoli observed in most human cells. Here we will review recent findings about their genomic architecture. The dynamic nature of nucleoli began to be appreciated with the advent of photodynamic experiments using fluorescent protein fusions. We review more recent data on nucleoli in Xenopus germinal vesicles (GVs), which have revealed a liquid droplet-like behavior that facilitates nucleolar fusion. Further analysis in both Xenopus GVs and Drosophila embryos indicates that the internal organization of nucleoli is generated by a combination of liquid-liquid phase separation and active processes involving rDNA. We will attempt to integrate these recent findings with the genomic architecture of human NORs to advance our understanding of how nucleoli form and respond to stress in human cells. © 2017 Federation of European Biochemical Societies.
Copy number variations and genetic admixtures in three Xinjiang ethnic minority groups

PubMed Central

Lou, Haiyi; Li, Shilin; Jin, Wenfei; Fu, Ruiqing; Lu, Dongsheng; Pan, Xinwei; Zhou, Huaigu; Ping, Yuan; Jin, Li; Xu, Shuhua

2015-01-01

Xinjiang is geographically located in central Asia, and it has played an important historical role in connecting eastern Eurasian (EEA) and western Eurasian (WEA) people. However, human population genomic studies in this region have been largely underrepresented, especially with respect to studies of copy number variations (CNVs). Here we constructed the first CNV map of the three major ethnic minority groups, the Uyghur, Kazakh and Kirgiz, using Affymetrix Genome-Wide Human SNP Array 6.0. We systematically compared the properties of CNVs we identified in the three groups with the data from representatives of EEA and WEA. The analyses indicated a typical genetic admixture pattern in all three groups with ancestries from both EEA and WEA. We also identified several CNV regions showing significant deviation of allele frequency from the expected genome-wide distribution, which might be associated with population-specific phenotypes. Our study provides the first genome-wide perspective on the CNVs of three major Xinjiang ethnic minority groups and has implications for both evolutionary and medical studies. PMID:25026903
Copy number variations and genetic admixtures in three Xinjiang ethnic minority groups.

PubMed

Lou, Haiyi; Li, Shilin; Jin, Wenfei; Fu, Ruiqing; Lu, Dongsheng; Pan, Xinwei; Zhou, Huaigu; Ping, Yuan; Jin, Li; Xu, Shuhua

2015-04-01

Xinjiang is geographically located in central Asia, and it has played an important historical role in connecting eastern Eurasian (EEA) and western Eurasian (WEA) people. However, human population genomic studies in this region have been largely underrepresented, especially with respect to studies of copy number variations (CNVs). Here we constructed the first CNV map of the three major ethnic minority groups, the Uyghur, Kazakh and Kirgiz, using Affymetrix Genome-Wide Human SNP Array 6.0. We systematically compared the properties of CNVs we identified in the three groups with the data from representatives of EEA and WEA. The analyses indicated a typical genetic admixture pattern in all three groups with ancestries from both EEA and WEA. We also identified several CNV regions showing significant deviation of allele frequency from the expected genome-wide distribution, which might be associated with population-specific phenotypes. Our study provides the first genome-wide perspective on the CNVs of three major Xinjiang ethnic minority groups and has implications for both evolutionary and medical studies.
Comparison between fluorescent in-situ hybridisation and array comparative genomic hybridisation in preimplantation genetic diagnosis in translocation carriers.

PubMed

Lee, Vivian C Y; Chow, Judy F C; Lau, Estella Y L; Yeung, William S B; Ho, P C; Ng, Ernest H Y

2015-02-01

Objective: To compare the pregnancy outcome of the fluorescent in-situ hybridisation and array comparative genomic hybridisation in preimplantation genetic diagnosis of translocation carriers. Design: Historical cohort. Setting: A teaching hospital in Hong Kong. Patients: All preimplantation genetic diagnosis treatment cycles performed for translocation carriers from 2001 to 2013. Results: Overall, 101 treatment cycles for preimplantation genetic diagnosis in translocation were included: 77 cycles for reciprocal translocation and 24 cycles for Robertsonian translocation. Fluorescent in-situ hybridisation and array comparative genomic hybridisation were used in 78 and 11 cycles, respectively. The ongoing pregnancy rate per initiated cycle after array comparative genomic hybridisation was significantly higher than that after fluorescent in-situ hybridisation in all translocation carriers (36.4% vs 9.0%; P=0.010). The miscarriage rate was comparable with both techniques. The testing method (array comparative genomic hybridisation or fluorescent in-situ hybridisation) was the only significant factor affecting the ongoing pregnancy rate after controlling for the women's age, type of translocation, and clinical information of the preimplantation genetic diagnosis cycles by logistic regression (odds ratio=1.875; P=0.023; 95% confidence interval, 1.090-3.226). Conclusions: This local retrospective study confirmed that comparative genomic hybridisation is associated with significantly higher pregnancy rates versus fluorescent in-situ hybridisation in translocation carriers. Array comparative genomic hybridisation should be the technique of choice in preimplantation genetic diagnosis cycles in translocation carriers.
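The reported odds ratio, confidence interval, and P value can be cross-checked against one another. The sketch below assumes the interval is a standard Wald interval on the log-odds scale (the abstract does not state this) and recovers the standard error, z statistic, and two-sided P value from the published numbers; the function name is hypothetical.

```python
import math


def se_z_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover standard error, z, and two-sided P from an odds ratio and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


se, z, p = se_z_p_from_ci(1.875, 1.090, 3.226)
print(round(se, 3), round(z, 2), round(p, 3))
```

Under that assumption the recovered P value agrees with the reported P=0.023 to three decimal places, so the three published figures are mutually consistent.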
Functional Analysis of Porphyromonas gingivalis W83 CRISPR-Cas Systems.

PubMed

Burmistrz, Michał; Dudek, Bartosz; Staniec, Dominika; Rodriguez Martinez, Jose Ignacio; Bochtler, Matthias; Potempa, Jan; Pyrc, Krzysztof

2015-08-01

The CRISPR-Cas (clustered regularly interspaced short palindromic repeats/CRISPR-associated genes) system provides prokaryotic cells with an adaptive and heritable immune response to foreign genetic elements, such as viruses, plasmids, and transposons. It is present in the majority of Archaea and almost half of species of Bacteria. Porphyromonas gingivalis is an important human pathogen that has been proven to be an etiological agent of periodontitis and has been linked to systemic conditions, such as rheumatoid arthritis and cardiovascular disease. At least 95% of clinical strains of P. gingivalis carry CRISPR arrays, suggesting that these arrays play an important function in vivo. Here we show that all four CRISPR arrays present in the P. gingivalis W83 genome are transcribed. For one of the arrays, we demonstrate in vivo activity against double-stranded DNA constructs containing protospacer sequences accompanied at the 3' end by an NGG protospacer-adjacent motif (PAM). Most of the 44 spacers present in the genome of P. gingivalis W83 share no significant similarity with any known sequences, although 4 spacers are similar to sequences from bacteria found in the oral cavity and the gastrointestinal tract. Four spacers match genomic sequences of the host; however, none of these is flanked at its 3' terminus by the appropriate PAM element. The CRISPR-Cas (clustered regularly interspaced short palindromic repeats/CRISPR-associated genes) system is a unique system that provides prokaryotic cells with an adaptive and heritable immunity. In this report, we show that the CRISPR-Cas system of P. gingivalis, an important human pathogen associated with periodontitis and possibly also other conditions, such as rheumatoid arthritis and cardiovascular disease, is active and provides protection from foreign genetic elements. 
Importantly, the data presented here may be useful for better understanding the communication between cells in larger bacterial communities and, consequently, the process of disease development and progression. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
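The PAM requirement described above (a protospacer followed at its 3' end by NGG) can be expressed as a simple sequence check. This is a minimal sketch, not the authors' pipeline: it scans only the strand given, requires an exact spacer match, and uses made-up sequences and a hypothetical function name.

```python
import re


def protospacers_with_pam(dna, spacer):
    """Return 0-based start positions where spacer occurs in dna
    and is immediately followed on its 3' side by an NGG motif."""
    dna = dna.upper()
    spacer = spacer.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile("(?=" + re.escape(spacer) + "[ACGT]GG)")
    return [m.start() for m in pattern.finditer(dna)]


# Toy target: three copies of the spacer, flanked by AGG, ATT, and TGG.
target = "TTACGTACAGGTTACGTACATTACGTACTGG"
print(protospacers_with_pam(target, "ACGTAC"))
```

The middle copy is skipped because ATT is not a valid PAM, which mirrors the abstract's observation that spacers matching the host genome lacked the appropriate flanking motif.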
Functional Analysis of Porphyromonas gingivalis W83 CRISPR-Cas Systems

PubMed Central

Burmistrz, Michał; Dudek, Bartosz; Staniec, Dominika; Rodriguez Martinez, Jose Ignacio; Bochtler, Matthias; Potempa, Jan

2015-01-01

ABSTRACT The CRISPR-Cas (clustered regularly interspaced short palindromic repeats/CRISPR-associated genes) system provides prokaryotic cells with an adaptive and heritable immune response to foreign genetic elements, such as viruses, plasmids, and transposons. It is present in the majority of Archaea and almost half of species of Bacteria. Porphyromonas gingivalis is an important human pathogen that has been proven to be an etiological agent of periodontitis and has been linked to systemic conditions, such as rheumatoid arthritis and cardiovascular disease. At least 95% of clinical strains of P. gingivalis carry CRISPR arrays, suggesting that these arrays play an important function in vivo. Here we show that all four CRISPR arrays present in the P. gingivalis W83 genome are transcribed. For one of the arrays, we demonstrate in vivo activity against double-stranded DNA constructs containing protospacer sequences accompanied at the 3′ end by an NGG protospacer-adjacent motif (PAM). Most of the 44 spacers present in the genome of P. gingivalis W83 share no significant similarity with any known sequences, although 4 spacers are similar to sequences from bacteria found in the oral cavity and the gastrointestinal tract. Four spacers match genomic sequences of the host; however, none of these is flanked at its 3′ terminus by the appropriate PAM element. IMPORTANCE The CRISPR-Cas (clustered regularly interspaced short palindromic repeats/CRISPR-associated genes) system is a unique system that provides prokaryotic cells with an adaptive and heritable immunity. In this report, we show that the CRISPR-Cas system of P. gingivalis, an important human pathogen associated with periodontitis and possibly also other conditions, such as rheumatoid arthritis and cardiovascular disease, is active and provides protection from foreign genetic elements. 
Importantly, the data presented here may be useful for better understanding the communication between cells in larger bacterial communities and, consequently, the process of disease development and progression. PMID:26013482
Combined array CGH plus SNP genome analyses in a single assay for optimized clinical testing

PubMed Central

Wiszniewska, Joanna; Bi, Weimin; Shaw, Chad; Stankiewicz, Pawel; Kang, Sung-Hae L; Pursley, Amber N; Lalani, Seema; Hixson, Patricia; Gambin, Tomasz; Tsai, Chun-hui; Bock, Hans-Georg; Descartes, Maria; Probst, Frank J; Scaglia, Fernando; Beaudet, Arthur L; Lupski, James R; Eng, Christine; Wai Cheung, Sau; Bacino, Carlos; Patel, Ankita

2014-01-01

In clinical diagnostics, both array comparative genomic hybridization (array CGH) and single nucleotide polymorphism (SNP) genotyping have proven to be powerful genomic technologies utilized for the evaluation of developmental delay, multiple congenital anomalies, and neuropsychiatric disorders. Differences in the ability to resolve genomic changes between these arrays may constitute an implementation challenge for clinicians: which platform (SNP vs array CGH) might best detect the underlying genetic cause for the disease in the patient? While only SNP arrays enable the detection of copy number neutral regions of absence of heterozygosity (AOH), they have limited ability to detect single-exon copy number variants (CNVs) due to the distribution of SNPs across the genome. To provide comprehensive clinical testing for both CNVs and copy-neutral AOH, we enhanced our custom-designed high-resolution oligonucleotide array that has exon-targeted coverage of 1860 genes with 60 000 SNP probes, referred to as Chromosomal Microarray Analysis – Comprehensive (CMA-COMP). Of the 3240 cases evaluated by this array, clinically significant CNVs were detected in 445 cases including 21 cases with exonic events. In addition, 162 cases (5.0%) showed at least one AOH region >10 Mb. We demonstrate that even though this array has a lower density of SNP probes than other commercially available SNP arrays, it reliably detected AOH events >10 Mb as well as exonic CNVs beyond the detection limitations of SNP genotyping. Thus, combining SNP probes and exon-targeted array CGH into one platform provides clinically useful genetic screening in an efficient manner. PMID:23695279
Role of PELP1 in EGFR-ER Signaling Crosstalk in Ovarian Cancer Cells

DTIC Science & Technology

2009-04-01

expression of genes involved in metastasis using a focused microarray approach. We have used Human Tumor Metastasis Microarray (Oligo GE array from...ovarian cancer progression. Analysis of human genome databases and SAGE data suggested deregulation of PELP1 expression in ovarian cancer cells...PI3K, and STAT3 in the cytosol. PELP1/MNAR regulates meiosis via its interactions with heterotrimeric Gbc protein, androgen receptor (AR), and by
Streptococcus pneumoniae Supragenome Hybridization Arrays for Profiling of Genetic Content and Gene Expression.

PubMed

Kadam, Anagha; Janto, Benjamin; Eutsey, Rory; Earl, Joshua P; Powell, Evan; Dahlgren, Margaret E; Hu, Fen Z; Ehrlich, Garth D; Hiller, N Luisa

2015-02-02

There is extensive genomic diversity among Streptococcus pneumoniae isolates. Approximately half of the comprehensive set of genes in the species (the supragenome or pangenome) is present in all the isolates (core set), and the remainder is unevenly distributed among strains (distributed set). The Streptococcus pneumoniae Supragenome Hybridization (SpSGH) array provides coverage for an extensive set of genes and polymorphisms encountered within this species, capturing this genomic diversity. Further, the capture is quantitative. In this manner, the SpSGH array allows for both genomic and transcriptomic analyses of diverse S. pneumoniae isolates on a single platform. In this unit, we present the SpSGH array, and describe in detail its design and implementation for both genomic and transcriptomic analyses. The methodology can be applied to construction and modification of SpSGH array platforms, as well as to other bacterial species as long as multiple whole-genome sequences are available that collectively capture the vast majority of the species supragenome. Copyright © 2015 John Wiley & Sons, Inc.
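The core and distributed sets defined above map directly onto set operations over per-isolate gene lists. A minimal sketch follows, with hypothetical isolate and gene names; it assumes gene presence has already been called for each isolate.

```python
def split_supragenome(genes_by_isolate):
    """Split the supragenome into core genes (present in every isolate)
    and distributed genes (present in only some isolates)."""
    if not genes_by_isolate:
        raise ValueError("need at least one isolate")
    gene_sets = [set(genes) for genes in genes_by_isolate.values()]
    supragenome = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, supragenome - core


# Hypothetical presence calls for three isolates.
isolates = {
    "isolate_A": {"g1", "g2", "g3"},
    "isolate_B": {"g1", "g2", "g4"},
    "isolate_C": {"g1", "g2"},
}
core, distributed = split_supragenome(isolates)
print(sorted(core), sorted(distributed))
```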
Small cell ovarian carcinoma: genomic stability and responsiveness to therapeutics.

PubMed

Gamwell, Lisa F; Gambaro, Karen; Merziotis, Maria; Crane, Colleen; Arcand, Suzanna L; Bourada, Valerie; Davis, Christopher; Squire, Jeremy A; Huntsman, David G; Tonin, Patricia N; Vanderhyden, Barbara C

2013-02-21

The biology of small cell ovarian carcinoma of the hypercalcemic type (SCCOHT), which is a rare and aggressive form of ovarian cancer, is poorly understood. Tumourigenicity, in vitro growth characteristics, genetic and genomic anomalies, and sensitivity to standard and novel chemotherapeutic treatments were investigated in the unique SCCOHT cell line, BIN-67, to provide further insight into the biology of this rare type of ovarian cancer. The tumourigenic potential of BIN-67 cells was determined and the tumours formed in a xenograft model were compared to human SCCOHT. DNA sequencing, spectral karyotyping and high density SNP array analysis were performed. The sensitivity of the BIN-67 cells to standard chemotherapeutic agents and to vesicular stomatitis virus (VSV) and the JX-594 vaccinia virus was tested. BIN-67 cells were capable of forming spheroids in hanging drop cultures. When xenografted into immunodeficient mice, BIN-67 cells developed into tumours that reflected the hypercalcemia and histology of human SCCOHT, notably intense expression of WT-1 and vimentin, and lack of expression of inhibin. Somatic mutations in TP53 and the most common activating mutations in KRAS and BRAF were not found in BIN-67 cells by DNA sequencing. Spectral karyotyping revealed a largely normal diploid karyotype (in greater than 95% of cells) with a visibly shorter chromosome 20 contig. High density SNP array analysis also revealed few genomic anomalies in BIN-67 cells, which included loss of heterozygosity of an estimated 16.7 Mb interval on chromosome 20. SNP array analyses of four SCCOHT samples also indicated a low frequency of genomic anomalies in the majority of cases. Although resistant to platinum chemotherapeutic drugs, BIN-67 cell viability in vitro was reduced by > 75% after infection with oncolytic viruses. These results show that SCCOHT differs from high-grade serous carcinomas by exhibiting few chromosomal anomalies and lacking TP53 mutations. Although BIN-67 cells are resistant to standard chemotherapeutic agents, their sensitivity to oncolytic viruses suggests that their therapeutic use in SCCOHT should be considered.

Kaposi’s sarcoma–associated herpesvirus stably clusters its genomes across generations to maintain itself extrachromosomally

PubMed Central

Chiu, Ya-Fang; Sugden, Arthur U.

2017-01-01

Genetic elements that replicate extrachromosomally are rare in mammals; however, several human tumor viruses, including the papillomaviruses and the gammaherpesviruses, maintain their plasmid genomes by tethering them to cellular chromosomes. We have uncovered an unprecedented mechanism of viral replication: Kaposi’s sarcoma–associated herpesvirus (KSHV) stably clusters its genomes across generations to maintain itself extrachromosomally. To identify and characterize this mechanism, we developed two complementary, independent approaches: live-cell imaging and a predictive computational model. The clustering of KSHV requires the viral protein, LANA1, to bind viral genomes to nucleosomes arrayed on both cellular and viral DNA. Clustering affects both viral partitioning and viral genome numbers of KSHV. The clustering of KSHV plasmids provides it with an effective evolutionary strategy to rapidly increase copy numbers of genomes per cell at the expense of the total numbers of cells infected. PMID:28696226
Kaposi’s sarcoma–associated herpesvirus stably clusters its genomes across generations to maintain itself extrachromosomally

DOE PAGES

Chiu, Ya-Fang; Sugden, Arthur U.; Fox, Kathryn; ...

2017-07-10

Genetic elements that replicate extrachromosomally are rare in mammals; however, several human tumor viruses, including the papillomaviruses and the gammaherpesviruses, maintain their plasmid genomes by tethering them to cellular chromosomes. We have uncovered an unprecedented mechanism of viral replication: Kaposi’s sarcoma–associated herpesvirus (KSHV) stably clusters its genomes across generations to maintain itself extrachromosomally. To identify and characterize this mechanism, we developed two complementary, independent approaches: live-cell imaging and a predictive computational model. The clustering of KSHV requires the viral protein, LANA1, to bind viral genomes to nucleosomes arrayed on both cellular and viral DNA. Clustering affects both viral partitioning and viral genome numbers of KSHV. The clustering of KSHV plasmids provides it with an effective evolutionary strategy to rapidly increase copy numbers of genomes per cell at the expense of the total numbers of cells infected.
Kaposi’s sarcoma–associated herpesvirus stably clusters its genomes across generations to maintain itself extrachromosomally

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chiu, Ya-Fang; Sugden, Arthur U.; Fox, Kathryn

Genetic elements that replicate extrachromosomally are rare in mammals; however, several human tumor viruses, including the papillomaviruses and the gammaherpesviruses, maintain their plasmid genomes by tethering them to cellular chromosomes. We have uncovered an unprecedented mechanism of viral replication: Kaposi’s sarcoma–associated herpesvirus (KSHV) stably clusters its genomes across generations to maintain itself extrachromosomally. To identify and characterize this mechanism, we developed two complementary, independent approaches: live-cell imaging and a predictive computational model. The clustering of KSHV requires the viral protein, LANA1, to bind viral genomes to nucleosomes arrayed on both cellular and viral DNA. Clustering affects both viral partitioning and viral genome numbers of KSHV. The clustering of KSHV plasmids provides it with an effective evolutionary strategy to rapidly increase copy numbers of genomes per cell at the expense of the total numbers of cells infected.
Toolbox for Antibiotics Discovery from Microorganisms.

PubMed

Fisch, Katja M; Schäberle, Till F

2016-09-01

Microorganisms produce a vast array of biologically active metabolites. Such compounds are applied by humans to positively influence their health and, therefore, natural products serve as drug leads for pharmaceutical and medicinal chemistry. In this minireview, tools for the discovery and the production of potential drug leads are explained. A snapshot is provided, starting from the isolation of new producer strains, across genomic mining of (meta)genomes to identify biosynthetic gene clusters corresponding to natural products, toward heterologous expression to produce potential drug leads. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Application of bacterial artificial chromosome array-based comparative genomic hybridization and spectral karyotyping to the analysis of glioblastoma multiforme.

PubMed

Cowell, John K; Matsui, Sei-Ichi; Wang, Yong D; LaDuca, Jeffrey; Conroy, Jeffrey; McQuaid, Devin; Nowak, Norma J

2004-05-01

Identification of genetic losses and gains is valuable in analysis of brain tumors. Locus-by-locus analyses have revealed correlations between prognosis and response to chemotherapy and loss or gain of specific genes and loci. These approaches are labor intensive and do not provide a global view of the genetic changes within the tumor cells. Bacterial artificial chromosome (BAC) arrays, which cover the genome with an average resolution of less than 1 Mbp, allow defining the sum total of these genetic changes in a single comparative genomic hybridization (CGH) experiment. These changes are directly overlaid on the human genome sequence, thus providing the extent of the amplification or deletion, reflected by a megabase position, and gene content of the abnormal region. Although this array-based CGH approach (CGHa) seems to detect the extent of the genetic changes in tumors reliably, it has not been robustly tested. We compared genetic changes in four newly derived, early-passage glioma cell lines, using spectral karyotyping (SKY) and CGHa. Chromosome changes seen in cell lines under SKY analysis were also detected with CGHa. In addition, CGHa detected cryptic genetic gains and losses and resolved the nature of subtle marker chromosomes that could not be resolved with SKY, thus providing distinct advantages over previous technologies. There was remarkable general concordance between the CGHa results comparing the cell lines to the original tumor, except that the magnitude of the changes seen in the tumor sample was generally suppressed compared with the cell lines, a consequence of normal cells contaminating the tumor sample. CGHa revealed changes in cell lines that were not present in the original tumors and vice versa, even when analyzed at the earliest passage possible, which highlights the adaptation of the cells to in vitro culture. CGHa proved to be highly accurate and efficient for identifying genetic changes in tumor cells. This approach can accurately identify subtle, novel genetic abnormalities in tumors directly linked to the human genome sequence. CGHa far surpasses the resolution and information provided by conventional metaphase CGH, without relying on in vitro culture of tumors for metaphase spreads.
Analysis of copy number variations among cattle breeds

USDA-ARS's Scientific Manuscript database

Genomic structural variation is an important and abundant source of genetic and phenotypic variation. Here we describe the first systematic and genome-wide analysis of copy number variations (CNVs) in the modern domesticated cattle using array comparative genomic hybridization (array CGH) and quanti...
Identification, characterization, and comparative genomic distribution of the HERV-K (HML-2) group of human endogenous retroviruses

PubMed Central

2011-01-01

Background Integration of retroviral DNA into a germ cell may lead to a provirus that is transmitted vertically to that host's offspring as an endogenous retrovirus (ERV). In humans, ERVs (HERVs) comprise about 8% of the genome, the vast majority of which are truncated and/or highly mutated and no longer encode functional genes. The most recently active retroviruses that integrated into the human germ line are members of the Betaretrovirus-like HERV-K (HML-2) group, many of which contain intact open reading frames (ORFs) in some or all genes, sometimes encoding functional proteins that are expressed in various tissues. Interestingly, this expression is upregulated in many tumors ranging from breast and ovarian tissues to lymphomas and melanomas, as well as schizophrenia, rheumatoid arthritis, and other disorders. Results No study to date has characterized all HML-2 elements in the genome, an essential step towards determining a possible functional role of HML-2 expression in disease. We present here the most comprehensive and accurate catalog of all full-length and partial HML-2 proviruses, as well as solo LTR elements, within the published human genome to date. Furthermore, we provide evidence for preferential maintenance of proviruses and solo LTR elements on gene-rich chromosomes of the human genome and in proximity to gene regions. Conclusions Our analysis has found and corrected several errors in the annotation of HML-2 elements in the human genome, including mislabeling of a newly identified group called HML-11. HML-elements have been implicated in a wide array of diseases, and characterization of these elements will play a fundamental role to understand the relationship between endogenous retrovirus expression and disease. PMID:22067224
Analysis of copy number variations reveals differences among cattle breeds

USDA-ARS's Scientific Manuscript database

Genomic structural variation is an important and abundant source of genetic and phenotypic variation. Here we describe the first systematic and genome-wide analysis of copy number variations (CNVs) in the modern domesticated cattle using array comparative genomic hybridization (array CGH) and quanti...
Single-cell copy number variation detection

PubMed Central

2011-01-01

Detection of chromosomal aberrations from a single cell by array comparative genomic hybridization (single-cell array CGH), instead of from a population of cells, is an emerging technique. However, such detection is challenging because of the genome artifacts and the DNA amplification process inherent to the single cell approach. Current normalization algorithms result in inaccurate aberration detection for single-cell data. We propose a normalization method based on channel, genome composition and recurrent genome artifact corrections. We demonstrate that the proposed channel clone normalization significantly improves the copy number variation detection in both simulated and real single-cell array CGH data. PMID:21854607
Genomic Mapping of Human DNA provides Evidence of Difference in Stretch between AT and GC rich regions

NASA Astrophysics Data System (ADS)

Reifenberger, Jeffrey; Dorfman, Kevin; Cao, Han

Human DNA is not a polymer with a uniform distribution of all four nucleotides, but rather contains regions of high AT and high GC content. When confined, these regions could have different stretch due to the extra hydrogen bond present in the GC base pair. To measure this potential difference, human genomic DNA was nicked with Nt.BspQI, labeled with a Cy3-like fluorophore at the nick site, stained with YOYO, loaded into a device containing an array of nanochannels, and imaged. Over 473,000 individual molecules of DNA, corresponding to roughly 30x coverage of a human genome, were collected and aligned to the human reference. Based on the known AT/GC content between aligned pairs of labels, the stretch was measured for regions of similar size but different AT/GC content. We found that regions of high GC content were consistently more stretched than regions of high AT content between pairs of labels varying in size between 2.5 kbp and 500 kbp. We measured that for every 1% increase in GC content there was roughly a 0.06% increase in stretch. While this effect is small, it is important to take into account differences in stretch between AT and GC rich regions to improve the sensitivity of detection of structural variations from genomic variations. NIH Grant: R01-HG006851.
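The reported trend (roughly a 0.06% increase in stretch per 1% increase in GC content) can be illustrated with a small sketch. It assumes the relationship is linear and that the 0.06% is a relative change around a reference point; the function names, the reference GC content of 41%, and the baseline stretch of 0.85 are illustrative assumptions, not values from the abstract.

```python
# Sketch only: linear GC-dependent stretch correction under stated assumptions.

STRETCH_PER_GC_PERCENT = 0.0006  # 0.06% relative change per 1% GC (figure from the abstract)


def gc_fraction(seq: str) -> float:
    """Return the fraction of G/C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("no unambiguous bases in sequence")
    return gc / total


def expected_stretch(gc: float, ref_gc: float = 0.41, ref_stretch: float = 0.85) -> float:
    """Scale a baseline stretch by the linear GC trend.

    ref_gc and ref_stretch are hypothetical reference values chosen
    for illustration; they are not reported in the abstract.
    """
    delta_gc_percent = (gc - ref_gc) * 100.0
    return ref_stretch * (1.0 + STRETCH_PER_GC_PERCENT * delta_gc_percent)


if __name__ == "__main__":
    low, high = expected_stretch(0.30), expected_stretch(0.60)
    print(f"relative difference, 60% vs 30% GC: {(high / low - 1) * 100:.2f}%")
```

Under these assumptions, a region at 60% GC would be expected to be a little under 2% more stretched than a region of the same size at 30% GC, which is small but potentially relevant when comparing measured and reference label spacings.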
Prenatal diagnosis of chromosomal abnormalities using array-based comparative genomic hybridization

USDA-ARS's Scientific Manuscript database

This study was designed to evaluate the feasibility of using a targeted array-CGH strategy for prenatal diagnosis of genomic imbalances in a clinical setting of current pregnancies. Women undergoing prenatal diagnosis were counseled and offered array-CGH (BCM V4.0) in addition to routine chromosome ...
Conservation of human chromosome 13 polymorphic microsatellite (CA)n repeats in chimpanzees

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deka, R.; Shriver, M.D.; Yu, L.M.

Tandemly repeated (dC-dA)n · (dG-dT)n sequences occur abundantly and are found in most eukaryotic genomes. To investigate the level of conservation of these repeat sequences in nonhuman primates, the authors have analyzed seven human chromosome 13 dinucleotide (CA)n repeat loci in chimpanzees by DNA amplification using primers designed for analysis of human loci. Comparable levels of polymorphism at these loci in the two species, revealed by the number of alleles, heterozygosity, and allele sizes, suggest that the (CA)n repeat arrays and their genomic locations are highly conserved. Even though the proportion of shared alleles between the two species varies enormously and the modal alleles are not the same, allelic lengths at each locus in the chimpanzees are detected within the bounds of the allele size range observed in humans. A similar observation has been noted in a limited number of gorillas and orangutans. Using a new measure of genetic distance that takes into account the size of alleles, they have compared the genetic distance between humans and chimpanzees. The genetic distance between these two species was found to be ninefold smaller than expected assuming there is no selection or mutational bias toward retention of (CA)n repeat arrays. These findings suggest a functional significance for these microsatellite loci. 34 refs., 1 fig., 2 tabs.
Cross-species comparison of aCGH data from mouse and human BRCA1- and BRCA2-mutated breast cancers

PubMed Central

2010-01-01

Background Genomic gains and losses are a result of genomic instability in many types of cancers. BRCA1- and BRCA2-mutated breast cancers are associated with increased amounts of chromosomal aberrations, presumably due to their functions in genome repair. Some of these genomic aberrations may harbor genes whose absence or overexpression may give rise to cellular growth advantage. So far, it has not been easy to identify the driver genes underlying gains and losses. A powerful approach to identify these driver genes could be a cross-species comparison of array comparative genomic hybridization (aCGH) data from cognate mouse and human tumors. Orthologous regions of mouse and human tumors that are commonly gained or lost might represent essential genomic regions selected for gain or loss during tumor development. Methods To identify genomic regions that are associated with BRCA1- and BRCA2-mutated breast cancers we compared aCGH data from 130 mouse Brca1Δ/Δ;p53Δ/Δ, Brca2Δ/Δ;p53Δ/Δ and p53Δ/Δ mammary tumor groups with 103 human BRCA1-mutated, BRCA2-mutated and non-hereditary breast cancers. Results Our genome-wide cross-species analysis yielded a complete collection of loci and genes that are commonly gained or lost in mouse and human breast cancer. Principal common CNAs were the well known MYC-associated gain and RB1/INTS6-associated loss that occurred in all mouse and human tumor groups, and the AURKA-associated gain that occurred in BRCA2-related tumors from both species. However, there were also important differences between tumor profiles of both species, such as the prominent gain on chromosome 10 in mouse Brca2Δ/Δ;p53Δ/Δ tumors and the PIK3CA associated 3q gain in human BRCA1-mutated tumors, which occurred in tumors from one species but not in tumors from the other species. This disparity in recurrent aberrations in mouse and human tumors might be due to differences in tumor cell type or genomic organization between both species. Conclusions The selection of the oncogenome during mouse and human breast tumor development is markedly different, apart from the MYC gain and RB1-associated loss. These differences should be kept in mind when using mouse models for preclinical studies. PMID:20735817
The vacuolar protein sorting genes in insects: A comparative genome view.

PubMed

Li, Zhaofei; Blissard, Gary

2015-07-01

In eukaryotic cells, regulated vesicular trafficking is critical for directing protein transport and for recycling and degradation of membrane lipids and proteins. Through carefully regulated transport vesicles, the endomembrane system performs a large and important array of dynamic cellular functions while maintaining the integrity of the cellular membrane system. Genetic studies in yeast Saccharomyces cerevisiae have identified approximately 50 vacuolar protein sorting (VPS) genes involved in vesicle trafficking, and most of these genes are also characterized in mammals. The VPS proteins form distinct functional complexes, which include complexes known as ESCRT, retromer, CORVET, HOPS, GARP, and PI3K-III. Little is known about the orthologs of VPS proteins in insects. Here, with the newly annotated Manduca sexta genome, we carried out genomic comparative analysis of VPS proteins in yeast, humans, and 13 sequenced insect genomes representing the Orders Hymenoptera, Diptera, Hemiptera, Phthiraptera, Lepidoptera, and Coleoptera. Amino acid sequence alignments and domain/motif structure analyses reveal that most of the components of ESCRT, retromer, CORVET, HOPS, GARP, and PI3K-III are evolutionarily conserved across yeast, insects, and humans. However, in contrast to the VPS gene expansions observed in the human genome, only four VPS genes (VPS13, VPS16, VPS33, and VPS37) were expanded in the six insect Orders. Additionally, VPS2 was expanded only in species from Phthiraptera, Lepidoptera, and Coleoptera. These studies provide a baseline for understanding the evolution of vesicular trafficking across yeast, insect, and human genomes, and also provide a basis for further addressing specific functional roles of VPS proteins in insects. Copyright © 2014 Elsevier Ltd. All rights reserved.
Evaluation of Genomic Instability in the Abnormal Prostate

DTIC Science & Technology

2006-12-01

array CGH maps copy number aberrations relative to the genome sequence by using arrays of BAC or cDNA clones as the hybridization target instead of...data produced from these analyses complicate the interpretation of results. For these reasons, and as outlined by Davies et al., 22 it is desirable...There have been numerous studies of these abnormalities and several techniques, including 9 chromosome painting, array CGH and SNP arrays, have
Preparation and screening of an arrayed human genomic library generated with the P1 cloning system.

PubMed Central

Shepherd, N S; Pfrogner, B D; Coulby, J N; Ackerman, S L; Vaidyanathan, G; Sauer, R H; Balkenhol, T C; Sternberg, N

1994-01-01

We describe here the construction and initial characterization of a 3-fold coverage genomic library of the human haploid genome that was prepared using the bacteriophage P1 cloning system. The cloned DNA inserts were produced by size fractionation of a Sau3AI partial digest of high molecular weight genomic DNA isolated from primary cells of human foreskin fibroblasts. The inserts were cloned into the pAd10sacBII vector and packaged in vitro into P1 phage. These were used to generate recombinant bacterial clones, each of which was picked robotically from an agar plate into a well of a 96-well microtiter dish, grown overnight, and stored at -70 °C. The resulting library, designated DMPC-HFF#1 series A, consists of approximately 130,000-140,000 recombinant clones that were stored in 1500 microtiter dishes. To screen the library, clones were combined in a pooling strategy and specific loci were identified by PCR analysis. On average, the library contains two or three different clones for each locus screened. To date we have identified a total of 17 clones containing the hypoxanthine-guanine phosphoribosyltransferase, human serum albumin-human alpha-fetoprotein, p53, cyclooxygenase I, human apurinic endonuclease, beta-polymerase, and DNA ligase I genes. The cloned inserts average 80 kb in size and range from 70 to 95 kb, with one 49-kb insert and one 62-kb insert. PMID:8146166
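The relationship between clone count, insert size, and coverage described in this record can be checked with a short calculation. The sketch below uses the standard fold-coverage formula and the Clarke-Carbon estimate P = 1 - (1 - I/G)^N for the probability that a given locus is represented; it assumes uniformly random cloning and a haploid genome size of 3.0 × 10^9 bp, neither of which is stated in the abstract, and the function names are illustrative.

```python
import math

GENOME_BP = 3.0e9        # assumed haploid human genome size (not stated in the abstract)
MEAN_INSERT_BP = 80_000  # average insert size reported in the abstract


def fold_coverage(n_clones: int, insert_bp: float = MEAN_INSERT_BP,
                  genome_bp: float = GENOME_BP) -> float:
    """Total cloned bases divided by genome size."""
    return n_clones * insert_bp / genome_bp


def prob_locus_present(n_clones: int, insert_bp: float = MEAN_INSERT_BP,
                       genome_bp: float = GENOME_BP) -> float:
    """Clarke-Carbon probability that at least one clone covers a given locus."""
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


def clones_needed(p: float, insert_bp: float = MEAN_INSERT_BP,
                  genome_bp: float = GENOME_BP) -> int:
    """Smallest clone count giving probability p of covering a given locus."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - insert_bp / genome_bp))


if __name__ == "__main__":
    for n in (130_000, 140_000):
        print(n, round(fold_coverage(n), 2), round(prob_locus_present(n), 3))
```

Under these assumptions, 130,000 clones give roughly 3.5-fold coverage and about a 97% chance that any given locus is represented, broadly in line with the 3-fold coverage and the two or three clones per locus reported above.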
Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.

PubMed Central

Schena, M; Shalon, D; Heller, R; Chai, A; Brown, P O; Davis, R W

1996-01-01

Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery. PMID:8855227
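In a two-color hybridization assay, each array element yields one fluorescence signal per sample, and differential expression is commonly summarized as a log2 ratio of the two signals. The sketch below illustrates that general idea only; it is not the analysis used in the paper, and the function names, the optional background term, and the 2-fold threshold are illustrative assumptions.

```python
import math


def log2_ratio(test_signal: float, ref_signal: float, background: float = 0.0) -> float:
    """log2 of the background-corrected test/reference intensity ratio."""
    t = test_signal - background
    r = ref_signal - background
    if t <= 0 or r <= 0:
        raise ValueError("background-corrected signals must be positive")
    return math.log2(t / r)


def call_differential(elements: dict, fold_threshold: float = 2.0) -> dict:
    """Label each element 'up', 'down', or 'unchanged'.

    elements maps an element name to a (test, reference) intensity pair.
    The fold threshold is a hypothetical cutoff chosen for illustration.
    """
    cutoff = math.log2(fold_threshold)
    calls = {}
    for name, (test, ref) in elements.items():
        ratio = log2_ratio(test, ref)
        if ratio >= cutoff:
            calls[name] = "up"
        elif ratio <= -cutoff:
            calls[name] = "down"
        else:
            calls[name] = "unchanged"
    return calls
```

Working on the log scale makes induction and repression symmetric around zero, so a 4-fold increase and a 4-fold decrease give +2 and -2 respectively.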
Role of PELP1 in EGFR-ER Signaling Crosstalk in Ovarian Cancer Cells

DTIC Science & Technology

2008-04-01

IHC studies using human ovarian cancer tissue arrays (n=123) showed that PELP1/MNAR is 2- to 3-fold overexpressed in 60% of ovarian tumors To...cancers, however little is known about PELP1 role in ovarian cancer progression. Analysis of human genome databases and SAGE data suggested...PELP1/MNAR can facilitate ER nongenomic signaling via Src kinase, PI3K, and STAT3 in the cytosol. PELP1/MNAR regulates meiosis via its interactions
Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

PubMed

Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

2014-01-01

High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.
Development and Validation of a 20K Single Nucleotide Polymorphism (SNP) Whole Genome Genotyping Array for Apple (Malus × domestica Borkh)

PubMed Central

Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

2014-01-01

High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs. PMID:25303088

Human Y chromosome copy number variation in the next generation sequencing era and beyond.

PubMed

Massaia, Andrea; Xue, Yali

2017-05-01

The human Y chromosome provides a fertile ground for structural rearrangements owing to its haploidy and high content of repeated sequences. The methodologies used for copy number variation (CNV) studies have developed over the years. Low-throughput techniques based on direct observation of rearrangements were developed early on, and are still used, often to complement array-based or sequencing approaches which have limited power in regions with high repeat content and specifically in the presence of long, identical repeats, such as those found in human sex chromosomes. Some specific rearrangements have been investigated for decades; because of their effects on fertility, or their outstanding evolutionary features, the interest in these has not diminished. However, following the flourishing of large-scale genomics, several studies have investigated CNVs across the whole chromosome. These studies sometimes employ data generated within large genomic projects such as the DDD study or the 1000 Genomes Project, and often survey large samples of healthy individuals without any prior selection. Novel technologies based on sequencing long molecules and combinations of technologies, promise to stimulate the study of Y-CNVs in the immediate future.
Meta-analysis of human genome-microbiome association studies: the MiBioGen consortium initiative.

PubMed

Wang, Jun; Kurilshikov, Alexander; Radjabzadeh, Djawad; Turpin, Williams; Croitoru, Kenneth; Bonder, Marc Jan; Jackson, Matthew A; Medina-Gomez, Carolina; Frost, Fabian; Homuth, Georg; Rühlemann, Malte; Hughes, David; Kim, Han-Na; Spector, Tim D; Bell, Jordana T; Steves, Claire J; Timpson, Nicolas; Franke, Andre; Wijmenga, Cisca; Meyer, Katie; Kacprowski, Tim; Franke, Lude; Paterson, Andrew D; Raes, Jeroen; Kraaij, Robert; Zhernakova, Alexandra

2018-06-08

In recent years, human microbiota, especially gut microbiota, have emerged as an important yet complex trait influencing human metabolism, immunology, and diseases. Many studies are investigating the forces underlying the observed variation, including the human genetic variants that shape human microbiota. Several preliminary genome-wide association studies (GWAS) have been completed, but more are necessary to achieve a fuller picture. Here, we announce the MiBioGen consortium initiative, which has assembled 18 population-level cohorts and some 19,000 participants. Its aim is to generate new knowledge for the rapidly developing field of microbiota research. Each cohort has surveyed the gut microbiome via 16S rRNA sequencing and genotyped their participants with full-genome SNP arrays. We have standardized the analytical pipelines for both the microbiota phenotypes and genotypes, and all the data have been processed using identical approaches. Our analysis of microbiome composition shows that we can reduce the potential artifacts introduced by technical differences in generating microbiota data. We are now in the process of benchmarking the association tests and performing meta-analyses of genome-wide associations. All pipeline and summary statistics results will be shared using public data repositories. We present the largest consortium to date devoted to microbiota-GWAS. We have adapted our analytical pipelines to suit multi-cohort analyses and expect to gain insight into host-microbiota cross-talk at the genome-wide level. And, as an open consortium, we invite more cohorts to join us (by contacting one of the corresponding authors) and to follow the analytical pipeline we have developed.
The pathological consequences of impaired genome integrity in humans; disorders of the DNA replication machinery.

PubMed

O'Driscoll, Mark

2017-01-01

Accurate and efficient replication of the human genome occurs in the context of an array of constitutional barriers, including regional topological constraints imposed by chromatin architecture and processes such as transcription, catenation of the helical polymer and spontaneously generated DNA lesions, including base modifications and strand breaks. DNA replication is fundamentally important for tissue development and homeostasis; differentiation programmes are intimately linked with stem cell division. Unsurprisingly, impairments of the DNA replication machinery can have catastrophic consequences for genome stability and cell division. Functional impacts on DNA replication and genome stability have long been known to play roles in malignant transformation through a variety of complex mechanisms, and significant further insights have been gained from studying model organisms in this context. Congenital hypomorphic defects in components of the DNA replication machinery have been and continue to be identified in humans. These disorders present with a wide range of clinical features. Indeed, in some instances, different mutations in the same gene underlie different clinical presentations. Understanding the origin and molecular basis of these features opens a window onto the range of developmental impacts of suboptimal DNA replication and genome instability in humans. Here, I will briefly overview the basic steps involved in DNA replication and the key concepts that have emerged from this area of research, before switching emphasis to the pathological consequences of defects within the DNA replication network; the human disorders. Copyright © 2016 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
DNA Sequencing by Capillary Electrophoresis

PubMed Central

Karger, Barry L.; Guttman, Andras

2009-01-01

Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible with the advent of modern sequencing technologies, which resulted from step by step advances with contributions from academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this perspective, we wish to overview the role of capillary electrophoresis in DNA sequencing, based in part on several of our articles in this journal. PMID:19517496
Coverage and efficiency in current SNP chips

PubMed Central

Ha, Ngoc-Thuy; Freytag, Saskia; Bickeboeller, Heike

2014-01-01

To answer the question as to which commercial high-density SNP chip covers most of the human genome given a fixed budget, we compared the performance of 12 chips of different sizes released by Affymetrix and Illumina for the European, Asian, and African populations. These include Affymetrix' relatively new population-optimized arrays, whose SNP sets are each tailored toward a specific ethnicity. Our evaluation of the chips included the use of two measures, efficiency and cost–benefit ratio, which we developed as supplements to genetic coverage. Unlike coverage, these measures factor in the price of a chip or its substitute size (number of SNPs on chip), allowing comparisons to be drawn between differently priced chips. In this fashion, we identified the Affymetrix population-optimized arrays as offering the most cost-effective coverage for the Asian and African population. For the European population, we established the Illumina Human Omni 2.5-8 as the preferred choice. Interestingly, the Affymetrix chip tailored toward an Eastern Asian subpopulation performed well for all three populations investigated. However, our coverage estimates calculated for all chips proved much lower than those advertised by the producers. All our analyses were based on the 1000 Genome Project as reference population. PMID:24448550
Genome-wide comparison of medieval and modern Mycobacterium leprae.

PubMed

Schuenemann, Verena J; Singh, Pushpendra; Mendum, Thomas A; Krause-Kyora, Ben; Jäger, Günter; Bos, Kirsten I; Herbig, Alexander; Economou, Christos; Benjak, Andrej; Busso, Philippe; Nebel, Almut; Boldsen, Jesper L; Kjellström, Anna; Wu, Huihai; Stewart, Graham R; Taylor, G Michael; Bauer, Peter; Lee, Oona Y-C; Wu, Houdini H T; Minnikin, David E; Besra, Gurdyal S; Tucker, Katie; Roffey, Simon; Sow, Samba O; Cole, Stewart T; Nieselt, Kay; Krause, Johannes

2013-07-12

Leprosy was endemic in Europe until the Middle Ages. Using DNA array capture, we have obtained genome sequences of Mycobacterium leprae from skeletons of five medieval leprosy cases from the United Kingdom, Sweden, and Denmark. In one case, the DNA was so well preserved that full de novo assembly of the ancient bacterial genome could be achieved through shotgun sequencing alone. The ancient M. leprae sequences were compared with those of 11 modern strains, representing diverse genotypes and geographic origins. The comparisons revealed remarkable genomic conservation during the past 1000 years, a European origin for leprosy in the Americas, and the presence of an M. leprae genotype in medieval Europe now commonly associated with the Middle East. The exceptional preservation of M. leprae biomarkers, both DNA and mycolic acids, in ancient skeletons has major implications for palaeomicrobiology and human pathogen evolution.
Diversity of Prdm9 Zinc Finger Array in Wild Mice Unravels New Facets of the Evolutionary Turnover of this Coding Minisatellite

PubMed Central

Buard, Jérôme; Rivals, Eric; Dunoyer de Segonzac, Denis; Garres, Charlotte; Caminade, Pierre; de Massy, Bernard; Boursot, Pierre

2014-01-01

In humans and mice, meiotic recombination events cluster into narrow hotspots whose genomic positions are defined by the PRDM9 protein via its DNA binding domain constituted of an array of zinc fingers (ZnFs). High polymorphism and rapid divergence of the Prdm9 gene ZnF domain appear to involve positive selection at DNA-recognition amino-acid positions, but the nature of the underlying evolutionary pressures remains a puzzle. Here we explore the variability of the Prdm9 ZnF array in wild mice, and uncover a high allelic diversity of both ZnF copy number and identity with the characterization of 113 alleles. We analyze features of the diversity of ZnF identity which is mostly due to non-synonymous changes at codons −1, 3 and 6 of each ZnF, corresponding to amino-acids involved in DNA binding. Using methods adapted to the minisatellite structure of the ZnF array, we infer a phylogenetic tree of these alleles. We find the sister species Mus spicilegus and M. macedonicus as well as the three house mouse (Mus musculus) subspecies to be polyphyletic. However some sublineages have expanded independently in Mus musculus musculus and M. m. domesticus, the latter further showing phylogeographic substructure. Compared to random genomic regions and non-coding minisatellites, none of these patterns appears exceptional. In silico prediction of DNA binding sites for each allele, overlap of their alignments to the genome and relative coverage of the different families of interspersed repeated elements suggest a large diversity between PRDM9 variants with a potential for highly divergent distributions of recombination events in the genome with little correlation to evolutionary distance. By compiling PRDM9 ZnF protein sequences in Primates, Muridae and Equids, we find different diversity patterns among the three amino-acids most critical for the DNA-recognition function, suggesting different diversification timescales. PMID:24454780
[Comparative results of preimplantation genetic screening by array comparative genomic hybridization and new-generation sequencing].

PubMed

Aleksandrova, N V; Shubina, E S; Ekimov, A N; Kodyleva, T A; Mukosey, I S; Makarova, N P; Kulakova, E V; Levkov, L A; Barkov, I Yu; Trofimov, D Yu; Sukhikh, G T

2017-01-01

Aneuploidies as quantitative chromosome abnormalities are a main cause of failed development of morphologically normal embryos, implantation failures, and early reproductive losses. Preimplantation genetic screening (PGS) allows a preselection of embryos with a normal karyotype, thus increasing the implantation rate and reducing the frequency of early pregnancy loss after IVF. Modern PGS technologies are based on a genome-wide analysis of the embryo. The first pilot study in Russia was performed to assess the possibility of using semiconductor new-generation sequencing (NGS) as a PGS method. NGS data were collected for 38 biopsied embryos and compared with the data from array comparative genomic hybridization (array-CGH). The concordance between the NGS and array-CGH data was 94.8%. Two samples showed the karyotype 47,XXY by array-CGH and a normal karyotype by NGS. The discrepancies may be explained by loss of efficiency of array-CGH amplicon labeling.
Genome-wide comparison of paired fresh frozen and formalin-fixed paraffin-embedded gliomas by custom BAC and oligonucleotide array comparative genomic hybridization: facilitating analysis of archival gliomas.

PubMed

Mohapatra, Gayatry; Engler, David A; Starbuck, Kristen D; Kim, James C; Bernay, Derek C; Scangas, George A; Rousseau, Audrey; Batchelor, Tracy T; Betensky, Rebecca A; Louis, David N

2011-04-01

Array comparative genomic hybridization (aCGH) is a powerful tool for detecting DNA copy number alterations (CNA). Because diffuse malignant gliomas are often sampled by small biopsies, formalin-fixed paraffin-embedded (FFPE) blocks are often the only tissue available for genetic analysis; FFPE tissues are also needed to study the intratumoral heterogeneity that characterizes these neoplasms. In this paper, we present a combination of evaluations and technical advances that provide strong support for the ready use of oligonucleotide aCGH on FFPE diffuse gliomas. We first compared aCGH using bacterial artificial chromosome (BAC) arrays in 45 paired frozen and FFPE gliomas, and demonstrate a high concordance rate between FFPE and frozen DNA in an individual clone-level analysis of sensitivity and specificity, assuring that under certain array conditions, frozen and FFPE DNA can perform nearly identically. However, because oligonucleotide arrays offer advantages to BAC arrays in genomic coverage and practical availability, we next developed a method of labeling DNA from FFPE tissue that allows efficient hybridization to oligonucleotide arrays. To demonstrate utility in FFPE tissues, we applied this approach to biphasic anaplastic oligoastrocytomas and demonstrate CNA differences between DNA obtained from the two components. Therefore, BAC and oligonucleotide aCGH can be sensitive and specific tools for detecting CNAs in FFPE DNA, and novel labeling techniques enable the routine use of oligonucleotide arrays for FFPE DNA. In combination, these advances should facilitate genome-wide analysis of rare, small and/or histologically heterogeneous gliomas from FFPE tissues.
Using Full Genomic Information to Predict Disease: Breaking Down the Barriers Between Complex and Mendelian Diseases.

PubMed

Jordan, Daniel M; Do, Ron

2018-04-11

While sequence-based genetic tests have long been available for specific loci, especially for Mendelian disease, the rapidly falling costs of genome-wide genotyping arrays, whole-exome sequencing, and whole-genome sequencing are moving us toward a future where full genomic information might inform the prognosis and treatment of a variety of diseases, including complex disease. Similarly, the availability of large populations with full genomic information has enabled new insights about the etiology and genetic architecture of complex disease. Insights from the latest generation of genomic studies suggest that our categorization of diseases as complex may conceal a wide spectrum of genetic architectures and causal mechanisms that ranges from Mendelian forms of complex disease to complex regulatory structures underlying Mendelian disease. Here, we review these insights, along with advances in the prediction of disease risk and outcomes from full genomic information. Expected final online publication date for the Annual Review of Genomics and Human Genetics Volume 19 is August 31, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Evaluation of SNP Data from the Malus Infinium Array Identifies Challenges for Genetic Analysis of Complex Genomes of Polyploid Origin.

PubMed

Troggio, Michela; Surbanovski, Nada; Bianco, Luca; Moretto, Marco; Giongo, Lara; Banchi, Elisa; Viola, Roberto; Fernández, Felicidad Fernández; Costa, Fabrizio; Velasco, Riccardo; Cestaro, Alessandro; Sargent, Daniel James

2013-01-01

High throughput arrays for the simultaneous genotyping of thousands of single-nucleotide polymorphisms (SNPs) have made the rapid genetic characterisation of plant genomes and the development of saturated linkage maps a realistic prospect for many plant species of agronomic importance. However, the correct calling of SNP genotypes in divergent polyploid genomes using array technology can be problematic due to paralogy, and to divergence in probe sequences causing changes in probe binding efficiencies. An Illumina Infinium II whole-genome genotyping array was recently developed for the cultivated apple and used to develop a molecular linkage map for an apple rootstock progeny (M432), but a large proportion of segregating SNPs were not mapped in the progeny, due to unexpected genotype clustering patterns. To investigate the causes of this unexpected clustering we performed BLAST analysis of all probe sequences against the 'Golden Delicious' genome sequence and discovered evidence for paralogous annealing sites and probe sequence divergence for a high proportion of probes contained on the array. Following visual re-evaluation of the genotyping data generated for 8,788 SNPs for the M432 progeny using the array, we manually re-scored genotypes at 818 loci and mapped a further 797 markers to the M432 linkage map. The newly mapped markers included the majority of those that could not be mapped previously, as well as loci that were previously scored as monomorphic, but which segregated due to divergence leading to heterozygosity in probe annealing sites. An evaluation of the 8,788 probes in a diverse collection of Malus germplasm showed that more than half the probes returned genotype clustering patterns that were difficult or impossible to interpret reliably, highlighting implications for the use of the array in genome-wide association studies.
Genome-wide analysis of macrosatellite repeat copy number variation in worldwide populations: evidence for differences and commonalities in size distributions and size restrictions

PubMed Central

2013-01-01

Background Macrosatellite repeats (MSRs), usually spanning hundreds of kilobases of genomic DNA, comprise a significant proportion of the human genome. Because of their highly polymorphic nature, MSRs represent an extreme example of copy number variation, but their structure and function is largely understudied. Here, we describe a detailed study of six autosomal and two X chromosomal MSRs among 270 HapMap individuals from Central Europe, Asia and Africa. Copy number variation, stability and genetic heterogeneity of the autosomal macrosatellite repeats RS447 (chromosome 4p), MSR5p (5p), FLJ40296 (13q), RNU2 (17q) and D4Z4 (4q and 10q) and X chromosomal DXZ4 and CT47 were investigated. Results Repeat array size distribution analysis shows that all of these MSRs are highly polymorphic with the most genetic variation among Africans and the least among Asians. A mitotic mutation rate of 0.4-2.2% was observed, exceeding meiotic mutation rates and possibly explaining the large size variability found for these MSRs. By means of a novel Bayesian approach, statistical support for a distinct multimodal rather than a uniform allele size distribution was detected in seven out of eight MSRs, with evidence for equidistant intervals between the modes. Conclusions The multimodal distributions with evidence for equidistant intervals, in combination with the observation of MSR-specific constraints on minimum array size, suggest that MSRs are limited in their configurations and that deviations thereof may cause disease, as is the case for facioscapulohumeral muscular dystrophy. However, at present we cannot exclude that there are mechanistic constraints for MSRs that are not directly disease-related. This study represents the first comprehensive study of MSRs in different human populations by applying novel statistical methods and identifies commonalities and differences in their organization and function in the human genome. PMID:23496858
At-TAX: a whole genome tiling array resource for developmental expression analysis and transcript identification in Arabidopsis thaliana

PubMed Central

Laubinger, Sascha; Zeller, Georg; Henz, Stefan R; Sachsenberg, Timo; Widmer, Christian K; Naouar, Naïra; Vuylsteke, Marnik; Schölkopf, Bernhard; Rätsch, Gunnar; Weigel, Detlef

2008-01-01

Gene expression maps for model organisms, including Arabidopsis thaliana, have typically been created using gene-centric expression arrays. Here, we describe a comprehensive expression atlas, Arabidopsis thaliana Tiling Array Express (At-TAX), which is based on whole-genome tiling arrays. We demonstrate that tiling arrays are accurate tools for gene expression analysis and identified more than 1,000 unannotated transcribed regions. Visualizations of gene expression estimates, transcribed regions, and tiling probe measurements are accessible online at the At-TAX homepage. PMID:18613972
Tandem repeat regions within the Burkholderia pseudomallei genome and their application for high resolution genotyping.

PubMed

U'Ren, Jana M; Schupp, James M; Pearson, Talima; Hornstra, Heidie; Friedman, Christine L Clark; Smith, Kimothy L; Daugherty, Rebecca R Leadem; Rhoton, Shane D; Leadem, Ben; Georgia, Shalamar; Cardon, Michelle; Huynh, Lynn Y; DeShazer, David; Harvey, Steven P; Robison, Richard; Gal, Daniel; Mayo, Mark J; Wagner, David; Currie, Bart J; Keim, Paul

2007-03-30

The facultative, intracellular bacterium Burkholderia pseudomallei is the causative agent of melioidosis, a serious infectious disease of humans and animals. We identified and categorized tandem repeat arrays and their distribution throughout the genome of B. pseudomallei strain K96243 in order to develop a genetic typing method for B. pseudomallei. We then screened 104 of the potentially polymorphic loci across a diverse panel of 31 isolates including B. pseudomallei, B. mallei and B. thailandensis in order to identify loci with varying degrees of polymorphism. A subset of these tandem repeat arrays were subsequently developed into a multiple-locus VNTR analysis to examine 66 B. pseudomallei and 21 B. mallei isolates from around the world, as well as 95 lineages from a serial transfer experiment encompassing ~18,000 generations. B. pseudomallei contains a preponderance of tandem repeat loci throughout its genome, many of which are duplicated elsewhere in the genome. The majority of these loci are composed of repeat motif lengths of 6 to 9 bp with 4 to 10 repeat units and are predominantly located in intergenic regions of the genome. Across geographically diverse B. pseudomallei and B. mallei isolates, the 32 VNTR loci displayed between 7 and 28 alleles, with Nei's diversity values ranging from 0.47 to 0.94. Mutation rates for these loci are comparable (>10^-5 per locus per generation) to that of the most diverse tandemly repeated regions found in other less diverse bacteria. The frequency, location and duplicate nature of tandemly repeated regions within the B. pseudomallei genome indicate that these tandem repeat regions may play a role in generating and maintaining adaptive genomic variation. Multiple-locus VNTR analysis revealed extensive diversity within the global isolate set containing B. pseudomallei and B. mallei, and it detected genotypic differences within clonal lineages of both species that were identical using previous typing methods. Given the health threat to humans and livestock and the potential for B. pseudomallei to be released intentionally, MLVA could prove to be an important tool for fine-scale epidemiological or forensic tracking of this increasingly important environmental pathogen.
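Note on the diversity statistic in the record above: Nei's diversity for a locus is commonly computed as one minus the sum of squared allele frequencies, optionally scaled by n/(n-1) for small samples. The abstract does not say which form the authors used, so the sketch below is an illustration under that assumption; the function name `nei_diversity` and the example allele calls are hypothetical.

```python
from collections import Counter


def nei_diversity(alleles, unbiased=False):
    """Nei's diversity for one locus: 1 - sum(p_i^2) over allele frequencies p_i.

    `alleles` holds one allele call per isolate (e.g. VNTR repeat counts).
    With unbiased=True the value is scaled by n / (n - 1), a common
    small-sample correction (assumed here, not stated in the abstract).
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("need at least one allele call")
    counts = Counter(alleles)
    h = 1.0 - sum((c / n) ** 2 for c in counts.values())
    if unbiased:
        if n < 2:
            raise ValueError("unbiased estimate needs at least two calls")
        h *= n / (n - 1)
    return h


# Hypothetical repeat-count calls at one VNTR locus across four isolates.
example_calls = [7, 7, 9, 9]
example_h = nei_diversity(example_calls)
```

A locus with a single allele scores 0, and the value approaches 1 as more alleles occur at even frequencies, which is why highly polymorphic VNTR loci sit near the top of the reported range.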
A genetic network that suppresses genome rearrangements in Saccharomyces cerevisiae and contains defects in cancers

PubMed Central

Putnam, Christopher D.; Srivatsan, Anjana; Nene, Rahul V.; Martinez, Sandra L.; Clotfelter, Sarah P.; Bell, Sara N.; Somach, Steven B.; E.S. de Souza, Jorge; Fonseca, André F.; de Souza, Sandro J.; Kolodner, Richard D.

2016-01-01

Gross chromosomal rearrangements (GCRs) play an important role in human diseases, including cancer. The identity of all Genome Instability Suppressing (GIS) genes is not currently known. Here multiple Saccharomyces cerevisiae GCR assays and query mutations were crossed into arrays of mutants to identify progeny with increased GCR rates. One hundred eighty two GIS genes were identified that suppressed GCR formation. Another 438 cooperatively acting GIS genes were identified that were not GIS genes, but suppressed the increased genome instability caused by individual query mutations. Analysis of TCGA data using the human genes predicted to act in GIS pathways revealed that a minimum of 93% of ovarian and 66% of colorectal cancer cases had defects affecting one or more predicted GIS gene. These defects included loss-of-function mutations, copy-number changes associated with reduced expression, and silencing. In contrast, acute myeloid leukaemia cases did not appear to have defects affecting the predicted GIS genes. PMID:27071721
Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls.

PubMed

Craddock, Nick; Hurles, Matthew E; Cardin, Niall; Pearson, Richard D; Plagnol, Vincent; Robson, Samuel; Vukcevic, Damjan; Barnes, Chris; Conrad, Donald F; Giannoulatou, Eleni; Holmes, Chris; Marchini, Jonathan L; Stirrups, Kathy; Tobin, Martin D; Wain, Louise V; Yau, Chris; Aerts, Jan; Ahmad, Tariq; Andrews, T Daniel; Arbury, Hazel; Attwood, Anthony; Auton, Adam; Ball, Stephen G; Balmforth, Anthony J; Barrett, Jeffrey C; Barroso, Inês; Barton, Anne; Bennett, Amanda J; Bhaskar, Sanjeev; Blaszczyk, Katarzyna; Bowes, John; Brand, Oliver J; Braund, Peter S; Bredin, Francesca; Breen, Gerome; Brown, Morris J; Bruce, Ian N; Bull, Jaswinder; Burren, Oliver S; Burton, John; Byrnes, Jake; Caesar, Sian; Clee, Chris M; Coffey, Alison J; Connell, John M C; Cooper, Jason D; Dominiczak, Anna F; Downes, Kate; Drummond, Hazel E; Dudakia, Darshna; Dunham, Andrew; Ebbs, Bernadette; Eccles, Diana; Edkins, Sarah; Edwards, Cathryn; Elliot, Anna; Emery, Paul; Evans, David M; Evans, Gareth; Eyre, Steve; Farmer, Anne; Ferrier, I Nicol; Feuk, Lars; Fitzgerald, Tomas; Flynn, Edward; Forbes, Alistair; Forty, Liz; Franklyn, Jayne A; Freathy, Rachel M; Gibbs, Polly; Gilbert, Paul; Gokumen, Omer; Gordon-Smith, Katherine; Gray, Emma; Green, Elaine; Groves, Chris J; Grozeva, Detelina; Gwilliam, Rhian; Hall, Anita; Hammond, Naomi; Hardy, Matt; Harrison, Pile; Hassanali, Neelam; Hebaishi, Husam; Hines, Sarah; Hinks, Anne; Hitman, Graham A; Hocking, Lynne; Howard, Eleanor; Howard, Philip; Howson, Joanna M M; Hughes, Debbie; Hunt, Sarah; Isaacs, John D; Jain, Mahim; Jewell, Derek P; Johnson, Toby; Jolley, Jennifer D; Jones, Ian R; Jones, Lisa A; Kirov, George; Langford, Cordelia F; Lango-Allen, Hana; Lathrop, G Mark; Lee, James; Lee, Kate L; Lees, Charlie; Lewis, Kevin; Lindgren, Cecilia M; Maisuria-Armer, Meeta; Maller, Julian; Mansfield, John; Martin, Paul; Massey, Dunecan C O; McArdle, Wendy L; McGuffin, Peter; McLay, Kirsten E; Mentzer, Alex; Mimmack, Michael L; Morgan, Ann E; Morris, Andrew P; 
Mowat, Craig; Myers, Simon; Newman, William; Nimmo, Elaine R; O'Donovan, Michael C; Onipinla, Abiodun; Onyiah, Ifejinelo; Ovington, Nigel R; Owen, Michael J; Palin, Kimmo; Parnell, Kirstie; Pernet, David; Perry, John R B; Phillips, Anne; Pinto, Dalila; Prescott, Natalie J; Prokopenko, Inga; Quail, Michael A; Rafelt, Suzanne; Rayner, Nigel W; Redon, Richard; Reid, David M; Renwick; Ring, Susan M; Robertson, Neil; Russell, Ellie; St Clair, David; Sambrook, Jennifer G; Sanderson, Jeremy D; Schuilenburg, Helen; Scott, Carol E; Scott, Richard; Seal, Sheila; Shaw-Hawkins, Sue; Shields, Beverley M; Simmonds, Matthew J; Smyth, Debbie J; Somaskantharajah, Elilan; Spanova, Katarina; Steer, Sophia; Stephens, Jonathan; Stevens, Helen E; Stone, Millicent A; Su, Zhan; Symmons, Deborah P M; Thompson, John R; Thomson, Wendy; Travers, Mary E; Turnbull, Clare; Valsesia, Armand; Walker, Mark; Walker, Neil M; Wallace, Chris; Warren-Perry, Margaret; Watkins, Nicholas A; Webster, John; Weedon, Michael N; Wilson, Anthony G; Woodburn, Matthew; Wordsworth, B Paul; Young, Allan H; Zeggini, Eleftheria; Carter, Nigel P; Frayling, Timothy M; Lee, Charles; McVean, Gil; Munroe, Patricia B; Palotie, Aarno; Sawcer, Stephen J; Scherer, Stephen W; Strachan, David P; Tyler-Smith, Chris; Brown, Matthew A; Burton, Paul R; Caulfield, Mark J; Compston, Alastair; Farrall, Martin; Gough, Stephen C L; Hall, Alistair S; Hattersley, Andrew T; Hill, Adrian V S; Mathew, Christopher G; Pembrey, Marcus; Satsangi, Jack; Stratton, Michael R; Worthington, Jane; Deloukas, Panos; Duncanson, Audrey; Kwiatkowski, Dominic P; McCarthy, Mark I; Ouwehand, Willem; Parkes, Miles; Rahman, Nazneen; Todd, John A; Samani, Nilesh J; Donnelly, Peter

2010-04-01

Copy number variants (CNVs) account for a major proportion of human genetic polymorphism and have been predicted to have an important role in genetic susceptibility to common disease. To address this we undertook a large, direct genome-wide study of association between CNVs and eight common human diseases. Using a purpose-designed array we typed approximately 19,000 individuals into distinct copy-number classes at 3,432 polymorphic CNVs, including an estimated 50% of all common CNVs larger than 500 base pairs. We identified several biological artefacts that lead to false-positive associations, including systematic CNV differences between DNAs derived from blood and cell lines. Association testing and follow-up replication analyses confirmed three loci where CNVs were associated with disease (IRGM for Crohn's disease, HLA for Crohn's disease, rheumatoid arthritis and type 1 diabetes, and TSPAN8 for type 2 diabetes), although in each case the locus had previously been identified in single nucleotide polymorphism (SNP)-based studies, reflecting our observation that most common CNVs that are well-typed on our array are well tagged by SNPs and so have been indirectly explored through SNP studies. We conclude that common CNVs that can be typed on existing platforms are unlikely to contribute greatly to the genetic basis of common human diseases.
Comprehensive comparison of three commercial human whole-exome capture platforms.

PubMed

Asan; Xu, Yu; Jiang, Hui; Tyler-Smith, Chris; Xue, Yali; Jiang, Tao; Wang, Jiawei; Wu, Mingzhi; Liu, Xiao; Tian, Geng; Wang, Jun; Wang, Jian; Yang, Huangming; Zhang, Xiuqing

2011-09-28

Exome sequencing, which allows the global analysis of protein coding sequences in the human genome, has become an effective and affordable approach to detecting causative genetic mutations in diseases. Currently, there are several commercial human exome capture platforms; however, the relative performances of these have not been characterized sufficiently to know which is best for a particular study. We comprehensively compared three platforms: NimbleGen's Sequence Capture Array and SeqCap EZ, and Agilent's SureSelect. We assessed their performance in a variety of ways, including number of genes covered and capture efficacy. Differences that may impact on the choice of platform were that Agilent SureSelect covered approximately 1,100 more genes, while NimbleGen provided better flanking sequence capture. Although all three platforms achieved similar capture specificity of targeted regions, the NimbleGen platforms showed better uniformity of coverage and greater genotype sensitivity at 30- to 100-fold sequencing depth. All three platforms showed similar power in exome SNP calling, including medically relevant SNPs. Compared with genotyping and whole-genome sequencing data, the three platforms achieved a similar accuracy of genotype assignment and SNP detection. Importantly, all three platforms showed similar levels of reproducibility, GC bias and reference allele bias. We demonstrate key differences between the three platforms, particularly advantages of solutions over array capture and the importance of a large gene target set.
Evaluating Imputation Algorithms for Low-Depth Genotyping-By-Sequencing (GBS) Data

PubMed Central

2016-01-01

Well-powered genomic studies require genome-wide marker coverage across many individuals. For non-model species with few genomic resources, high-throughput sequencing (HTS) methods, such as Genotyping-By-Sequencing (GBS), offer an inexpensive alternative to array-based genotyping. Although affordable, datasets derived from HTS methods suffer from sequencing error, alignment errors, and missing data, all of which introduce noise and uncertainty to variant discovery and genotype calling. Under such circumstances, meaningful analysis of the data is difficult. Our primary interest lies in the issue of how one can accurately infer or impute missing genotypes in HTS-derived datasets. Many of the existing genotype imputation algorithms and software packages were primarily developed by and optimized for the human genetics community, a field where a complete and accurate reference genome has been constructed and SNP arrays have, in large part, been the common genotyping platform. We set out to answer two questions: 1) can we use existing imputation methods developed by the human genetics community to impute missing genotypes in datasets derived from non-human species and 2) are these methods, which were developed and optimized to impute ascertained variants, amenable for imputation of missing genotypes at HTS-derived variants? We selected Beagle v.4, a widely used algorithm within the human genetics community with reportedly high accuracy, to serve as our imputation contender. We performed a series of cross-validation experiments, using GBS data collected from the species Manihot esculenta by the Next Generation (NEXTGEN) Cassava Breeding Project. NEXTGEN currently imputes missing genotypes in their datasets using a LASSO-penalized, linear regression method (denoted ‘glmnet’). We selected glmnet to serve as a benchmark imputation method for this reason. 
We obtained estimates of imputation accuracy by masking a subset of observed genotypes, imputing, and calculating the sample Pearson correlation between observed and imputed genotype dosages at the site and individual level; computation time served as a second metric for comparison. We then set out to examine factors affecting imputation accuracy, such as levels of missing data, read depth, minor allele frequency (MAF), and reference panel composition. PMID:27537694
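Note on the accuracy metric in the record above: the masking procedure it describes (hide some observed genotypes, impute, then correlate observed and imputed dosages) can be sketched as follows. This is a minimal illustration, not the authors' pipeline; the helper names, the tiny dosage matrices, and the choice to report the correlation per site only are assumptions made for the example.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation; returns None when it is undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no variation, correlation not defined
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def accuracy_by_site(observed, imputed, masked):
    """Correlate observed and imputed dosages at masked entries, per site.

    observed, imputed: individuals x sites matrices of genotype dosages.
    masked: set of (individual, site) index pairs hidden before imputation.
    """
    by_site = {}
    for i, j in sorted(masked):
        obs, imp = by_site.setdefault(j, ([], []))
        obs.append(observed[i][j])
        imp.append(imputed[i][j])
    return {j: pearson(obs, imp) for j, (obs, imp) in by_site.items()}


# Hypothetical data: 3 individuals x 2 sites, every entry masked.
observed = [[0, 2], [1, 1], [2, 0]]
imputed = [[0.1, 1.9], [1.0, 1.2], [1.9, 0.2]]
masked = {(i, j) for i in range(3) for j in range(2)}
site_r = accuracy_by_site(observed, imputed, masked)
```

Grouping the masked entries by individual instead of by site would give the individual-level correlation the abstract also mentions.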
Evaluating Imputation Algorithms for Low-Depth Genotyping-By-Sequencing (GBS) Data.

PubMed

Chan, Ariel W; Hamblin, Martha T; Jannink, Jean-Luc

2016-01-01

Well-powered genomic studies require genome-wide marker coverage across many individuals. For non-model species with few genomic resources, high-throughput sequencing (HTS) methods, such as Genotyping-By-Sequencing (GBS), offer an inexpensive alternative to array-based genotyping. Although affordable, datasets derived from HTS methods suffer from sequencing error, alignment errors, and missing data, all of which introduce noise and uncertainty to variant discovery and genotype calling. Under such circumstances, meaningful analysis of the data is difficult. Our primary interest lies in the issue of how one can accurately infer or impute missing genotypes in HTS-derived datasets. Many of the existing genotype imputation algorithms and software packages were primarily developed by and optimized for the human genetics community, a field where a complete and accurate reference genome has been constructed and SNP arrays have, in large part, been the common genotyping platform. We set out to answer two questions: 1) can we use existing imputation methods developed by the human genetics community to impute missing genotypes in datasets derived from non-human species and 2) are these methods, which were developed and optimized to impute ascertained variants, amenable for imputation of missing genotypes at HTS-derived variants? We selected Beagle v.4, a widely used algorithm within the human genetics community with reportedly high accuracy, to serve as our imputation contender. We performed a series of cross-validation experiments, using GBS data collected from the species Manihot esculenta by the Next Generation (NEXTGEN) Cassava Breeding Project. NEXTGEN currently imputes missing genotypes in their datasets using a LASSO-penalized, linear regression method (denoted 'glmnet'). We selected glmnet to serve as a benchmark imputation method for this reason. 
We obtained estimates of imputation accuracy by masking a subset of observed genotypes, imputing, and calculating the sample Pearson correlation between observed and imputed genotype dosages at the site and individual level; computation time served as a second metric for comparison. We then set out to examine factors affecting imputation accuracy, such as levels of missing data, read depth, minor allele frequency (MAF), and reference panel composition.
Mining biological databases for candidate disease genes

NASA Astrophysics Data System (ADS)

Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.

2001-07-01

The publicly funded effort to determine the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has so far produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary 'draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for researchers identifying the functional genes of the genome that are transcribed and translated into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of these data identified approximately 30,000 to 40,000 human 'genes.' However, the bulk of the effort still remains: to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome with genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may be relevant to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequence and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).

The effect of input DNA copy number on genotype call and characterising SNP markers in the humpback whale genome using a nanofluidic array.

PubMed

Bhat, Somanath; Polanowski, Andrea M; Double, Mike C; Jarman, Simon N; Emslie, Kerry R

2012-01-01

Recent advances in nanofluidic technologies have enabled the use of Integrated Fluidic Circuits (IFCs) for high-throughput Single Nucleotide Polymorphism (SNP) genotyping (GT). In this study, we implemented and validated a relatively low cost nanofluidic system for SNP-GT with and without Specific Target Amplification (STA). As proof of principle, we first validated the effect of input DNA copy number on genotype call rate using well characterised, digital PCR (dPCR) quantified human genomic DNA samples and then implemented the validated method to genotype 45 SNPs in the nuclear genome of the humpback whale, Megaptera novaeangliae. When STA was not incorporated, for a homozygous human DNA sample, reaction chambers containing, on average, 9 to 97 copies showed 100% call rate and accuracy. Below 9 copies, the call rate decreased, and at one copy it was 40%. For a heterozygous human DNA sample, the call rate decreased from 100% to 21% when predicted copies per reaction chamber decreased from 38 copies to one copy. The tightness of genotype clusters on a scatter plot also decreased. In contrast, when the same samples were subjected to STA prior to genotyping, a call rate and a call accuracy of 100% were achieved. Our results demonstrate that low input DNA copy number affects the quality of data generated, in particular for a heterozygous sample. As with human genomic DNA, a call rate and a call accuracy of 100% were achieved with whale genomic DNA samples following multiplex STA using either 15 or 45 SNP-GT assays. These calls were 100% concordant with their true genotypes determined by an independent method, suggesting that the nanofluidic system is a reliable platform for making genotype calls with high accuracy and concordance in genomic sequences derived from biological tissue.
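One way to see why low copy number should affect a heterozygous sample more than a homozygous one is a simple Poisson loading model. The sketch below is an illustrative assumption, not the authors' analysis: it supposes that template copies land in a reaction chamber independently, that a homozygous call needs at least one copy, and that a heterozygous call needs at least one copy of each allele (each allele contributing half of the mean copy number). The function names are hypothetical.

```python
import math


def p_homozygous_callable(mean_copies):
    """Probability that a chamber holds at least one template copy (Poisson)."""
    return 1.0 - math.exp(-mean_copies)


def p_heterozygous_callable(mean_copies):
    """Probability that a chamber holds at least one copy of each allele.

    Each allele is assumed to contribute mean_copies / 2 on average, with
    the two alleles loading independently of one another.
    """
    per_allele = mean_copies / 2.0
    return (1.0 - math.exp(-per_allele)) ** 2


# Compare the two sample types across a range of mean copy numbers.
for copies in (1, 9, 38, 97):
    hom = p_homozygous_callable(copies)
    het = p_heterozygous_callable(copies)
    print(f"{copies:>3} copies: homozygous {hom:.3f}, heterozygous {het:.3f}")
```

The model is qualitative only. It reproduces the direction of the reported effect (heterozygous samples lose callable chambers faster as copies drop) but is not expected to match the measured call rates, which also depend on assay chemistry and cluster calling.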
Transposable elements in Drosophila.

PubMed

McCullers, Tabitha J; Steiniger, Mindy

2017-01-01

Transposable elements (TEs) are mobile genetic elements that can mobilize within host genomes. As TEs comprise more than 40% of the human genome and are linked to numerous diseases, understanding their mechanisms of mobilization and regulation is important. Drosophila melanogaster is an ideal model organism for the study of eukaryotic TEs as its genome contains a diverse array of active TEs. TEs universally impact host genome size via transposition and deletion events, but may also adopt unique functional roles in host organisms. There are 2 main classes of TEs: DNA transposons and retrotransposons. These classes are further divided into subgroups of TEs with unique structural and functional characteristics, demonstrating the significant variability among these elements. Despite this variability, D. melanogaster and other eukaryotic organisms utilize conserved mechanisms to regulate TEs. This review focuses on the transposition mechanisms and regulatory pathways of TEs, and their functional roles in D. melanogaster.
Transposable elements in Drosophila

PubMed Central

McCullers, Tabitha J.; Steiniger, Mindy

2017-01-01

Transposable elements (TEs) are mobile genetic elements that can mobilize within host genomes. As TEs comprise more than 40% of the human genome and are linked to numerous diseases, understanding their mechanisms of mobilization and regulation is important. Drosophila melanogaster is an ideal model organism for the study of eukaryotic TEs as its genome contains a diverse array of active TEs. TEs universally impact host genome size via transposition and deletion events, but may also adopt unique functional roles in host organisms. There are 2 main classes of TEs: DNA transposons and retrotransposons. These classes are further divided into subgroups of TEs with unique structural and functional characteristics, demonstrating the significant variability among these elements. Despite this variability, D. melanogaster and other eukaryotic organisms utilize conserved mechanisms to regulate TEs. This review focuses on the transposition mechanisms and regulatory pathways of TEs, and their functional roles in D. melanogaster. PMID:28580197
The African Genome Variation Project shapes medical genetics in Africa

PubMed Central

Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

2014-01-01

Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterisation of African genetic diversity is needed. The African Genome Variation Project (AGVP) provides a resource to help design, implement and interpret genomic studies in sub-Saharan Africa (SSA) and worldwide. The AGVP represents dense genotypes from 1,481 and whole genome sequences (WGS) from 320 individuals across SSA. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across SSA. We identify new loci under selection, including for malaria and hypertension. We show that modern imputation panels can identify association signals at highly differentiated loci across populations in SSA. Using WGS, we show further improvement in imputation accuracy supporting efforts for large-scale sequencing of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa, showing for the first time that such designs are feasible. PMID:25470054
Organization of Synthetic Alphoid DNA Array in Human Artificial Chromosome (HAC) with a Conditional Centromere

PubMed Central

Kouprina, Natalay; Samoshkin, Alexander; Erliandri, Indri; Nakano, Megumi; Lee, Hee-Sheung; Fu, Haiging; Iida, Yuichi; Aladjem, Mirit; Oshimura, Mitsuo; Masumoto, Hiroshi; Earnshaw, William C.; Larionov, Vladimir

2012-01-01

Human artificial chromosomes (HACs) represent a novel promising episomal system for functional genomics, gene therapy and synthetic biology. HACs are engineered from natural and synthetic alphoid DNA arrays upon transfection into human cells. The use of HACs for gene expression studies requires knowledge of their structural organization. However, none of the de novo HACs constructed so far has been physically mapped in detail. Recently we constructed a synthetic alphoidtetO-HAC that was successfully used for expression of full-length genes to correct genetic deficiencies in human cells. The HAC can be easily eliminated from cell populations by inactivation of its conditional kinetochore. This unique feature provides a control for phenotypic changes attributed to expression of HAC-encoded genes. This work describes the organization of a megabase-size synthetic alphoid DNA array in the alphoidtetO-HAC that has been formed from a ~50 kb synthetic alphoidtetO-construct. Our analysis showed that this array represents a 1.1 Mb continuous sequence assembled from multiple copies of input DNA, a significant part of which was rearranged before assembly. The tandem and inverted alphoid DNA repeats in the HAC range in size from 25 to 150 kb. In addition, we demonstrated that the structure and functional domains of the HAC remain unchanged after several rounds of its transfer into different host cells. Knowledge of the alphoidtetO-HAC structure provides a tool to control HAC integrity during different manipulations. Our results also shed light on a mechanism for de novo HAC formation in human cells. PMID:23411994
Comparative Analysis of CNV Calling Algorithms: Literature Survey and a Case Study Using Bovine High-Density SNP Data.

PubMed

Xu, Lingyang; Hou, Yali; Bickhart, Derek M; Song, Jiuzhou; Liu, George E

2013-06-25

Copy number variations (CNVs) are gains and losses of genomic sequence between two individuals of a species when compared to a reference genome. The data from single nucleotide polymorphism (SNP) microarrays are now routinely used for genotyping, but they also can be utilized for copy number detection. Substantial progress has been made in array design and CNV calling algorithms and at least 10 comparison studies in humans have been published to assess them. In this review, we first survey the literature on existing microarray platforms and CNV calling algorithms. We then examine a number of CNV calling tools to evaluate their impacts using bovine high-density SNP data. Large incongruities in the results from different CNV calling tools highlight the need for standardizing array data collection, quality assessment and experimental validation. Only after careful experimental design and rigorous data filtering can the impacts of CNVs on both normal phenotypic variability and disease susceptibility be fully revealed.
ArrayExpress update--trends in database growth and links to data analysis tools.

PubMed

Rustici, Gabriella; Kolesnikov, Nikolay; Brandizi, Marco; Burdett, Tony; Dylag, Miroslaw; Emam, Ibrahim; Farne, Anna; Hastings, Emma; Ison, Jon; Keays, Maria; Kurbatova, Natalja; Malone, James; Mani, Roby; Mupo, Annalisa; Pedro Pereira, Rui; Pilicheva, Ekaterina; Rung, Johan; Sharma, Anjan; Tang, Y Amy; Ternent, Tobias; Tikhonov, Andrew; Welter, Danielle; Williams, Eleanor; Brazma, Alvis; Parkinson, Helen; Sarkans, Ugis

2013-01-01

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.
Mouse Tumor Biology (MTB): a database of mouse models for human cancer.

PubMed

Bult, Carol J; Krupke, Debra M; Begley, Dale A; Richardson, Joel E; Neuhauser, Steven B; Sundberg, John P; Eppig, Janan T

2015-01-01

The Mouse Tumor Biology (MTB; http://tumor.informatics.jax.org) database is a unique online compendium of mouse models for human cancer. MTB provides online access to expertly curated information on diverse mouse models for human cancer and interfaces for searching and visualizing data associated with these models. The information in MTB is designed to facilitate the selection of strains for cancer research and is a platform for mining data on tumor development and patterns of metastases. MTB curators acquire data through manual curation of peer-reviewed scientific literature and from direct submissions by researchers. Data in MTB are also obtained from other bioinformatics resources including PathBase, the Gene Expression Omnibus and ArrayExpress. Recent enhancements to MTB improve the association between mouse models and human genes commonly mutated in a variety of cancers as identified in large-scale cancer genomics studies, provide new interfaces for exploring regions of the mouse genome associated with cancer phenotypes and incorporate data and information related to Patient-Derived Xenograft models of human cancers. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
P53 oncosuppressor influences selection of genomic imbalances in response to ionizing radiations in human osteosarcoma cell line SAOS-2.

PubMed

Zuffa, Elisa; Mancini, Manuela; Brusa, Gianluca; Pagnotta, Eleonora; Hattinger, Claudia Maria; Serra, Massimo; Remondini, Daniel; Castellani, Gastone; Corrado, Patrizia; Barbieri, Enza; Santucci, Maria Alessandra

2008-07-01

The aim of this study was to investigate the impact of TP53 (tumor protein 53, p53) on the genomic stability of osteosarcoma (OS). First, we expressed a wild-type (wt) p53 construct, whose protein undergoes nuclear import and activation in response to ionizing radiation (IR), in the OS cell line SAOS-2 (which lacks p53). We then investigated genomic imbalances (amplifications and deletions at genes or DNA regions most frequently altered in human cancers) associated with radio-resistance relative to p53 expression by means of an array-based comparative genomic hybridization (aCGH) strategy. Finally, we investigated a putative marker of radio-induced oxidative stress, a 4,977 bp deletion in mitochondrial (mt) DNA usually referred to as the 'common' deletion, by means of a polymerase chain reaction (PCR) strategy. In radio-resistant subclones generated from wt p53-transfected SAOS-2 cells, DNA deletions were markedly reduced and the accumulation of the 'common' deletion in mtDNA (which may allow oxidative damage to persist by precluding detoxification of reactive oxygen species [ROS]) was completely abrogated. The results of our study confirm that wt p53 has a role in protecting OS cell DNA integrity. Multiple mechanisms involved in the p53 safeguard of genomic integrity and the prevention of deletions are discussed.
A Discovery Resource of Rare Copy Number Variations in Individuals with Autism Spectrum Disorder

PubMed Central

Prasad, Aparna; Merico, Daniele; Thiruvahindrapuram, Bhooma; Wei, John; Lionel, Anath C.; Sato, Daisuke; Rickaby, Jessica; Lu, Chao; Szatmari, Peter; Roberts, Wendy; Fernandez, Bridget A.; Marshall, Christian R.; Hatchwell, Eli; Eis, Peggy S.; Scherer, Stephen W.

2012-01-01

The identification of rare inherited and de novo copy number variations (CNVs) in human subjects has proven a productive approach to highlight risk genes for autism spectrum disorder (ASD). A variety of microarrays are available to detect CNVs, including single-nucleotide polymorphism (SNP) arrays and comparative genomic hybridization (CGH) arrays. Here, we examine a cohort of 696 unrelated ASD cases using a high-resolution one-million feature CGH microarray, the majority of which were previously genotyped with SNP arrays. Our objective was to discover new CNVs in ASD cases that were not detected by SNP microarray analysis and to delineate novel ASD risk loci via combined analysis of CGH and SNP array data sets on the ASD cohort and CGH data on an additional 1000 control samples. Of the 615 ASD cases analyzed on both SNP and CGH arrays, we found that 13,572 of 21,346 (64%) of the CNVs were exclusively detected by the CGH array. Several of the CGH-specific CNVs are rare in population frequency and impact previously reported ASD genes (e.g., NRXN1, GRM8, DPYD), as well as novel ASD candidate genes (e.g., CIB2, DAPP1, SAE1), and all were inherited except for a de novo CNV in the GPHN gene. A functional enrichment test of gene-sets in ASD cases over controls revealed nucleotide metabolism as a potential novel pathway involved in ASD, which includes several candidate genes for follow-up (e.g., DPYD, UPB1, UPP1, TYMP). Finally, this extensively phenotyped and genotyped ASD clinical cohort serves as an invaluable resource for the next step of genome sequencing for complete genetic variation detection. PMID:23275889
Genome-wide array-based comparative genomic hybridization (array-CGH) analysis in Aicardi Syndrome

USDA-ARS's Scientific Manuscript database

Aicardi syndrome is characterized by agenesis of the corpus callosum, chorioretinal lacunae, severe seizures (starting as infantile spasms), neuronal migration defects, mental retardation, costovertebral defects, and typical facial features. Because Aicardi syndrome is sporadic and affects only fem...
Stability of the human sperm DNA methylome to folic acid fortification and short-term supplementation.

PubMed

Chan, D; McGraw, S; Klein, K; Wallock, L M; Konermann, C; Plass, C; Chan, P; Robaire, B; Jacob, R A; Greenwood, C M T; Trasler, J M

2017-02-01

Do short-term and long-term exposures to low-dose folic acid supplementation alter DNA methylation in sperm? No alterations in sperm DNA methylation patterns were found following the administration of low-dose folic acid supplements of 400 μg/day for 90 days (short-term exposure) or when pre-fortification of food with folic acid and post-fortification sperm samples (long-term exposure) were compared. Excess dietary folate may be detrimental to health and DNA methylation profiles due to folate's role in one-carbon metabolism and the formation of S-adenosyl methionine, the universal methyl donor. DNA methylation patterns are established in developing male germ cells and have been suggested to be affected by high-dose (5 mg/day) folic acid supplementation. This is a control versus treatment study where genome-wide sperm DNA methylation patterns were examined prior to fortification of food (1996-1997) in men with no history of infertility at baseline and following 90-day exposure to placebo (n = 9) or supplement containing 400 μg folic acid/day (n = 10). Additionally, pre-fortification sperm DNA methylation profiles (n = 19) were compared with those of a group of post-fortification (post-2004) men (n = 8) who had been exposed for several years to dietary folic acid fortification. Blood and seminal plasma folate levels were measured in participants before and following the 90-day treatment with placebo or supplement. Sperm DNA methylation was assessed using the whole-genome and genome-wide techniques, MassArray epityper, restriction landmark genomic scanning, methyl-CpG immunoprecipitation and Illumina HumanMethylation450 Bead Array. Following treatment, supplemented individuals had significantly higher levels of blood and seminal plasma folates compared to placebo. Initial first-generation genome-wide analyses of sperm DNA methylation showed little evidence of changes when comparing pre- and post-treatment samples. 
With Illumina HumanMethylation450 BeadChip arrays, no significant changes were observed in individual probes following low-level supplementation; when compared with those of the post-fortification cohort, there were also few differences in methylation despite exposure to years of fortified foods. Illumina HumanMethylation450 BeadChip data from this study have been submitted to the NCBI Gene Expression Omnibus under the accession number GSE89781. This study was limited by the number of participants available in each cohort, in particular those who were not exposed to early (pre-1998) fortification of food with folic acid. While genome-wide DNA methylation was assessed with several techniques that targeted genic and CpG-rich regions, intergenic regions were less well interrogated. Overall, our findings provide evidence that short-term exposure to low-dose folic acid supplements of 400 μg/day, over a period of 3 months, a duration of time that might occur during infertility treatments, has no major impact on the sperm DNA methylome. This work was supported by a grant to J.M.T. from the Canadian Institutes of Health Research (CIHR: MOP-89944). The authors have no conflicts of interest to declare. © The Author 2016. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology.
GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies.

PubMed

Sulovari, Arvis; Li, Dawei

2014-07-19

Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires the same allele definition (nomenclature) and genome build across individual studies. Similarly, imputation, commonly used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in the use of different genome builds and allele definitions. Incorrectly assuming identical allele definitions among combined GWAS leads to a large portion of discarded genotypes or to incorrect association findings. There is no published tool that predicts and converts among all major allele definitions. In this study, we have developed a tool, GACT, which stands for Genome build and Allele definition Conversion Tool, that predicts and inter-converts between any of the common SNP allele definitions and between the major genome builds. In addition, we assessed several factors that may affect imputation quality, and our results indicated that inclusion of singletons in the reference had detrimental effects while ambiguous SNPs had no measurable effect. Unexpectedly, exclusion of genotypes with missing rate > 0.001 (40% of study SNPs) showed no significant decrease in imputation quality (which was even significantly higher than for imputation with singletons in the reference), especially for rare SNPs. 
GACT is a new, powerful, and user-friendly tool with both command-line and interactive online versions that can accurately predict, and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of genotypes from SNP-arrays or deep-sequencing, particularly for data from the dbGaP and other public databases. http://www.uvm.edu/genomics/software/gact.
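The kind of allele-definition mismatch discussed in this abstract can be illustrated with a small strand check. This is a hedged sketch and not GACT's algorithm or interface: the function names are hypothetical, and it assumes the usual reading of "ambiguous SNPs" as A/T and C/G pairs, whose alleles look the same on either strand. It classifies a study SNP against a reference SNP as same strand, strand-flipped, ambiguous, or mismatched.

```python
# Watson-Crick complements used to translate alleles to the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """True for A/T and C/G SNPs, which read identically on both strands."""
    return COMPLEMENT[allele1] == allele2


def flip_strand(allele1, allele2):
    """Return the same allele pair as read from the opposite strand."""
    return COMPLEMENT[allele1], COMPLEMENT[allele2]


def compare_to_reference(study, reference):
    """Classify a study allele pair against a reference allele pair.

    Both arguments are 2-tuples of bases, e.g. ("A", "G"). Allele order is
    ignored here; only the strand relationship is assessed.
    """
    if set(study) == set(reference):
        # A/T or C/G pairs match on either strand, so the strand is unknown.
        return "ambiguous" if is_strand_ambiguous(*study) else "same_strand"
    if set(flip_strand(*study)) == set(reference):
        return "flipped"
    return "mismatch"


print(compare_to_reference(("A", "G"), ("G", "A")))  # same_strand
print(compare_to_reference(("A", "G"), ("T", "C")))  # flipped
print(compare_to_reference(("A", "T"), ("T", "A")))  # ambiguous
print(compare_to_reference(("A", "G"), ("A", "C")))  # mismatch
```

Resolving the ambiguous class in practice needs extra information, such as allele frequencies or the array manifest, which this sketch does not attempt.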
Genomic Characterization of DArT Markers Based on High-Density Linkage Analysis and Physical Mapping to the Eucalyptus Genome

PubMed Central

Petroli, César D.; Sansaloni, Carolina P.; Carling, Jason; Steane, Dorothy A.; Vaillancourt, René E.; Myburg, Alexander A.; da Silva, Orzenil Bonfim; Pappas, Georgios Joannis; Kilian, Andrzej; Grattapaglia, Dario

2012-01-01

Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. 
Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for which no reference genome is yet available to allow such detailed characterization. PMID:22984541
Identifying tagging SNPs for African specific genetic variation from the African Diaspora Genome

PubMed Central

Johnston, Henry Richard; Hu, Yi-Juan; Gao, Jingjing; O’Connor, Timothy D.; Abecasis, Gonçalo R.; Wojcik, Genevieve L; Gignoux, Christopher R.; Gourraud, Pierre-Antoine; Lizee, Antoine; Hansen, Mark; Genuario, Rob; Bullis, Dave; Lawley, Cindy; Kenny, Eimear E.; Bustamante, Carlos; Beaty, Terri H.; Mathias, Rasika A.; Barnes, Kathleen C.; Qin, Zhaohui S.; Preethi Boorgula, Meher; Campbell, Monica; Chavan, Sameer; Ford, Jean G.; Foster, Cassandra; Gao, Li; Hansel, Nadia N.; Horowitz, Edward; Huang, Lili; Ortiz, Romina; Potee, Joseph; Rafaels, Nicholas; Ruczinski, Ingo; Scott, Alan F.; Taub, Margaret A.; Vergara, Candelaria; Levin, Albert M.; Padhukasahasram, Badri; Williams, L. Keoki; Dunston, Georgia M.; Faruque, Mezbah U.; Gietzen, Kimberly; Deshpande, Aniket; Grus, Wendy E.; Locke, Devin P.; Foreman, Marilyn G.; Avila, Pedro C.; Grammer, Leslie; Kim, Kwang-Youn A.; Kumar, Rajesh; Schleimer, Robert; De La Vega, Francisco M.; Shringarpure, Suyash S.; Musharoff, Shaila; Burchard, Esteban G.; Eng, Celeste; Hernandez, Ryan D.; Pino-Yanes, Maria; Torgerson, Dara G.; Szpiech, Zachary A.; Torres, Raul; Nicolae, Dan L.; Ober, Carole; Olopade, Christopher O; Olopade, Olufunmilayo; Oluwole, Oluwafemi; Arinola, Ganiyu; Song, Wei; Correa, Adolfo; Musani, Solomon; Wilson, James G.; Lange, Leslie A.; Akey, Joshua; Bamshad, Michael; Chong, Jessica; Fu, Wenqing; Nickerson, Deborah; Reiner, Alexander; Hartert, Tina; Ware, Lorraine B.; Bleecker, Eugene; Meyers, Deborah; Ortega, Victor E.; Maul, Pissamai; Maul, Trevor; Watson, Harold; Ilma Araujo, Maria; Riccio Oliveira, Ricardo; Caraballo, Luis; Marrugo, Javier; Martinez, Beatriz; Meza, Catherine; Ayestas, Gerardo; Francisco Herrera-Paz, Edwin; Landaverde-Torres, Pamela; Erazo, Said Omar Leiva; Martinez, Rosella; Mayorga, Alvaro; Mayorga, Luis F.; Mejia-Mejia, Delmy-Aracely; Ramos, Hector; Saenz, Allan; Varela, Gloria; Marina Vasquez, Olga; Ferguson, Trevor; Knight-Madden, Jennifer; Samms-Vaughan, Maureen; Wilks, Rainford J.; 
Adegnika, Akim; Ateba-Ngoa, Ulysse; Yazdanbakhsh, Maria

2017-01-01

A primary goal of The Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to develop an ‘African Diaspora Power Chip’ (ADPC), a genotyping array consisting of tagging SNPs, useful in comprehensively identifying African specific genetic variation. This array is designed based on the novel variation identified in 642 CAAPA samples of African ancestry with high coverage whole genome sequence data (~30× depth). This novel variation extends the pattern of variation catalogued in the 1000 Genomes and Exome Sequencing Projects to a spectrum of populations representing the wide range of West African genomic diversity. These individuals from CAAPA also comprise a large swath of the African Diaspora population and incorporate historical genetic diversity covering nearly the entire Atlantic coast of the Americas. Here we show the results of designing and producing such a microchip array. This novel array covers African specific variation far better than other commercially available arrays, and will enable better GWAS analyses for researchers with individuals of African descent in their study populations. A recent study cataloging variation in continental African populations suggests this type of African-specific genotyping array is both necessary and valuable for facilitating large-scale GWAS in populations of African ancestry. PMID:28429804
Development and Evaluation of a Genome-Wide 6K SNP Array for Diploid Sweet Cherry and Tetraploid Sour Cherry

PubMed Central

Peace, Cameron; Bassil, Nahla; Main, Dorrie; Ficklin, Stephen; Rosyara, Umesh R.; Stegmeir, Travis; Sebolt, Audrey; Gilmore, Barbara; Lawley, Cindy; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Iezzoni, Amy

2012-01-01

High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a community initiative to enable marker-assisted breeding for rosaceous crops. Next-generation sequencing in diverse breeding germplasm provided 25 billion basepairs (Gb) of cherry DNA sequence from which were identified genome-wide SNPs for sweet cherry and for the two sour cherry subgenomes derived from sweet cherry (avium subgenome) and P. fruticosa (fruticosa subgenome). Anchoring to the peach genome sequence, recently released by the International Peach Genome Initiative, predicted relative physical locations of the 1.9 million putative SNPs detected, preliminarily filtered to 368,943 SNPs. Further filtering was guided by results of a 144-SNP subset examined with the Illumina GoldenGate® assay on 160 accessions. A 6K Infinium® II array was designed with SNPs evenly spaced genetically across the sweet and sour cherry genomes. SNPs were developed for each sour cherry subgenome by using minor allele frequency in the sour cherry detection panel to enrich for subgenome-specific SNPs followed by targeting to either subgenome according to alleles observed in sweet cherry. The array was evaluated using panels of sweet (n = 269) and sour (n = 330) cherry breeding germplasm. Approximately one third of array SNPs were informative for each crop. A total of 1825 polymorphic SNPs were verified in sweet cherry, 13% of these originally developed for sour cherry. Allele dosage was resolved for 2058 polymorphic SNPs in sour cherry, one third of these being originally developed for sweet cherry. 
This publicly available genomics resource represents a significant advance in cherry genome-scanning capability that will accelerate marker-locus-trait association discovery, genome structure investigation, and genetic diversity assessment in this diploid-tetraploid crop group. PMID:23284615
Evaluation of SNP Data from the Malus Infinium Array Identifies Challenges for Genetic Analysis of Complex Genomes of Polyploid Origin

PubMed Central

Troggio, Michela; Šurbanovski, Nada; Bianco, Luca; Moretto, Marco; Giongo, Lara; Banchi, Elisa; Viola, Roberto; Fernández, Felicidad Fernández; Costa, Fabrizio; Velasco, Riccardo; Cestaro, Alessandro; Sargent, Daniel James

2013-01-01

High throughput arrays for the simultaneous genotyping of thousands of single-nucleotide polymorphisms (SNPs) have made the rapid genetic characterisation of plant genomes and the development of saturated linkage maps a realistic prospect for many plant species of agronomic importance. However, the correct calling of SNP genotypes in divergent polyploid genomes using array technology can be problematic due to paralogy, and to divergence in probe sequences causing changes in probe binding efficiencies. An Illumina Infinium II whole-genome genotyping array was recently developed for the cultivated apple and used to develop a molecular linkage map for an apple rootstock progeny (M432), but a large proportion of segregating SNPs were not mapped in the progeny, due to unexpected genotype clustering patterns. To investigate the causes of this unexpected clustering we performed BLAST analysis of all probe sequences against the ‘Golden Delicious’ genome sequence and discovered evidence for paralogous annealing sites and probe sequence divergence for a high proportion of probes contained on the array. Following visual re-evaluation of the genotyping data generated for 8,788 SNPs for the M432 progeny using the array, we manually re-scored genotypes at 818 loci and mapped a further 797 markers to the M432 linkage map. The newly mapped markers included the majority of those that could not be mapped previously, as well as loci that were previously scored as monomorphic, but which segregated due to divergence leading to heterozygosity in probe annealing sites. An evaluation of the 8,788 probes in a diverse collection of Malus germplasm showed that more than half the probes returned genotype clustering patterns that were difficult or impossible to interpret reliably, highlighting implications for the use of the array in genome-wide association studies. PMID:23826289
Diversity Arrays Technology (DArT) for whole-genome profiling of barley

PubMed Central

Wenzl, Peter; Carling, Jason; Kudrna, David; Jaccoud, Damian; Huttner, Eric; Kleinhofs, Andris; Kilian, Andrzej

2004-01-01

Diversity Arrays Technology (DArT) can detect and type DNA variation at several hundred genomic loci in parallel without relying on sequence information. Here we show that it can be effectively applied to genetic mapping and diversity analyses of barley, a species with a 5,000-Mbp genome. We tested several complexity reduction methods and selected two that generated the most polymorphic genomic representations. Arrays containing individual fragments from these representations generated DArT fingerprints with a genotype call rate of 98.0% and a scoring reproducibility of at least 99.8%. The fingerprints grouped barley lines according to known genetic relationships. To validate the Mendelian behavior of DArT markers, we constructed a genetic map for a cross between cultivars Steptoe and Morex. Nearly all polymorphic array features could be incorporated into one of seven linkage groups (98.8%). The resulting map comprised ≈385 unique DArT markers and spanned 1,137 centimorgans. A comparison with the restriction fragment length polymorphism-based framework map indicated that the quality of the DArT map was equivalent, if not superior, to that of the framework map. These results highlight the potential of DArT as a generic technique for genome profiling in the context of molecular breeding and genomics. PMID:15192146
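The two quality metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes that call rate means the fraction of array features that received a score and that reproducibility means the fraction of concordant scores among features called in both replicates; these definitions, the function names, and the replicate data are assumptions for illustration, not the authors' scoring software.

```python
def call_rate(calls):
    """Fraction of array features that received a genotype call (not None)."""
    return sum(call is not None for call in calls) / len(calls)


def reproducibility(first, second):
    """Fraction of concordant scores among features called in both replicates."""
    paired = [(a, b) for a, b in zip(first, second)
              if a is not None and b is not None]
    return sum(a == b for a, b in paired) / len(paired)


# Hypothetical presence (1) / absence (0) scores; None marks a feature with no call.
replicate_1 = [1, 0, 1, None, 0, 1, 1, 0]
replicate_2 = [1, 0, 1, 1, 0, 1, 0, 0]

rate = call_rate(replicate_1)                           # 7 of 8 features called
agreement = reproducibility(replicate_1, replicate_2)   # 6 of 7 paired scores agree
```

Restricting reproducibility to features called in both replicates keeps missing calls from being counted as disagreements, which is one reasonable convention among several.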
Integrated design, execution, and analysis of arrayed and pooled CRISPR genome-editing experiments.

PubMed

Canver, Matthew C; Haeussler, Maximilian; Bauer, Daniel E; Orkin, Stuart H; Sanjana, Neville E; Shalem, Ophir; Yuan, Guo-Cheng; Zhang, Feng; Concordet, Jean-Paul; Pinello, Luca

2018-05-01

CRISPR (clustered regularly interspaced short palindromic repeats) genome-editing experiments offer enormous potential for the evaluation of genomic loci using arrayed single guide RNAs (sgRNAs) or pooled sgRNA libraries. Numerous computational tools are available to help design sgRNAs with optimal on-target efficiency and minimal off-target potential. In addition, computational tools have been developed to analyze deep-sequencing data resulting from genome-editing experiments. However, these tools are typically developed in isolation and oftentimes are not readily translatable into laboratory-based experiments. Here, we present a protocol that describes in detail both the computational and benchtop implementation of an arrayed and/or pooled CRISPR genome-editing experiment. This protocol provides instructions for sgRNA design with CRISPOR (computational tool for the design, evaluation, and cloning of sgRNA sequences), experimental implementation, and analysis of the resulting high-throughput sequencing data with CRISPResso (computational tool for analysis of genome-editing outcomes from deep-sequencing data). This protocol allows for design and execution of arrayed and pooled CRISPR experiments in 4-5 weeks by non-experts, as well as computational data analysis that can be performed in 1-2 d by both computational and noncomputational biologists alike using web-based and/or command-line versions.
Src-family Tyrosine Kinases in Oogenesis, Oocyte Maturation, and Fertilization: An Evolutionary Perspective

PubMed Central

Kinsey, William H.

2015-01-01

The oocyte is a highly specialized cell poised to respond to fertilization with a unique set of actions needed to recognize and incorporate a single sperm, complete meiosis, reprogram maternal and paternal genomes and assemble them into a unique zygotic genome, and finally initiate the mitotic cell cycle. Oocytes accomplish this diverse series of events through an array of signal transduction pathway components that include a characteristic collection of protein tyrosine kinases. The src-family protein kinases (SFKs) figure importantly in this signaling array, and oocytes characteristically express certain SFKs at high levels to provide for the unique actions that the oocyte must perform. The SFKs typically exhibit a distinct pattern of subcellular localization in oocytes and perform critical functions in different subcellular compartments at different steps during oocyte maturation and fertilization. While many aspects of SFK signaling are conserved among oocytes from different species, significant differences exist in the extent to which src-family-mediated pathways are used by oocytes from species that fertilize externally versus those which are fertilized internally. The observation that several oocyte functions which require SFK signaling appear to represent common points of failure during assisted reproductive techniques in humans highlights the importance of these signaling pathways for human reproductive health. PMID:25030759

Abundant and diverse clustered regularly interspaced short palindromic repeat spacers in Clostridium difficile strains and prophages target multiple phage types within this pathogen.

PubMed

Hargreaves, Katherine R; Flores, Cesar O; Lawley, Trevor D; Clokie, Martha R J

2014-08-26

Clostridium difficile is an important human-pathogenic bacterium causing antibiotic-associated nosocomial infections worldwide. Mobile genetic elements and bacteriophages have helped shape C. difficile genome evolution. In many bacteria, phage infection may be controlled by a form of bacterial immunity called the clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) system. This uses acquired short nucleotide sequences (spacers) to target homologous sequences (protospacers) in phage genomes. C. difficile carries multiple CRISPR arrays, and in this paper we examine the relationships between the host- and phage-carried elements of the system. We detected multiple matches between spacers and regions in 31 C. difficile phage and prophage genomes. A subset of the spacers was located in prophage-carried CRISPR arrays. The CRISPR spacer profiles generated suggest that related phages would have similar host ranges. Furthermore, we show that C. difficile strains of the same ribotype could either have similar or divergent CRISPR contents. Both synonymous and nonsynonymous mutations in the protospacer sequences were identified, as well as differences in the protospacer adjacent motif (PAM), which could explain how phages escape this system. This paper illustrates how the distribution and diversity of CRISPR spacers in C. difficile, and its prophages, could modulate phage predation for this pathogen and impact upon its evolution and pathogenicity. Clostridium difficile is a significant bacterial human pathogen which undergoes continual genome evolution, resulting in the emergence of new virulent strains. Phages are major facilitators of genome evolution in other bacterial species, and we use sequence analysis-based approaches in order to examine whether the CRISPR/Cas system could control these interactions across divergent C. difficile strains. 
The presence of spacer sequences in prophages that are homologous to phage genomes raises an extra level of complexity in this predator-prey microbial system. Our results demonstrate that the impact of phage infection in this system is widespread and that the CRISPR/Cas system is likely to be an important aspect of the evolutionary dynamics in C. difficile. Copyright © 2014 Hargreaves et al.
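The spacer-to-protospacer matching described in this abstract can be sketched as a simple sequence search. The example below scans both strands of a sequence for windows that match a spacer with a limited number of mismatches; it is an illustrative sketch and not the authors' analysis pipeline. The function names, the toy sequences, and the mismatch limit are assumptions, and the sketch does not check for a protospacer adjacent motif (PAM).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_protospacers(spacer, genome, max_mismatches=0):
    """Return (start, strand, mismatches) for each window resembling the spacer."""
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for start in range(len(genome) - len(query) + 1):
            window = genome[start:start + len(query)]
            mismatches = sum(a != b for a, b in zip(query, window))
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return hits


# Hypothetical spacer and toy "phage genome" built from three planted sites:
# an exact copy, a copy with one substitution, and a reverse-complement copy.
spacer = "GATTACAG"
phage_genome = "CC" + "GATTACAG" + "TT" + "GATAACAG" + "AA" + "CTGTAATC" + "GG"

hits = find_protospacers(spacer, phage_genome, max_mismatches=1)
```

Allowing a small number of mismatches reflects the observation in the abstract that protospacers can carry point mutations; a production analysis would more likely use an alignment tool such as BLAST than a sliding window.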
Comparative Genomic Hybridization–Array Analysis Enhances the Detection of Aneuploidies and Submicroscopic Imbalances in Spontaneous Miscarriages

PubMed Central

Schaeffer, Anthony J.; Chung, June; Heretis, Konstantina; Wong, Andrew; Ledbetter, David H.; Lese Martin, Christa

2004-01-01

Miscarriage is a condition that affects 10%–15% of all clinically recognized pregnancies, most of which occur in the first trimester. Approximately 50% of first-trimester miscarriages result from fetal chromosome abnormalities. Currently, G-banded chromosome analysis is used to determine if large-scale genetic imbalances are the cause of these pregnancy losses. This technique relies on the culture of cells derived from the fetus, a technique that has many limitations, including a high rate of culture failure, maternal overgrowth of fetal cells, and poor chromosome morphology. Comparative genomic hybridization (CGH)–array analysis is a powerful new molecular cytogenetic technique that allows genomewide analysis of DNA copy number. By hybridizing patient DNA and normal reference DNA to arrays of genomic clones, unbalanced gains or losses of genetic material across the genome can be detected. In this study, 41 product-of-conception (POC) samples, which were previously analyzed by G-banding, were tested using CGH arrays to determine not only if the array could identify all reported abnormalities, but also whether any previously undetected genomic imbalances would be discovered. The array methodology detected all abnormalities as reported by G-banding analysis and revealed new abnormalities in 4/41 (9.8%) cases. Of those, one trisomy 21 POC was also mosaic for trisomy 20, one had a duplication of the 10q telomere region, one had an interstitial deletion of chromosome 9p, and the fourth had an interstitial duplication of the Prader-Willi/Angelman syndrome region on chromosome 15q, which, if maternally inherited, has been implicated in autism. This retrospective study demonstrates that the DNA-based CGH-array technology overcomes many of the limitations of routine cytogenetic analysis of POC samples while enhancing the detection of fetal chromosome aberrations. PMID:15127362
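As a rough illustration of the copy-number readout described in this abstract, the sketch below converts paired patient and reference signals into log2 ratios and labels each clone as gain, loss, or balanced. The function name, clone identifiers, intensity values, and the 0.3 threshold are illustrative assumptions, not values taken from the study.

```python
import math


def call_copy_number(test, reference, threshold=0.3):
    """Label each clone by the log2 ratio of test to reference signal.

    `threshold` is an assumed cutoff chosen for illustration only.
    """
    calls = {}
    for clone, test_signal in test.items():
        ratio = math.log2(test_signal / reference[clone])
        if ratio > threshold:
            calls[clone] = "gain"
        elif ratio < -threshold:
            calls[clone] = "loss"
        else:
            calls[clone] = "balanced"
    return calls


# Hypothetical intensities mimicking three copies, one copy, and two copies
# in the patient sample against a two-copy reference.
test = {"clone_A": 1500.0, "clone_B": 500.0, "clone_C": 1000.0}
reference = {"clone_A": 1000.0, "clone_B": 1000.0, "clone_C": 1000.0}

calls = call_copy_number(test, reference)
```

Working on the log2 scale makes gains and losses symmetric around zero, which is why a single threshold can be applied in both directions.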
Microarray-Based Comparative Genomic Hybridization Using Sex-Matched Reference DNA Provides Greater Sensitivity for Detection of Sex Chromosome Imbalances than Array-Comparative Genomic Hybridization with Sex-Mismatched Reference DNA

PubMed Central

Yatsenko, Svetlana A.; Shaw, Chad A.; Ou, Zhishuo; Pursley, Amber N.; Patel, Ankita; Bi, Weimin; Cheung, Sau Wai; Lupski, James R.; Chinault, A. Craig; Beaudet, Arthur L.

2009-01-01

In array-comparative genomic hybridization (array-CGH) experiments, the measurement of DNA copy number of sex chromosomal regions depends on the sex of the patient and the reference DNAs used. We evaluated the ability of bacterial artificial chromosomes/P1-derived artificial and oligonucleotide array-CGH analyses to detect constitutional sex chromosome imbalances using sex-mismatched reference DNAs. Twenty-two samples with imbalances involving either the X or Y chromosome, including deletions, duplications, triplications, derivative or isodicentric chromosomes, and aneuploidy, were analyzed. Although concordant results were obtained for approximately one-half of the samples when using sex-mismatched and sex-matched reference DNAs, array-CGH analyses with sex-mismatched reference DNAs did not detect genomic imbalances that were detected using sex-matched reference DNAs in 6 of 22 patients. Small duplications and deletions of the X chromosome were most difficult to detect in female and male patients, respectively, when sex-mismatched reference DNAs were used. Using sex-matched reference DNAs in array-CGH analyses provides optimal sensitivity and enables an automated statistical evaluation for the detection of sex chromosome imbalances when compared with an experimental design using sex-mismatched reference DNAs. Using sex-mismatched reference DNAs in array-CGH analyses may generate false-negative, false-positive, and ambiguous results for sex chromosome-specific probes, thus masking potential pathogenic genomic imbalances. Therefore, to both optimize detection of clinically relevant sex chromosome imbalances and ensure proper experimental performance, we suggest that alternative internal controls be developed and used instead of using sex-mismatched reference DNAs. PMID:19324990
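To make the dependence on reference sex concrete, the following sketch computes the ideal log2 ratio expected for an X-linked probe from the number of X copies in the patient and in the reference. It is a simplified illustration that ignores noise and ratio compression; the function name and the copy-number scenarios are assumptions for illustration rather than data from the study.

```python
import math


def expected_log2_ratio(patient_copies, reference_copies):
    """Ideal log2 ratio for a probe, ignoring noise and ratio compression."""
    return math.log2(patient_copies / reference_copies)


# Normal female patient (two X copies) against a female or a male reference.
matched_baseline = expected_log2_ratio(2, 2)        # 0.0
mismatched_baseline = expected_log2_ratio(2, 1)     # 1.0

# Female patient with an X-linked duplication (three copies of the region).
matched_duplication = expected_log2_ratio(3, 2)     # about 0.585
mismatched_duplication = expected_log2_ratio(3, 1)  # about 1.585
```

In this idealized model, a sex-mismatched reference shifts every normal X probe of the female patient to 1, so the duplicated region has to be recognized against that shifted baseline rather than against zero.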
The African Genome Variation Project shapes medical genetics in Africa

NASA Astrophysics Data System (ADS)

Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O.; Choudhury, Ananyo; Ritchie, Graham R. S.; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N.; Young, Elizabeth H.; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P.; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A.; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S.

2015-01-01

Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.
The African Genome Variation Project shapes medical genetics in Africa.

PubMed

Gurdasani, Deepti; Carstensen, Tommy; Tekola-Ayele, Fasil; Pagani, Luca; Tachmazidou, Ioanna; Hatzikotoulas, Konstantinos; Karthikeyan, Savita; Iles, Louise; Pollard, Martin O; Choudhury, Ananyo; Ritchie, Graham R S; Xue, Yali; Asimit, Jennifer; Nsubuga, Rebecca N; Young, Elizabeth H; Pomilla, Cristina; Kivinen, Katja; Rockett, Kirk; Kamali, Anatoli; Doumatey, Ayo P; Asiki, Gershim; Seeley, Janet; Sisay-Joof, Fatoumatta; Jallow, Muminatou; Tollman, Stephen; Mekonnen, Ephrem; Ekong, Rosemary; Oljira, Tamiru; Bradman, Neil; Bojang, Kalifa; Ramsay, Michele; Adeyemo, Adebowale; Bekele, Endashaw; Motala, Ayesha; Norris, Shane A; Pirie, Fraser; Kaleebu, Pontiano; Kwiatkowski, Dominic; Tyler-Smith, Chris; Rotimi, Charles; Zeggini, Eleftheria; Sandhu, Manjinder S

2015-01-15

Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.
Advances in Cryptococcus genomics: insights into the evolution of pathogenesis.

PubMed

Cuomo, Christina A; Rhodes, Johanna; Desjardins, Christopher A

2018-01-01

Cryptococcus species are the causative agents of cryptococcal meningitis, a significant source of mortality in immunocompromised individuals. Initial work on the molecular epidemiology of this fungal pathogen utilized genotyping approaches to describe the genetic diversity and biogeography of two species, Cryptococcus neoformans and Cryptococcus gattii. Whole genome sequencing of representatives of both species resulted in reference assemblies enabling a wide array of downstream studies and genomic resources. With the increasing availability of whole genome sequencing, both species have now had hundreds of individual isolates sequenced, providing fine-scale insight into the evolution and diversification of Cryptococcus and allowing for the first genome-wide association studies to identify genetic variants associated with human virulence. Sequencing has also begun to examine the microevolution of isolates during prolonged infection and to identify variants specific to outbreak lineages, highlighting the potential role of hyper-mutation in evolving within short time scales. We can anticipate that further advances in sequencing technology and sequencing microbial genomes at scale, including metagenomics approaches, will continue to refine our view of how the evolution of Cryptococcus drives its success as a pathogen.
Africa: continent of genome contrasts with implications for biomedical research and health.

PubMed

Ramsay, Michèle

2012-08-31

The genomic architecture of African populations is poorly understood and there is considerable variation between ethno-linguistic groups. Genome-wide approaches have been extensively applied to search for genetic associations to complex traits in Europeans, but rarely in Africans. This is largely attributed to lower levels of funding, poor infrastructure and public health systems, and to the small pool of trained scientists. High levels of genetic variation and underlying population structure in Africans present significant challenges, but lower levels of linkage disequilibrium provide an opportunity for more effective localisation of causal variants. High throughput technologies, including dense genotyping arrays, genome sequencing and epigenome studies, together with plummeting costs, are making research more affordable, even for African scientists. Understanding the interactions between genome structure and environmental influences is essential to interpreting their contributions to the increase in infectious diseases and non-communicable diseases, exacerbated by adverse environments and lifestyle choices. The unique genome dynamics in African populations have an important role to play in understanding human health and susceptibility to disease. Copyright © 2012. Published by Elsevier B.V.
Optimal design of low-density SNP arrays for genomic prediction: algorithm and applications

USDA-ARS's Scientific Manuscript database

Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for their optimal design. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optim...
Characterization of polyploid wheat genomic diversity using a high-density 90 000 single nucleotide polymorphism array

USDA-ARS's Scientific Manuscript database

High-density single nucleotide polymorphism (SNP) genotyping chips are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships among individuals in populations and studying marker-trait associations in mapping experiments. We developed a genotyping array includ...
Comparison of Constitutional and Replication Stress-Induced Genome Structural Variation by SNP Array and Mate-Pair Sequencing

PubMed Central

Arlt, Martin F.; Ozdemir, Alev Cagla; Birkeland, Shanda R.; Lyons, Robert H.; Glover, Thomas W.; Wilson, Thomas E.

2011-01-01

Copy-number variants (CNVs) are a major source of genetic variation in human health and disease. Previous studies have implicated replication stress as a causative factor in CNV formation. However, existing data are technically limited in the quality of comparisons that can be made between human CNVs and experimentally induced variants. Here, we used two high-resolution strategies—single nucleotide polymorphism (SNP) arrays and mate-pair sequencing—to compare CNVs that occur constitutionally to those that arise following aphidicolin-induced DNA replication stress in the same human cells. Although the optimized methods provided complementary information, sequencing was more sensitive to small variants and provided superior structural descriptions. The majority of constitutional and all aphidicolin-induced CNVs appear to be formed via homology-independent mechanisms, while aphidicolin-induced CNVs were of a larger median size than constitutional events even when mate-pair data were considered. Aphidicolin thus appears to stimulate formation of CNVs that closely resemble human pathogenic CNVs and the subset of larger nonhomologous constitutional CNVs. PMID:21212237
GeneCount: genome-wide calculation of absolute tumor DNA copy numbers from array comparative genomic hybridization data

PubMed Central

Lyng, Heidi; Lando, Malin; Brøvig, Runar S; Svendsrud, Debbie H; Johansen, Morten; Galteland, Eivind; Brustugun, Odd T; Meza-Zepeda, Leonardo A; Myklebost, Ola; Kristensen, Gunnar B; Hovig, Eivind; Stokke, Trond

2008-01-01

Absolute tumor DNA copy numbers can currently be achieved only on a single gene basis by using fluorescence in situ hybridization (FISH). We present GeneCount, a method for genome-wide calculation of absolute copy numbers from clinical array comparative genomic hybridization data. The tumor cell fraction is reliably estimated in the model. Data consistent with FISH results are achieved. We demonstrate significant improvements over existing methods for exploring gene dosages and intratumor copy number heterogeneity in cancers. PMID:18500990
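The idea of recovering absolute copy numbers from array CGH ratios when a sample mixes tumor and normal cells can be illustrated with a generic two-component mixture model. The sketch below is not the published GeneCount model; the formula, the function name, the parameter names, and the example values are all assumptions chosen for illustration.

```python
def tumor_copy_number(ratio, tumor_fraction, tumor_ploidy, normal_copies=2):
    """Solve a two-component mixture model for the tumor copy number at one locus.

    Assumed model: ratio = (f*c + (1-f)*n) / (f*P + (1-f)*n), where f is the
    tumor cell fraction, c the tumor copy number at the locus, P the mean tumor
    ploidy, and n the copy number in normal cells.
    """
    if not 0 < tumor_fraction <= 1:
        raise ValueError("tumor_fraction must be in (0, 1]")
    normal_part = (1 - tumor_fraction) * normal_copies
    mean_copies = tumor_fraction * tumor_ploidy + normal_part
    return (ratio * mean_copies - normal_part) / tumor_fraction


# Hypothetical locus: linear ratio 1.25 in a diploid tumor with 50% tumor cells.
copies = tumor_copy_number(ratio=1.25, tumor_fraction=0.5, tumor_ploidy=2)  # 3.0
```

The example shows why the tumor cell fraction matters: a ratio of 1.25 corresponds to three copies here, whereas the same ratio in a pure tumor sample would correspond to 2.5 copies on average.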
CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics.

PubMed

Gai, Xiaowu; Perin, Juan C; Murphy, Kevin; O'Hara, Ryan; D'arcy, Monica; Wenocur, Adam; Xie, Hongbo M; Rappaport, Eric F; Shaikh, Tamim H; White, Peter S

2010-02-04

Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist. We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV. To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. 
CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and is an ideal platform for coordinating multiple associated projects. Available on the web at: http://sourceforge.net/projects/cnv.
Design of a tobacco exon array with application to investigate the differential cadmium accumulation property in two tobacco varieties

PubMed Central

2012-01-01

Background: For decades the tobacco plant has served as a model organism in plant biology to answer fundamental biological questions in the areas of plant development, physiology, and genetics. Due to the lack of sufficient coverage of genomic sequences, however, none of the expressed sequence tag (EST)-based chips developed to date cover gene expression from the whole genome. The availability of Tobacco Genome Initiative (TGI) sequences provides a useful resource to build a whole genome exon array, even if the assembled sequences are highly fragmented. Here, the design of a Tobacco Exon Array is reported and an application to improve the understanding of genes regulated by cadmium (Cd) in tobacco is described. Results: From the analysis and annotation of the 1,271,256 Nicotiana tabacum FASTA and quality files from methyl filtered genomic survey sequences (GSS) obtained from the TGI and ~56,000 ESTs available in public databases, an exon array with 272,342 probesets was designed (four probes per exon) and tested on two selected tobacco varieties. Two tobacco varieties out of 45 accumulating low and high cadmium in leaf were identified based on the GGE biplot analysis, which is analysis of the genotype main effect (G) plus analysis of the genotype by environment interaction (GE) of eight field trials (four fields over two years) showing reproducibility across the trials. The selected varieties were grown under greenhouse conditions in two different soils and subjected to exon array analyses using root and leaf tissues to understand the genetic make-up of the Cd accumulation. Conclusions: An Affymetrix Exon Array was developed to cover a large (~90%) proportion of the tobacco gene space. The Tobacco Exon Array will be available for research use through Affymetrix array catalogue. As a proof of the exon array usability, we have demonstrated that the Tobacco Exon Array is a valuable tool for studying Cd accumulation in tobacco leaves. 
Data from field and greenhouse experiments supported by gene expression studies strongly suggested that the difference in leaf Cd accumulation between the two specific tobacco cultivars is dependent solely on genetic factors and genetic variability rather than on the environment. PMID:23190529
Genomics and Genetics in the Biology of Adaptation to Exercise

PubMed Central

Bouchard, Claude; Rankinen, Tuomo; Timmons, James A.

2014-01-01

This chapter is devoted to the role of genetic variation and gene-exercise interactions in the biology of adaptation to exercise. There is evidence from genetic epidemiology research that DNA sequence differences contribute to human variation in physical activity level, cardiorespiratory fitness in the untrained state, cardiovascular and metabolic response to acute exercise, and responsiveness to regular exercise. Methodological and technological advances have made it possible to undertake the molecular dissection of the genetic component of complex, multifactorial traits, such as those of interest to exercise biology, in terms of tissue expression profile, genes, and allelic variants. The evidence from animal models and human studies is considered. Data on candidate genes, genome-wide linkage results, genome-wide association findings, expression arrays, and combinations of these approaches are reviewed. Combining transcriptomic and genomic technologies has been shown to be more powerful as evidenced by the development of a recent molecular predictor of the ability to increase VO2max with exercise training. For exercise as a behavior and physiological fitness as a state to be major players in public health policies will require that the role of human individuality and the influence of DNA sequence differences be understood. Likewise, progress in the use of exercise in therapeutic medicine will depend to a large extent on our ability to identify the favorable responders for given physiological properties to a given exercise regimen. PMID:23733655
HEROD: a human ethnic and regional specific omics database.

PubMed

Zeng, Xian; Tao, Lin; Zhang, Peng; Qin, Chu; Chen, Shangying; He, Weidong; Tan, Ying; Xia Liu, Hong; Yang, Sheng Yong; Chen, Zhe; Jiang, Yu Yang; Chen, Yu Zong

2017-10-15

Genetic and gene expression variations within and between populations and across geographical regions have substantial effects on the biological phenotypes, diseases, and therapeutic response. The development of precision medicines can be facilitated by the OMICS studies of the patients of specific ethnicity and geographic region. However, there is an inadequate facility for broadly and conveniently accessing the ethnic and regional specific OMICS data. Here, we introduced a new free database, HEROD, a human ethnic and regional specific OMICS database. Its first version contains the gene expression data of 53 070 patients of 169 diseases in seven ethnic populations from 193 cities/regions in 49 nations curated from the Gene Expression Omnibus (GEO), the ArrayExpress Archive of Functional Genomics Data (ArrayExpress), the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC). Geographic region information of curated patients was mainly manually extracted from referenced publications of each original study. These data can be accessed and downloaded via keyword search, World map search, and menu-bar search of disease name, the international classification of disease code, geographical region, location of sample collection, ethnic population, gender, age, sample source organ, patient type (patient or healthy), sample type (disease or normal tissue) and assay type on the web interface. The HEROD database is freely accessible at http://bidd2.nus.edu.sg/herod/index.php. The database and web interface are implemented in MySQL, PHP and HTML with all major browsers supported. © The Author (2017). Published by Oxford University Press.
Using peptide array to identify binding motifs and interaction networks for modular domains.

PubMed

Li, Shawn S-C; Wu, Chenggang

2009-01-01

Specific protein-protein interactions underlie all essential biological processes and form the basis of cellular signal transduction. The recognition of a short, linear peptide sequence in one protein by a modular domain in another represents a common theme of macromolecular recognition in cells, and the importance of this mode of protein-protein interaction is highlighted by the large number of peptide-binding domains encoded by the human genome. This phenomenon also provides a unique opportunity to identify protein-protein binding events using peptide arrays and complementary biochemical assays. Accordingly, high-density peptide array has emerged as a useful tool by which to map domain-mediated protein-protein interaction networks at the proteome level. Using the Src-homology 2 (SH2) and 3 (SH3) domains as examples, we describe the application of oriented peptide array libraries in uncovering specific motifs recognized by an SH2 domain and the use of high-density peptide arrays in identifying interaction networks mediated by the SH3 domain. Methods reviewed here could also be applied to other modular domains, including catalytic domains, that recognize linear peptide sequences.
Identification of Genes Promoting Skin Youthfulness by Genome-Wide Association Study

PubMed Central

Chang, Anne L.S.; Atzmon, Gil; Bergman, Aviv; Brugmann, Samantha; Atwood, Scott X; Chang, Howard Y; Barzilai, Nir

2014-01-01

To identify genes that promote facial skin youthfulness (SY), a genome-wide association study on an Ashkenazi Jewish discovery group (n=428) was performed using Affymetrix 6.0 Single-Nucleotide Polymorphism (SNP) Array. After SNP quality controls, 901,470 SNPs remained for analysis. The eigenstrat method showed no stratification. Cases and controls were identified by global facial skin aging severity including intrinsic and extrinsic parameters. Linear regression adjusted for age and gender, with no significant differences in smoking history, body mass index, menopausal status, or personal or family history of centenarians. Six SNPs met the Bonferroni threshold with P_allele < 10^-8; two of these six had P_genotype < 10^-8. Quantitative trait loci mapping confirmed linkage disequilibrium. The six SNPs were interrogated by MassARRAY in a replication group (n=436) with confirmation of rs6975107, an intronic region of KCND2 (potassium voltage-gated channel, Shal-related family member 2) (P_genotype = 0.023). A second replication group (n=371) confirmed rs318125, downstream of DIAPH2 (diaphanous homolog 2 (Drosophila)) (P_allele = 0.010, P_genotype = 0.002) and rs7616661, downstream of EDEM1 (ER degradation enhancer, mannosidase α-like 1) (P_genotype = 0.042). DIAPH2 has been associated with premature ovarian insufficiency, an aging phenotype in humans. EDEM1 associates with lifespan in animal models, although not humans. KCND2 is expressed in human skin, but has not been associated with aging. These genes represent new candidate genes to study the molecular basis of healthy skin aging. PMID:24037343
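As a rough illustration of the Bonferroni threshold mentioned in this abstract, the cutoff for 901,470 tests can be computed directly. This is a minimal sketch: the family-wise error rate of 0.05 is an assumption (the abstract does not state the value used), and the function name is hypothetical.

```python
# Bonferroni-corrected per-SNP significance cutoff.
# ALPHA = 0.05 is an ASSUMED family-wise error rate; the abstract does not state it.
ALPHA = 0.05
N_SNPS = 901_470  # SNPs remaining after quality control (from the abstract)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_SNPS)
print(f"Bonferroni threshold: {threshold:.2e}")  # about 5.55e-08
```

Under that assumed error rate, the reported discovery p-values below 10^-8 would fall under the cutoff.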
Engineering customized TALE nucleases (TALENs) and TALE transcription factors by fast ligation-based automatable solid-phase high-throughput (FLASH) assembly.

PubMed

Reyon, Deepak; Maeder, Morgan L; Khayter, Cyd; Tsai, Shengdar Q; Foley, Jonathan E; Sander, Jeffry D; Joung, J Keith

2013-07-01

Customized DNA-binding domains made using transcription activator-like effector (TALE) repeats are rapidly growing in importance as widely applicable research tools. TALE nucleases (TALENs), composed of an engineered array of TALE repeats fused to the FokI nuclease domain, have been used successfully for directed genome editing in various organisms and cell types. TALE transcription factors (TALE-TFs), consisting of engineered TALE repeat arrays linked to a transcriptional regulatory domain, have been used to up- or downregulate expression of endogenous genes in human cells and plants. This unit describes a detailed protocol for the recently described fast ligation-based automatable solid-phase high-throughput (FLASH) assembly method. FLASH enables automated high-throughput construction of engineered TALE repeats using an automated liquid handling robot or manually using a multichannel pipet. Using the automated approach, a single researcher can construct up to 96 DNA fragments encoding TALE repeat arrays of various lengths in a single day, and then clone these to construct sequence-verified TALEN or TALE-TF expression plasmids in a week or less. Plasmids required for FLASH are available by request from the Joung lab (http://eGenome.org). This unit also describes improvements to the Zinc Finger and TALE Targeter (ZiFiT Targeter) web server (http://ZiFiT.partners.org) that facilitate the design and construction of FLASH TALE repeat arrays in high throughput. © 2013 by John Wiley & Sons, Inc.
Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index.

PubMed

Yang, Jian; Bakshi, Andrew; Zhu, Zhihong; Hemani, Gibran; Vinkhuyzen, Anna A E; Lee, Sang Hong; Robinson, Matthew R; Perry, John R B; Nolte, Ilja M; van Vliet-Ostaptchouk, Jana V; Snieder, Harold; Esko, Tonu; Milani, Lili; Mägi, Reedik; Metspalu, Andres; Hamsten, Anders; Magnusson, Patrik K E; Pedersen, Nancy L; Ingelsson, Erik; Soranzo, Nicole; Keller, Matthew C; Wray, Naomi R; Goddard, Michael E; Visscher, Peter M

2015-10-01

We propose a method (GREML-LDMS) to estimate heritability for human complex traits in unrelated individuals using whole-genome sequencing data. We demonstrate using simulations based on whole-genome sequencing data that ∼97% and ∼68% of variation at common and rare variants, respectively, can be captured by imputation. Using the GREML-LDMS method, we estimate from 44,126 unrelated individuals that all ∼17 million imputed variants explain 56% (standard error (s.e.) = 2.3%) of variance for height and 27% (s.e. = 2.5%) of variance for body mass index (BMI), and we find evidence that height- and BMI-associated variants have been under natural selection. Considering the imperfect tagging of imputation and potential overestimation of heritability from previous family-based studies, heritability is likely to be 60-70% for height and 30-40% for BMI. Therefore, the missing heritability is small for both traits. For further discovery of genes associated with complex traits, a study design with SNP arrays followed by imputation is more cost-effective than whole-genome sequencing at current prices.
Genome-wide transcriptional profiling of human glioblastoma cells in response to ITE treatment.

PubMed

Kang, Bo; Zhou, Yanwen; Zheng, Min; Wang, Ying-Jie

2015-09-01

The aryl hydrocarbon receptor (AhR), a ligand-activated transcription factor, has recently been shown to play a key role in embryogenesis and tumorigenesis (Feng et al. [1], Safe et al. [2]), and 2-(1'H-indole-3'-carbonyl)-thiazole-4-carboxylic acid methyl ester (ITE) (Song et al. [3]) is an endogenous AhR ligand that possesses anti-tumor activity. In order to gain insights into how ITE acts via the AhR in embryogenesis and tumorigenesis, we analyzed the genome-wide transcriptional profiles of the following three groups of cells: the human glioblastoma U87 parental cells, U87 tumor sphere cells treated with vehicle (DMSO) and U87 tumor sphere cells treated with ITE. Here, we provide the details of the sample gathering strategy and show the quality controls and the analyses associated with our gene array data deposited into the Gene Expression Omnibus (GEO) under the accession code of GSE67986.

The druggable genome and support for target identification and validation in drug development.

PubMed

Finan, Chris; Gaulton, Anna; Kruger, Felix A; Lumbers, R Thomas; Shah, Tina; Engmann, Jorgen; Galver, Luana; Kelley, Ryan; Karlsson, Anneli; Santos, Rita; Overington, John P; Hingorani, Aroon D; Casas, Juan P

2017-03-29

Target identification (determining the correct drug targets for a disease) and target validation (demonstrating an effect of target perturbation on disease biomarkers and disease end points) are important steps in drug development. Clinically relevant associations of variants in genes encoding drug targets model the effect of modifying the same targets pharmacologically. To delineate drug development (including repurposing) opportunities arising from this paradigm, we connected complex disease- and biomarker-associated loci from genome-wide association studies to an updated set of genes encoding druggable human proteins, to agents with bioactivity against these targets, and, where there were licensed drugs, to clinical indications. We used this set of genes to inform the design of a new genotyping array, which will enable association studies of druggable genes for drug target selection and validation in human disease. Copyright © 2017, American Association for the Advancement of Science.
The genetic prehistory of southern Africa.

PubMed

Pickrell, Joseph K; Patterson, Nick; Barbieri, Chiara; Berthold, Falko; Gerlach, Linda; Güldemann, Tom; Kure, Blesswell; Mpoloka, Sununguko Wata; Nakagawa, Hirosi; Naumann, Christfried; Lipson, Mark; Loh, Po-Ru; Lachance, Joseph; Mountain, Joanna; Bustamante, Carlos D; Berger, Bonnie; Tishkoff, Sarah A; Henn, Brenna M; Stoneking, Mark; Reich, David; Pakendorf, Brigitte

2012-01-01

Southern and eastern African populations that speak non-Bantu languages with click consonants are known to harbour some of the most ancient genetic lineages in humans, but their relationships are poorly understood. Here, we report data from 23 populations analysed at over half a million single-nucleotide polymorphisms, using a genome-wide array designed for studying human history. The southern African Khoisan fall into two genetic groups, loosely corresponding to the northwestern and southeastern Kalahari, which we show separated within the last 30,000 years. We find that all individuals derive at least a few percent of their genomes from admixture with non-Khoisan populations that began ∼1,200 years ago. In addition, the East African Hadza and Sandawe derive a fraction of their ancestry from admixture with a population related to the Khoisan, supporting the hypothesis of an ancient link between southern and eastern Africa.
Whole Genome Amplification of Labeled Viable Single Cells Suited for Array-Comparative Genomic Hybridization.

PubMed

Kroneis, Thomas; El-Heliebi, Amin

2015-01-01

Understanding details of a complex biological system makes it necessary to dismantle it down to its components. Immunostaining techniques allow identification of several distinct cell types thereby giving an inside view of intercellular heterogeneity. Often staining reveals that the most remarkable cells are the rarest. To further characterize the target cells on a molecular level, single cell techniques are necessary. Here, we describe the immunostaining, micromanipulation, and whole genome amplification of single cells for the purpose of genomic characterization. First, we exemplify the preparation of cell suspensions from cultured cells as well as the isolation of peripheral mononucleated cells from blood. The target cell population is then subjected to immunostaining. After cytocentrifugation target cells are isolated by micromanipulation and forwarded to whole genome amplification. For whole genome amplification, we use GenomePlex(®) technology allowing downstream genomic analysis such as array-comparative genomic hybridization.
CRISPRDetect: A flexible algorithm to define CRISPR arrays.

PubMed

Biswas, Ambarish; Staals, Raymond H J; Morales, Sergio E; Fineran, Peter C; Brown, Chris M

2016-05-17

CRISPR (clustered regularly interspaced short palindromic repeats) RNAs provide the specificity for noncoding RNA-guided adaptive immune defence systems in prokaryotes. CRISPR arrays consist of repeat sequences separated by specific spacer sequences. CRISPR arrays have previously been identified in a large proportion of prokaryotic genomes. However, currently available detection algorithms do not utilise recently discovered features regarding CRISPR loci. We have developed a new approach to automatically detect, predict and interactively refine CRISPR arrays. It is available as a web program and command line from bioanalysis.otago.ac.nz/CRISPRDetect. CRISPRDetect discovers putative arrays, extends the array by detecting additional variant repeats, corrects the direction of arrays, refines the repeat/spacer boundaries, and annotates different types of sequence variations (e.g. insertion/deletion) in near identical repeats. Due to these features, CRISPRDetect has significant advantages when compared to existing identification tools. As well as further support for small medium and large repeats, CRISPRDetect identified a class of arrays with 'extra-large' repeats in bacteria (repeats 44-50 nt). The CRISPRDetect output is integrated with other analysis tools. Notably, the predicted spacers can be directly utilised by CRISPRTarget to predict targets. CRISPRDetect enables more accurate detection of arrays and spacers and its gff output is suitable for inclusion in genome annotation pipelines and visualisation. It has been used to analyse all complete bacterial and archaeal reference genomes.
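The repeat/spacer layout described in this abstract can be illustrated with a toy example. The sketch below is not the CRISPRDetect algorithm: it assumes the repeat sequence is already known and matches it exactly, whereas the abstract describes detection of variant repeats and boundary refinement. The function name and the sequences are made up for illustration.

```python
def split_array(sequence: str, repeat: str) -> list[str]:
    """Return the spacers lying between consecutive exact copies of `repeat`."""
    if not repeat:
        raise ValueError("repeat must be non-empty")
    # Collect the start position of every non-overlapping exact repeat copy.
    positions = []
    start = sequence.find(repeat)
    while start != -1:
        positions.append(start)
        start = sequence.find(repeat, start + len(repeat))
    # A spacer is the stretch between the end of one repeat and the start of the next.
    return [
        sequence[left + len(repeat):right]
        for left, right in zip(positions, positions[1:])
    ]


# Made-up toy sequence: three copies of a repeat separated by two spacers.
REPEAT = "GTTTTAGAGCTA"
toy = "AAAC" + REPEAT + "ACGTACGTAC" + REPEAT + "TTGACCTGAA" + REPEAT + "GGG"
spacers = split_array(toy, REPEAT)
print(spacers)  # ['ACGTACGTAC', 'TTGACCTGAA']
```

Exact matching keeps the example short. A real detector also has to tolerate mismatches and insertions or deletions within repeats, as the abstract notes.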
Comprehensive evaluation of genome-wide 5-hydroxymethylcytosine profiling approaches in human DNA.

PubMed

Skvortsova, Ksenia; Zotenko, Elena; Luu, Phuc-Loi; Gould, Cathryn M; Nair, Shalima S; Clark, Susan J; Stirzaker, Clare

2017-01-01

The discovery that 5-methylcytosine (5mC) can be oxidized to 5-hydroxymethylcytosine (5hmC) by the ten-eleven translocation (TET) proteins has prompted wide interest in the potential role of 5hmC in reshaping the mammalian DNA methylation landscape. The gold-standard bisulphite conversion technologies to study DNA methylation do not distinguish between 5mC and 5hmC. However, new approaches to mapping 5hmC genome-wide have advanced rapidly, although it is unclear how the different methods compare in accurately calling 5hmC. In this study, we provide a comparative analysis on brain DNA using three 5hmC genome-wide approaches, namely whole-genome bisulphite/oxidative bisulphite sequencing (WG Bis/OxBis-seq), Infinium HumanMethylation450 BeadChip arrays coupled with oxidative bisulphite (HM450K Bis/OxBis) and antibody-based immunoprecipitation and sequencing of hydroxymethylated DNA (hMeDIP-seq). We also perform loci-specific TET-assisted bisulphite sequencing (TAB-seq) for validation of candidate regions. We show that whole-genome single-base resolution approaches are advantaged in providing precise 5hmC values but require high sequencing depth to accurately measure 5hmC, as this modification is commonly in low abundance in mammalian cells. HM450K arrays coupled with oxidative bisulphite provide a cost-effective representation of 5hmC distribution, at CpG sites with 5hmC levels >~10%. However, 5hmC analysis is restricted to the genomic location of the probes, which is an important consideration as 5hmC modification is commonly enriched at enhancer elements. Finally, we show that the widely used hMeDIP-seq method provides an efficient genome-wide profile of 5hmC and shows high correlation with WG Bis/OxBis-seq 5hmC distribution in brain DNA. However, in cell line DNA with low levels of 5hmC, hMeDIP-seq-enriched regions are not detected by WG Bis/OxBis or HM450K, either suggesting misinterpretation of 5hmC calls by hMeDIP or lack of sensitivity of the latter methods. 
We highlight both the advantages and caveats of three commonly used genome-wide 5hmC profiling technologies and show that interpretation of 5hmC data can be significantly influenced by the sensitivity of methods used, especially as the levels of 5hmC are low and vary in different cell types and different genomic locations.
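Standard bisulphite conversion reads 5mC and 5hmC together, while oxidative bisulphite reads 5mC alone, so a per-site 5hmC estimate is commonly taken as the difference between the two measurements. The sketch below illustrates that subtraction under stated assumptions: the probe names and methylation values are made up, clipping negative differences to zero is a simplification rather than the authors' pipeline, and the 0.10 cutoff only mirrors the approximate 10% level mentioned for HM450K arrays.

```python
def estimate_5hmc(beta_bs: float, beta_oxbs: float) -> float:
    """Estimate the 5hmC fraction at one CpG site as BS minus oxBS methylation.

    beta_bs   : methylation fraction after bisulphite (5mC + 5hmC)
    beta_oxbs : methylation fraction after oxidative bisulphite (5mC only)
    Negative differences (measurement noise) are clipped to zero; this is an
    ASSUMED simplification for illustration.
    """
    for beta in (beta_bs, beta_oxbs):
        if not 0.0 <= beta <= 1.0:
            raise ValueError("methylation fractions must lie in [0, 1]")
    return max(0.0, beta_bs - beta_oxbs)


# Hypothetical probes with made-up (BS, oxBS) methylation fractions.
probes = {
    "probe_A": (0.80, 0.55),
    "probe_B": (0.30, 0.28),
    "probe_C": (0.10, 0.12),
}

DETECTION_CUTOFF = 0.10  # illustrative, echoing the ~10% level in the abstract

for name, (bs, oxbs) in probes.items():
    level = estimate_5hmc(bs, oxbs)
    status = "above" if level > DETECTION_CUTOFF else "below"
    print(f"{name}: 5hmC ~ {level:.2f} ({status} cutoff)")
```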
Characterization of a chromosome-specific chimpanzee alpha satellite subset: Evolutionary relationship to subsets on human chromosomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Warburton, P.E.; Gosden, J.; Lawson, D.

1996-04-15

Alpha satellite DNA is a tandemly repeated DNA family found at the centromeres of all primate chromosomes examined. The fundamental repeat units of alpha satellite DNA are diverged 169- to 172-bp monomers, often found to be organized in chromosome-specific higher-order repeat units. The chromosomes of human (Homo sapiens (HSA)), chimpanzee (Pan troglodytes (PTR) and Pan paniscus), and gorilla (Gorilla gorilla) share a remarkable similarity and synteny. It is of interest to ask if alpha satellite arrays at centromeres of homologous chromosomes between these species are closely related (evolving in an orthologous manner) or if the evolutionary processes that homogenize and spread these arrays within and between chromosomes result in nonorthologous evolution of arrays. By using PCR primers specific for human chromosome 17-specific alpha satellite DNA, we have amplified, cloned, and characterized a chromosome-specific subset from the PTR chimpanzee genome. Hybridization both on Southern blots and in situ as well as sequence analysis show that this subset is most closely related, as expected, to sequences on HSA 17. However, in situ hybridization reveals that this subset is not found on the homologous chromosome in chimpanzee (PTR 19), but instead on PTR 12, which is homologous to HSA 2p. 40 refs., 3 figs.
Novel Array-Based Target Identification for Synergistic Sensitization of Breast Cancer to Herceptin

DTIC Science & Technology

2010-05-01

cancer cell lines and expressed in human breast tumors. Oncotarget, (submitted). Abstract Farah Rahmatpanah, Zhenyu Jia, Tatsuya Azum, Eileen Adamson...Michael McClelland, Eileen Adamson, Dan Mercola. Egr1 regulates the coordinated expression of numerous EGF receptor target genes as identified by...ChIP on chip. Genome Biology 2008, 9:R166 [Epub ahead of print]. Jun Hayakawa, Shalu Mittal, Yipeng Wang, Kemal Korkmaz, Mashide Ohmichi, Eileen
Simultaneous Profiling of DNA Mutation and Methylation by Melting Analysis Using Magnetoresistive Biosensor Array.

PubMed

Rizzi, Giovanni; Lee, Jung-Rok; Dahl, Christina; Guldberg, Per; Dufva, Martin; Wang, Shan X; Hansen, Mikkel F

2017-09-26

Epigenetic modifications, in particular DNA methylation, are gaining increasing interest as complementary information to DNA mutations for cancer diagnostics and prognostics. We introduce a method to simultaneously profile DNA mutation and methylation events for an array of sites with single site specificity. Genomic (mutation) or bisulphite-treated (methylation) DNA is amplified using nondiscriminatory primers, and the amplicons are then hybridized to a giant magnetoresistive (GMR) biosensor array followed by melting curve measurements. The GMR biosensor platform offers scalable multiplexed detection of DNA hybridization, which is insensitive to temperature variation. The melting curve approach further enhances the assay specificity and tolerance to variations in probe length. We demonstrate the utility of this method by simultaneously profiling five mutation and four methylation sites in human melanoma cell lines. The method correctly identified all mutation and methylation events and further provided quantitative assessment of methylation density validated by bisulphite pyrosequencing.
A Universal Genome Array and Transcriptome Atlas for Brachypodium Distachyon

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mockler, Todd

Brachypodium distachyon is the premier experimental model grass platform and is related to candidate feedstock crops for bioethanol production. Based on the DOE-JGI Brachypodium Bd21 genome sequence and annotation we designed a whole genome DNA microarray platform. The quality of this array platform is unprecedented due to the exceptional quality of the Brachypodium genome assembly and annotation and the stringent probe selection criteria employed in the design. We worked with members of the international community and the bioinformatics/design team at Affymetrix at all stages in the development of the array. We used the Brachypodium arrays to interrogate the transcriptomes of plants grown under a variety of environmental conditions, including diurnal and circadian light/temperature conditions. We examined the transcriptional responses of Brachypodium seedlings subjected to various abiotic stresses including heat, cold, salt, and high intensity light. We generated a gene expression atlas representing various organs and developmental stages. The results of these efforts, including all microarray datasets, are published and available in online public databases.
The Landscape of Somatic Chromosomal Copy Number Aberrations in GEM Models of Prostate Carcinoma

PubMed Central

Bianchi-Frias, Daniella; Hernandez, Susana A.; Coleman, Roger; Wu, Hong; Nelson, Peter S.

2015-01-01

Human prostate cancer (PCa) is known to harbor recurrent genomic aberrations consisting of chromosomal losses, gains, rearrangements and mutations that involve oncogenes and tumor suppressors. Genetically engineered mouse (GEM) models have been constructed to assess the causal role of these putative oncogenic events and provide molecular insight into disease pathogenesis. While GEM models generally initiate neoplasia by manipulating a single gene, expression profiles of GEM tumors typically comprise hundreds of transcript alterations. It is unclear whether these transcriptional changes represent the pleiotropic effects of single oncogenes, and/or cooperating genomic or epigenomic events. Therefore, it was determined whether structural chromosomal alterations occur in GEM models of PCa and whether the changes are concordant with human carcinomas. Whole genome array-based comparative genomic hybridization (CGH) was used to identify somatic chromosomal copy number aberrations (SCNAs) in the widely used TRAMP, Hi-Myc, Pten-null and LADY GEM models. Interestingly, very few SCNAs were identified and the genomic architecture of Hi-Myc, Pten-null and LADY tumors was essentially identical to the germline. TRAMP neuroendocrine carcinomas contained SCNAs, which comprised three recurrent aberrations including a single copy loss of chromosome 19 (encoding Pten). In contrast, cell lines derived from the TRAMP, Hi-Myc, and Pten-null tumors were notable for numerous SCNAs that included copy gains of chromosome 15 (encoding Myc) and losses of chromosome 11 (encoding p53). PMID:25298407
Stability of the human sperm DNA methylome to folic acid fortification and short-term supplementation

PubMed Central

Chan, D.; McGraw, S.; Klein, K.; Wallock, L.M.; Konermann, C.; Plass, C.; Chan, P.; Robaire, B.; Jacob, R.A.; Greenwood, C.M.T.; Trasler, J.M.

2017-01-01

STUDY QUESTION Do short-term and long-term exposures to low-dose folic acid supplementation alter DNA methylation in sperm? SUMMARY ANSWER No alterations in sperm DNA methylation patterns were found following the administration of low-dose folic acid supplements of 400 μg/day for 90 days (short-term exposure) or when pre-fortification of food with folic acid and post-fortification sperm samples (long-term exposure) were compared. WHAT IS KNOWN ALREADY Excess dietary folate may be detrimental to health and DNA methylation profiles due to folate's role in one-carbon metabolism and the formation of S-adenosyl methionine, the universal methyl donor. DNA methylation patterns are established in developing male germ cells and have been suggested to be affected by high-dose (5 mg/day) folic acid supplementation. STUDY DESIGN, SIZE, DURATION This is a control versus treatment study where genome-wide sperm DNA methylation patterns were examined prior to fortification of food (1996–1997) in men with no history of infertility at baseline and following 90-day exposure to placebo (n = 9) or supplement containing 400 μg folic acid/day (n = 10). Additionally, pre-fortification sperm DNA methylation profiles (n = 19) were compared with those of a group of post-fortification (post-2004) men (n = 8) who had been exposed for several years to dietary folic acid fortification. PARTICIPANTS/MATERIALS, SETTING, METHODS Blood and seminal plasma folate levels were measured in participants before and following the 90-day treatment with placebo or supplement. Sperm DNA methylation was assessed using the whole-genome and genome-wide techniques, MassArray epityper, restriction landmark genomic scanning, methyl-CpG immunoprecipitation and Illumina HumanMethylation450 Bead Array. MAIN RESULTS AND THE ROLE OF CHANCE Following treatment, supplemented individuals had significantly higher levels of blood and seminal plasma folates compared to placebo. 
Initial first-generation genome-wide analyses of sperm DNA methylation showed little evidence of changes when comparing pre- and post-treatment samples. With Illumina HumanMethylation450 BeadChip arrays, no significant changes were observed in individual probes following low-level supplementation; when compared with those of the post-fortification cohort, there were also few differences in methylation despite exposure to years of fortified foods. LARGE SCALE DATA Illumina HumanMethylation450 BeadChip data from this study have been submitted to the NCBI Gene Expression Omnibus under the accession number GSE89781. LIMITATIONS, REASONS FOR CAUTION This study was limited to the number of participants available in each cohort, in particular those who were not exposed to early (pre-1998) fortification of food with folic acid. While genome-wide DNA methylation was assessed with several techniques that targeted genic and CpG-rich regions, intergenic regions were less well interrogated. WIDER IMPLICATIONS OF THE FINDINGS Overall, our findings provide evidence that short-term exposure to low-dose folic acid supplements of 400 μg/day, over a period of 3 months, a duration of time that might occur during infertility treatments, has no major impact on the sperm DNA methylome. STUDY FUNDING/COMPETING INTERESTS This work was supported by a grant to J.M.T. from the Canadian Institutes of Health Research (CIHR: MOP-89944). The authors have no conflicts of interest to declare. PMID:27994001
Marker chromosome genomic structure and temporal origin implicate a chromoanasynthesis event in a family with pleiotropic psychiatric phenotypes.

PubMed

Grochowski, Christopher M; Gu, Shen; Yuan, Bo; Tcw, Julia; Brennand, Kristen J; Sebat, Jonathan; Malhotra, Dheeraj; McCarthy, Shane; Rudolph, Uwe; Lindstrand, Anna; Chong, Zechen; Levy, Deborah L; Lupski, James R; Carvalho, Claudia M B

2018-04-25

Small supernumerary marker chromosomes (sSMC) are chromosomal fragments difficult to characterize genomically. Here, we detail a proband with schizoaffective disorder and a mother with bipolar disorder with psychotic features who present with a marker chromosome that segregates with disease. We explored the architecture of this marker and investigated its temporal origin. Array comparative genomic hybridization (aCGH) analysis revealed three duplications and three triplications that spanned the short arm of chromosome 9, suggestive of a chromoanasynthesis-like event. Segregation of marker genotypes, phased using sSMC mosaicism in the mother, provided evidence that it was generated during a germline-level event in the proband's maternal grandmother. Whole-genome sequencing (WGS) was performed to resolve the structure and junctions of the chromosomal fragments, revealing further complexities. While structural variations have been previously associated with neuropsychiatric disorders and marker chromosomes, here we detail the precise architecture, human life-cycle genesis, and propose a DNA replicative/repair mechanism underlying formation. © 2018 Wiley Periodicals, Inc.
Development of a dense SNP-based linkage map of an apple rootstock progeny using the Malus Infinium whole genome genotyping array.

PubMed

Antanaviciute, Laima; Fernández-Fernández, Felicidad; Jansen, Johannes; Banchi, Elisa; Evans, Katherine M; Viola, Roberto; Velasco, Riccardo; Dunwell, Jim M; Troggio, Michela; Sargent, Daniel J

2012-05-25

A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the 'Golden Delicious' genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the 'Golden Delicious' pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence. We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. 
The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been assigned erroneous positions on the 'Golden Delicious' reference sequence will assist in the continued improvement of the genome sequence assembly for that variety.
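The marker density and percentages quoted in this abstract can be re-derived from the reported counts. The Python check below is only an illustrative sketch. The variable names are ours, and it assumes that marker density is the map length divided by all mapped loci (SNPs, SSRs and the S-locus), which the abstract does not state explicitly.

```python
# Counts reported in the abstract above.
MAP_LENGTH_CM = 1282.2
SNP_MARKERS = 2272
SSR_MARKERS = 306
S_LOCUS = 1
MALUS_SNPS_ON_ARRAY = 7867


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: density = map length / all mapped loci (not stated in the abstract).
total_loci = SNP_MARKERS + SSR_MARKERS + S_LOCUS
density_cm_per_marker = MAP_LENGTH_CM / total_loci

het_one_parent = percent(1823, MALUS_SNPS_ON_ARRAY)
het_both_parents = percent(1007, MALUS_SNPS_ON_ARRAY)
conflicting = percent(311, SNP_MARKERS)

print(f"{density_cm_per_marker:.1f} cM/marker")
print(het_one_parent, het_both_parents, conflicting)
```

Under that assumption the computed values agree with the 0.5 cM/marker, 23.2%, 12.8% and 13.7% given in the abstract.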
Tumor Touch Imprints as Source for Whole Genome Analysis of Neuroblastoma Tumors

PubMed Central

Brunner, Clemens; Brunner-Herglotz, Bettina; Ziegler, Andrea; Frech, Christian; Amann, Gabriele; Ladenstein, Ruth; Ambros, Inge M.; Ambros, Peter F.

2016-01-01

Introduction Tumor touch imprints (TTIs) are routinely used for the molecular diagnosis of neuroblastomas by interphase fluorescence in-situ hybridization (I-FISH). However, in order to facilitate a comprehensive, up-to-date molecular diagnosis of neuroblastomas and to identify new markers to refine risk and therapy stratification methods, whole genome approaches are needed. We examined the applicability of an ultra-high density SNP array platform that identifies copy number changes of varying sizes down to a few exons for the detection of genomic changes in tumor DNA extracted from TTIs. Material and Methods DNAs were extracted from TTIs of 46 neuroblastoma and 4 other pediatric tumors. The DNAs were analyzed on the Cytoscan HD SNP array platform to evaluate numerical and structural genomic aberrations. The quality of the data obtained from TTIs was compared to that from randomly chosen fresh or fresh frozen solid tumors (n = 212) and I-FISH validation was performed. Results SNP array profiles were obtained from 48 (out of 50) TTI DNAs of which 47 showed genomic aberrations. The high marker density allowed for single gene analysis, e.g. loss of nine exons in the ATRX gene and the visualization of chromothripsis. Data quality was comparable to fresh or fresh frozen tumor SNP profiles. SNP array results were confirmed by I-FISH. Conclusion TTIs are an excellent source for SNP array processing with the advantage of simple handling, distribution and storage of tumor tissue on glass slides. The minimal amount of tumor tissue needed to analyze whole genomes makes TTIs an economic surrogate source in the molecular diagnostic work up of tumor samples. PMID:27560999
Expression Profiling Smackdown: Human Transcriptome Array HTA 2.0 vs. RNA-Seq

PubMed Central

Palermo, Meghann; Driscoll, Heather; Tighe, Scott; Dragon, Julie; Bond, Jeff; Shukla, Arti; Vangala, Mahesh; Vincent, James; Hunter, Tim

2014-01-01

The advent of both microarray and massively parallel sequencing has revolutionized high-throughput analysis of the human transcriptome. Due to limitations in microarray technology, detecting and quantifying coding transcript isoforms, in addition to non-coding transcripts, has been challenging. As a result, RNA-Seq has been the preferred method for characterizing the full human transcriptome, until now. A new high-resolution array from Affymetrix, GeneChip Human Transcriptome Array 2.0 (HTA 2.0), has been designed to interrogate all transcript isoforms in the human transcriptome with >6 million probes targeting coding transcripts, exon-exon splice junctions, and non-coding transcripts. Here we compare expression results from GeneChip HTA 2.0 and RNA-Seq data using identical RNA extractions from three samples each of healthy human mesothelial cells in culture, LP9-C1, and healthy mesothelial cells treated with asbestos, LP9-A1. For GeneChip HTA 2.0 sample preparation, we chose to compare two target preparation methods, NuGEN Ovation Pico WTA V2 with the Encore Biotin Module versus Affymetrix's GeneChip WT PLUS with the WT Terminal Labeling Kit, on identical RNA extractions from both untreated and treated samples. These same RNA extractions were used for the RNA-Seq library preparation. All analyses were performed in Partek Genomics Suite 6.6. Expression profiles for control and asbestos-treated mesothelial cells prepared with NuGEN versus Affymetrix target preparation methods (GeneChip HTA 2.0) are compared to each other as well as to RNA-Seq results.
Epigenomics of Alzheimer’s Disease

PubMed Central

Bennett, David A.; Yu, Lei; Yang, Jingyun; Srivastava, Gyan P.; Aubin, Cristin; De Jager, Philip L.

2014-01-01

Alzheimer’s disease (AD) is a large and growing public health problem. It is characterized by the accumulation of amyloid-β peptides and abnormally phosphorylated tau proteins that are associated with cognitive decline and dementia. Much has been learned about the genomics of AD from linkage analyses and more recently, genome-wide association studies. Several but not all aspects of the genomic landscape are involved in amyloid-metabolism. The moderate concordance of disease among twins suggests other factors, potentially epigenomic factors, are related to AD. We are at the earliest stages of examining the relation of the epigenome to the clinical and pathologic phenotypes that characterize AD. Our literature review suggests that there is some evidence of age-related changes in human brain methylation. Unfortunately, studies of AD have been relatively small with limited coverage of methylation sites and microRNA, let alone other epigenomic marks. We are in the midst of two large studies of human brains including coverage of more than 420,000 autosomal cytosine-guanine dinucleotides (CGs) with the Illumina Infinium HumanMethylation 450K BeadArray, and histone acetylation with chromatin immunoprecipitation-sequencing. We present descriptive data to help inform other researchers what to expect from these approaches in order to better design and power their studies. We then discuss future directions to inform on the epigenomic architecture of AD. PMID:24905038
Who Are the Okinawans? Ancestry, Genome Diversity, and Implications for the Genetic Study of Human Longevity From a Geographically Isolated Population

PubMed Central

Hsueh, Wen-Chi; He, Qimei; Willcox, D. Craig; Nievergelt, Caroline M.; Donlon, Timothy A.; Kwok, Pui-Yan; Suzuki, Makoto; Willcox, Bradley J.

2014-01-01

Isolated populations have advantages for genetic studies of longevity from decreased haplotype diversity and long-range linkage disequilibrium. This permits smaller sample sizes without loss of power, among other utilities. Little is known about the genome of the Okinawans, a potential population isolate, recognized for longevity. Therefore, we assessed genetic diversity, structure, and admixture in Okinawans, and compared this with Caucasians, Chinese, Japanese, and Africans from HapMap II, genotyped on the same Affymetrix GeneChip Human Mapping 500K array. Principal component analysis, haplotype coverage, and linkage disequilibrium decay revealed a distinct Okinawan genome—more homogeneity, less haplotype diversity, and longer range linkage disequilibrium. Population structure and admixture analyses utilizing 52 global reference populations from the Human Genome Diversity Cell Line Panel demonstrated that Okinawans clustered almost exclusively with East Asians. Sibling relative risk (λs) analysis revealed that siblings of Okinawan centenarians have 3.11 times (females) and 3.77 times (males) more likelihood of centenarianism. These findings suggest that Okinawans are genetically distinct and share several characteristics of a population isolate, which are prone to develop extreme phenotypes (eg, longevity) from genetic drift, natural selection, and population bottlenecks. These data support further exploration of genetic influence on longevity in the Okinawans. PMID:24444611
HumanMethylation450K Array–Identified Biomarkers Predict Tumour Recurrence/Progression at Initial Diagnosis of High-risk Non-muscle Invasive Bladder Cancer

PubMed Central

Kitchen, Mark O; Bryan, Richard T; Emes, Richard D; Luscombe, Christopher J; Cheng, KK; Zeegers, Maurice P; James, Nicholas D; Gommersall, Lyndon M; Fryer, Anthony A

2018-01-01

Background: High-risk non-muscle invasive bladder cancer (HR-NMIBC) is a clinically unpredictable disease. Despite clinical risk estimation tools, many patients are undertreated with intra-vesical therapies alone, whereas others may be over-treated with early radical surgery. Molecular biomarkers, particularly DNA methylation, have been reported as predictive of tumour/patient outcomes in numerous solid organ and haematologic malignancies; however, there are few reports in HR-NMIBC and none using genome-wide array assessment. We therefore sought to identify novel DNA methylation markers of HR-NMIBC clinical outcomes that might predict tumour behaviour at initial diagnosis and help guide patient management. Patients and methods: A total of 21 primary initial diagnosis HR-NMIBC tumours were analysed by Illumina HumanMethylation450 BeadChip arrays and subsequently bisulphite Pyrosequencing. In all, 7 had not recurred at 1 year after resection and 14 had recurred and/or progressed despite intra-vesical BCG. A further independent cohort of 32 HR-NMIBC tumours (17 no recurrence and 15 recurrence and/or progression despite BCG) were also assessed by bisulphite Pyrosequencing. Results: Array analyses identified 206 CpG loci that segregated non-recurrent HR-NMIBC tumours from clinically more aggressive recurrence/progression tumours. Hypermethylation of CpG cg11850659 and hypomethylation of CpG cg01149192 in combination predicted HR-NMIBC recurrence and/or progression within 1 year of diagnosis with 83% sensitivity, 79% specificity, and 83% positive and 79% negative predictive values. Conclusions: This is the first genome-wide DNA methylation analysis of a unique HR-NMIBC tumour cohort encompassing known 1-year clinical outcomes. Our analyses identified potential novel epigenetic markers that could help guide individual patient management in this clinically unpredictable disease. PMID:29343995
BeadArray Expression Analysis Using Bioconductor

PubMed Central

Ritchie, Matthew E.; Dunning, Mark J.; Smith, Mike L.; Shi, Wei; Lynch, Andy G.

2011-01-01

Illumina whole-genome expression BeadArrays are a popular choice in gene profiling studies. Aside from the vendor-provided software tools for analyzing BeadArray expression data (GenomeStudio/BeadStudio), there exists a comprehensive set of open-source analysis tools in the Bioconductor project, many of which have been tailored to exploit the unique properties of this platform. In this article, we explore a number of these software packages and demonstrate how to perform a complete analysis of BeadArray data in various formats. The key steps of importing data, performing quality assessments, preprocessing, and annotation in the common setting of assessing differential expression in designed experiments will be covered. PMID:22144879
Estimation of linkage disequilibrium and interspecific gene flow in Ficedula flycatchers by a newly developed 50k single-nucleotide polymorphism array

PubMed Central

Kawakami, Takeshi; Backström, Niclas; Burri, Reto; Husby, Arild; Olason, Pall; Rice, Amber M; Ålund, Murielle; Qvarnström, Anna; Ellegren, Hans

2014-01-01

With access to draft genome sequence assemblies and whole-genome resequencing data from population samples, molecular ecology studies will be able to take truly genome-wide approaches. This now applies to an avian model system in ecological and evolutionary research: Old World flycatchers of the genus Ficedula, for which we recently obtained a 1.1 Gb collared flycatcher genome assembly and identified 13 million single-nucleotide polymorphisms (SNPs) in population resequencing of this species and its sister species, pied flycatcher. Here, we developed a custom 50K Illumina iSelect flycatcher SNP array with markers covering 30 autosomes and the Z chromosome. Using a number of selection criteria for inclusion in the array, both genotyping success rate and polymorphism information content (mean marker heterozygosity = 0.41) were high. We used the array to assess linkage disequilibrium (LD) and hybridization in flycatchers. Linkage disequilibrium declined quickly to the background level at an average distance of 17 kb, but the extent of LD varied markedly within the genome and was more than 10-fold higher in ‘genomic islands’ of differentiation than in the rest of the genome. Genetic ancestry analysis identified 33 F1 hybrids but no later-generation hybrids from sympatric populations of collared flycatchers and pied flycatchers, contradicting earlier reports of backcrosses identified from a much smaller number of markers. With an estimated divergence time as recently as <1 Ma, this suggests strong selection against F1 hybrids and unusually rapid evolution of reproductive incompatibility in an avian system. PMID:24784959

MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

PubMed Central

Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Rajakumar, Kumar

2007-01-01

MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’. ArrayOme permits recognition of discordances between physical genome and MVG sizes, thereby enabling identification of strains rich in microarray-elusive novel genes. Individual tRNAcc tools facilitate automated identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites and other integration hotspots in closely related sequenced genomes. Accessory tools facilitate design of hotspot-flanking primers for in silico and/or wet-science-based interrogation of cognate loci in unsequenced strains and analysis of islands for features suggestive of foreign origins; island-specific and genome-contextual features are tabulated and represented in schematic and graphical forms. To date we have used MobilomeFINDER to analyse several Enterobacteriaceae, Pseudomonas aeruginosa and Streptococcus suis genomes. MobilomeFINDER enables high-throughput island identification and characterization through increased exploitation of emerging sequence data and PCR-based profiling of unsequenced test strains; subsequent targeted yeast recombination-based capture permits full-length sequencing and detailed functional studies of novel genomic islands. PMID:17537813
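The 'inferred contig' idea described above, merging adjacent genes classified as 'present', can be sketched in a few lines of Python. This is not ArrayOme's implementation. The function name, the input format (gene name plus a boolean call, listed in reference-genome order) and the example genes are all hypothetical and chosen only for illustration.

```python
from itertools import groupby


def inferred_contigs(calls):
    """Merge runs of adjacent genes called 'present' into inferred contigs.

    `calls` is a list of (gene_name, is_present) tuples in genome order.
    Returns a list of contigs, each a list of gene names.
    """
    contigs = []
    # groupby yields consecutive runs that share the same present/absent call.
    for present, run in groupby(calls, key=lambda item: item[1]):
        if present:
            contigs.append([name for name, _ in run])
    return contigs


# Hypothetical present/absent calls for six genes in reference order.
example_calls = [
    ("geneA", True), ("geneB", True), ("geneC", False),
    ("geneD", True), ("geneE", False), ("geneF", True),
]
contigs = inferred_contigs(example_calls)
print(contigs)
```

Summing the lengths of the genes in each inferred contig would give a size estimate for the 'microarray-visualized genome', which could then be compared with a physically measured genome size.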
Development of a 63K SNP array for Gossypium and high-density mapping of intra- and inter-specific populations of cotton (G. hirsutum L.)

USDA-ARS's Scientific Manuscript database

High-throughput genotyping arrays provide a standardized resource for crop research communities that are useful for a breadth of applications including high-density genetic mapping, genome-wide association studies (GWAS), genomic selection (GS), candidate marker and quantitative trait loci (QTL) ide...
Diversity, genetic mapping, and signatures of domestication in the carrot (Daucus carota L.) genome, as revealed by Diversity Arrays Technology (DArT) markers

USDA-ARS's Scientific Manuscript database

Carrot is one of the most economically important vegetables worldwide, however, genetic and genomic resources supporting carrot breeding remain limited. We developed a Diversity Arrays Technology (DArT) platform for wild and cultivated carrot and used it to investigate genetic diversity and to devel...
Development and evaluation of a high density genotyping 'Axiom_Arachis' array with 58K SNPs for accelerating genetics and breeding in groundnut

USDA-ARS's Scientific Manuscript database

Single nucleotide polymorphisms (SNPs) are the most abundant DNA sequence variation in the genomes which can be used to associate genotypic variation to the phenotype. Therefore, availability of a high-density SNP array with uniform genome coverage can advance genetic studies and breeding applicatio...
Evaluation of whole genome amplified DNA to decrease material expenditure and increase quality.

PubMed

Bækvad-Hansen, Marie; Bybjerg-Grauholm, Jonas; Poulsen, Jesper B; Hansen, Christine S; Hougaard, David M; Hollegaard, Mads V

2017-06-01

The overall aim of this study is to evaluate whole genome amplification of DNA extracted from dried blood spot samples. We wish to explore ways of optimizing the amplification process, while decreasing the amount of input material and inherently the cost. Our primary focus of optimization is on the amount of input material, the amplification reaction volume, the number of replicates and amplification time and temperature. Increasing the quality of the amplified DNA and the subsequent results of array genotyping is a secondary aim of this project. This study is based on DNA extracted from dried blood spot samples. The extracted DNA was subsequently whole genome amplified using the REPLIg kit and genotyped on the PsychArray BeadChip (assessing > 570,000 SNPs genome wide). We used Genome Studio to evaluate the quality of the genotype data by call rates and log R ratios. The whole genome amplification process is robust and does not vary between replicates. Altering amplification time, temperature or number of replicates did not affect our results. We found that spot size, i.e. the amount of input material, could be reduced without compromising the quality of the array genotyping data. We also showed that whole genome amplification reaction volumes can be reduced by a factor of 4 without compromising the DNA quality. Whole genome amplified DNA samples from dried blood spots are well suited for array genotyping and produce robust and reliable genotype data. However, the amplification process introduces additional noise to the data, making detection of structural variants such as copy number variants difficult. With this study, we explore ways of optimizing the amplification protocol in order to reduce noise and increase data quality. We found that the amplification process was very robust, and that changes in amplification time or temperature did not alter the genotyping calls or quality of the array data.
Adding additional replicates of each sample also led to insignificant changes in the array data. Thus, the amount of noise introduced by the amplification process was consistent regardless of changes made to the amplification protocol. We also explored ways of decreasing material expenditure by reducing the spot size or the amplification reaction volume. The reduction did not affect the quality of the genotyping data.
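The two quality metrics named in this abstract, call rate and log R ratio, are conventionally defined as the fraction of SNPs that receive a genotype call and the log2 ratio of observed to expected signal intensity. The sketch below uses those common definitions. It is an assumption that they match what Genome Studio reports, and the function names and example values are hypothetical.

```python
import math


def call_rate(genotypes):
    """Fraction of SNPs with a non-missing genotype call (None = no call)."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def log_r_ratio(observed_intensity, expected_intensity):
    """log2 ratio of observed to expected total signal intensity."""
    return math.log2(observed_intensity / expected_intensity)


# Hypothetical sample: five SNPs, one of which failed to call.
sample = ["AA", "AB", None, "BB", "AB"]
rate = call_rate(sample)

# Hypothetical probe whose signal is half the expected value.
lrr = log_r_ratio(1.0, 2.0)
print(rate, lrr)
```

A log R ratio near zero indicates the expected copy number, so extra scatter around zero is one way the amplification noise mentioned above would show up.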
Harnessing the genome for characterization of GPCRs in cancer pathogenesis

PubMed Central

Feigin, Michael E.

2014-01-01

G-protein coupled receptors (GPCRs) mediate numerous physiological processes and represent the targets for a vast array of therapeutics for diseases ranging from depression to hypertension to reflux. Despite the recognition that GPCRs can act as oncogenes and tumor suppressors by regulating oncogenic signaling networks, few drugs targeting GPCRs are utilized in cancer therapy. Recent large-scale genome-wide analyses of multiple human tumors have uncovered novel GPCRs altered in cancer. However, the work of determining which GPCRs from these lists are drivers of tumorigenesis, and hence valid therapeutic targets, remains a formidable challenge. In this review I will highlight recent studies providing evidence that GPCRs are relevant targets for cancer therapy through their effects on known cancer signaling pathways, tumor progression, invasion and metastasis, and the microenvironment. Furthermore, I will explore how genomic analysis is beginning to shine a light on GPCRs as therapeutic targets in the age of personalized medicine. PMID:23927072
Site-Specific Editing of the Plasmodium falciparum Genome Using Engineered Zinc-Finger Nucleases

PubMed Central

Straimer, Judith; Lee, Marcus CS; Lee, Andrew H; Zeitler, Bryan; Williams, April E; Pearl, Jocelynn R; Zhang, Lei; Rebar, Edward J; Gregory, Philip D; Llinás, Manuel; Urnov, Fyodor D; Fidock, David A

2013-01-01

Malaria afflicts over 200 million people worldwide and its most lethal etiologic agent, Plasmodium falciparum, is evolving to resist even the latest-generation therapeutics. Efficient tools for genome-directed investigations of P. falciparum pathogenesis, including drug resistance mechanisms, are clearly required. Here we report rapid and targeted genetic engineering of this parasite, using zinc-finger nucleases (ZFNs) that produce a double-strand break in a user-defined locus and trigger homology-directed repair. Targeting an integrated egfp locus, we obtained gene deletion parasites with unprecedented speed (two weeks), both with and without direct selection. ZFNs engineered against the endogenous parasite gene pfcrt, responsible for chloroquine treatment escape, rapidly produced parasites that carried either an allelic replacement or a panel of specified point mutations. The efficiency, versatility and precision of this method will enable a diverse array of genome editing approaches to interrogate this human pathogen. PMID:22922501
Micro-Scale Genomic DNA Copy Number Aberrations as Another Means of Mutagenesis in Breast Cancer

PubMed Central

Chao, Hann-Hsiang; He, Xiaping; Parker, Joel S.; Zhao, Wei; Perou, Charles M.

2012-01-01

Introduction In breast cancer, the basal-like subtype has high levels of genomic instability relative to other breast cancer subtypes with many basal-like-specific regions of aberration. There is evidence that this genomic instability extends to smaller scale genomic aberrations, as shown by a previously described micro-deletion event in the PTEN gene in the Basal-like SUM149 breast cancer cell line. Methods We sought to identify if small regions of genomic DNA copy number changes exist by using a high density, gene-centric Comparative Genomic Hybridizations (CGH) array on cell lines and primary tumors. A custom tiling array for CGH (244,000 probes, 200 bp tiling resolution) was created to identify small regions of genomic change, which was focused on previously identified basal-like-specific, and general cancer genes. Tumor genomic DNA from 94 patients and 2 breast cancer cell lines was labeled and hybridized to these arrays. Aberrations were called using SWITCHdna and the smallest 25% of SWITCHdna-defined genomic segments were called micro-aberrations (<64 contiguous probes, ∼ 15 kb). Results Our data showed that primary tumor breast cancer genomes frequently contained many small-scale copy number gains and losses, termed micro-aberrations, most of which are undetectable using typical-density genome-wide aCGH arrays. The basal-like subtype exhibited the highest incidence of these events. These micro-aberrations sometimes altered expression of the involved gene. We confirmed the presence of the PTEN micro-amplification in SUM149 and by mRNA-seq showed that this resulted in loss of expression of all exons downstream of this event. Micro-aberrations disproportionately affected the 5′ regions of the affected genes, including the promoter region, and high frequency of micro-aberrations was associated with poor survival. 
Conclusion Using a high-probe-density, gene-centric aCGH microarray, we present evidence of small-scale genomic aberrations that can contribute to gene inactivation. These events may contribute to tumor formation through mechanisms not detected using conventional DNA copy number analyses. PMID:23284754
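The selection rule described in the Methods, calling the smallest 25% of segments micro-aberrations, amounts to a simple rank-based filter. The sketch below illustrates that rule only. The output format of SWITCHdna is not described here, so the segment records, field names and sizes are hypothetical.

```python
def smallest_quartile(segments):
    """Return the smallest 25% of segments, ranked by probe count."""
    ranked = sorted(segments, key=lambda seg: seg["n_probes"])
    cutoff = len(ranked) // 4  # number of segments in the lowest quartile
    return ranked[:cutoff]


# Hypothetical segments; only the probe count matters for the ranking.
segments = [
    {"id": "s1", "n_probes": 500},
    {"id": "s2", "n_probes": 12},
    {"id": "s3", "n_probes": 2300},
    {"id": "s4", "n_probes": 40},
    {"id": "s5", "n_probes": 150},
    {"id": "s6", "n_probes": 900},
    {"id": "s7", "n_probes": 75},
    {"id": "s8", "n_probes": 3100},
]
micro = smallest_quartile(segments)
print([seg["id"] for seg in micro])
```

In the study this cut corresponded to segments of fewer than 64 contiguous probes, roughly 15 kb. The toy data above will not reproduce that threshold.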
Development of a Medium Density Combined-Species SNP Array for Pacific and European Oysters (Crassostrea gigas and Ostrea edulis).

PubMed

Gutierrez, Alejandro P; Turner, Frances; Gharbi, Karim; Talbot, Richard; Lowe, Natalie R; Peñaloza, Carolina; McCullough, Mark; Prodöhl, Paulo A; Bean, Tim P; Houston, Ross D

2017-07-05

SNP arrays are enabling tools for high-resolution studies of the genetic basis of complex traits in farmed and wild animals. Oysters are of critical importance in many regions from both an ecological and economic perspective, and oyster aquaculture forms a key component of global food security. The aim of our study was to design a combined-species, medium density SNP array for Pacific oyster (Crassostrea gigas) and European flat oyster (Ostrea edulis), and to test the performance of this array on farmed and wild populations from multiple locations, with a focus on European populations. SNP discovery was carried out by whole-genome sequencing (WGS) of pooled genomic DNA samples from eight C. gigas populations, and restriction site-associated DNA sequencing (RAD-Seq) of 11 geographically diverse O. edulis populations. Nearly 12 million candidate SNPs were discovered and filtered based on several criteria, including preference for SNPs segregating in multiple populations and SNPs with monomorphic flanking regions. An Affymetrix Axiom Custom Array was created and tested on a diverse set of samples (n = 219) showing ∼27 K high quality SNPs for C. gigas and ∼11 K high quality SNPs for O. edulis segregating in these populations. A high proportion of SNPs were segregating in each of the populations, and the array was used to detect population structure and levels of linkage disequilibrium (LD). Further testing of the array on three C. gigas nuclear families (n = 165) revealed that the array can be used to clearly distinguish between both families based on identity-by-state (IBS) clustering parental assignment software. This medium density, combined-species array will be publicly available through Affymetrix, and will be applied for genome-wide association and evolutionary genetic studies, and for genomic selection in oyster breeding programs. Copyright © 2017 Gutierrez et al.
Development and validation of a high density SNP genotyping array for Atlantic salmon (Salmo salar).

PubMed

Houston, Ross D; Taggart, John B; Cézard, Timothé; Bekaert, Michaël; Lowe, Natalie R; Downing, Alison; Talbot, Richard; Bishop, Stephen C; Archibald, Alan L; Bron, James E; Penman, David J; Davassi, Alessandro; Brew, Fiona; Tinch, Alan E; Gharbi, Karim; Hamilton, Alastair

2014-02-06

Dense single nucleotide polymorphism (SNP) genotyping arrays provide extensive information on polymorphic variation across the genome of species of interest. Such information can be used in studies of the genetic architecture of quantitative traits and to improve the accuracy of selection in breeding programs. In Atlantic salmon (Salmo salar), these goals are currently hampered by the lack of a high-density SNP genotyping platform. Therefore, the aim of the study was to develop and test a dense Atlantic salmon SNP array. SNP discovery was performed using extensive deep sequencing of Reduced Representation (RR-Seq), Restriction site-Associated DNA (RAD-Seq) and mRNA (RNA-Seq) libraries derived from farmed and wild Atlantic salmon samples (n = 283) resulting in the discovery of > 400 K putative SNPs. An Affymetrix Axiom® myDesign Custom Array was created and tested on samples of animals of wild and farmed origin (n = 96) revealing a total of 132,033 polymorphic SNPs with high call rate, good cluster separation on the array and stable Mendelian inheritance in our sample. At least 38% of these SNPs are from transcribed genomic regions and therefore more likely to include functional variants. Linkage analysis utilising the lack of male recombination in salmonids allowed the mapping of 40,214 SNPs distributed across all 29 pairs of chromosomes, highlighting the extensive genome-wide coverage of the SNPs. An identity-by-state clustering analysis revealed that the array can clearly distinguish between fish of different origins, within and between farmed and wild populations. Finally, Y-chromosome-specific probes included on the array provide an accurate molecular genetic test for sex. This manuscript describes the first high-density SNP genotyping array for Atlantic salmon. 
This array will be publicly available and is likely to be used as a platform for high-resolution genetics research into traits of evolutionary and economic importance in salmonids and in aquaculture breeding programs via genomic selection.
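Identity-by-state clustering, as mentioned in this abstract, starts from a pairwise similarity between individuals. The abstract does not say which software or exact measure was used, so the sketch below assumes one common definition: the mean proportion of alleles shared across loci, with genotypes coded as allele counts. The function name and the example genotypes are hypothetical.

```python
def ibs_similarity(genotypes_a, genotypes_b):
    """Mean proportion of alleles shared identical-by-state.

    Genotypes are coded as counts of one allele (0, 1 or 2); loci where
    either individual has a missing call (None) are skipped.
    """
    shared = 0
    compared = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this locus
        compared += 1
    if compared == 0:
        raise ValueError("no loci with calls in both individuals")
    return shared / (2 * compared)


# Hypothetical genotypes for two fish at five SNPs.
fish_1 = [0, 1, 2, 2, None]
fish_2 = [0, 2, 2, 0, 1]
similarity = ibs_similarity(fish_1, fish_2)
print(similarity)
```

A matrix of such pairwise values, or one minus them as distances, is the usual input to a clustering or multidimensional scaling step that separates groups of different origin.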
Development and validation of a high density SNP genotyping array for Atlantic salmon (Salmo salar)

PubMed Central

2014-01-01

Background Dense single nucleotide polymorphism (SNP) genotyping arrays provide extensive information on polymorphic variation across the genome of species of interest. Such information can be used in studies of the genetic architecture of quantitative traits and to improve the accuracy of selection in breeding programs. In Atlantic salmon (Salmo salar), these goals are currently hampered by the lack of a high-density SNP genotyping platform. Therefore, the aim of the study was to develop and test a dense Atlantic salmon SNP array. Results SNP discovery was performed using extensive deep sequencing of Reduced Representation (RR-Seq), Restriction site-Associated DNA (RAD-Seq) and mRNA (RNA-Seq) libraries derived from farmed and wild Atlantic salmon samples (n = 283) resulting in the discovery of > 400 K putative SNPs. An Affymetrix Axiom® myDesign Custom Array was created and tested on samples of animals of wild and farmed origin (n = 96) revealing a total of 132,033 polymorphic SNPs with high call rate, good cluster separation on the array and stable Mendelian inheritance in our sample. At least 38% of these SNPs are from transcribed genomic regions and therefore more likely to include functional variants. Linkage analysis utilising the lack of male recombination in salmonids allowed the mapping of 40,214 SNPs distributed across all 29 pairs of chromosomes, highlighting the extensive genome-wide coverage of the SNPs. An identity-by-state clustering analysis revealed that the array can clearly distinguish between fish of different origins, within and between farmed and wild populations. Finally, Y-chromosome-specific probes included on the array provide an accurate molecular genetic test for sex. Conclusions This manuscript describes the first high-density SNP genotyping array for Atlantic salmon. 
This array will be publicly available and is likely to be used as a platform for high-resolution genetics research into traits of evolutionary and economic importance in salmonids and in aquaculture breeding programs via genomic selection. PMID:24524230
Integrative Genomics Viewer (IGV) | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.
Molecular karyotyping by array CGH in a Russian cohort of children with intellectual disability, autism, epilepsy and congenital anomalies

PubMed Central

2012-01-01

Background Array comparative genomic hybridization (CGH) has been repeatedly shown to be a successful tool for the identification of genomic variations in a clinical population. During the last decade, the implementation of array CGH has resulted in the identification of new causative submicroscopic chromosome imbalances and copy number variations (CNVs) in neuropsychiatric (neurobehavioral) diseases. Currently, array-CGH-based technologies have become an integral part of molecular diagnosis and research in individuals with neuropsychiatric disorders and children with intellectual disability (mental retardation) and congenital anomalies. Here, we introduce the Russian cohort of children with intellectual disability, autism, epilepsy and congenital anomalies analyzed by BAC array CGH and a novel bioinformatic strategy. Results Among 54 individuals highly selected according to clinical criteria and molecular and cytogenetic data (from 2426 patients evaluated cytogenetically and molecularly between November 2007 and May 2012), chromosomal imbalances were detected in 26 individuals (48%). In two patients (4%), a previously undescribed condition was observed. The latter has been designated as meiotic (constitutional) genomic instability resulted in multiple submicroscopic rearrangements (including CNVs). Using bioinformatic strategy, we were able to identify clinically relevant CNVs in 15 individuals (28%). Selected cases were confirmed by molecular cytogenetic and molecular genetic methods. Eight out of 26 chromosomal imbalances (31%) have not been previously reported. Among them, three cases were co-occurrence of subtle chromosome 9 and 21 deletions. Conclusions We conducted an array CGH study of Russian patients suffering from intellectual disability, autism, epilepsy and congenital anomalies. 
In total, phenotypic manifestations of clinically relevant genomic variations were found to result from genomic rearrangements affecting 1247 disease-causing and pathway-involved genes. Clearly, only a much smaller fraction of these genes are true candidates for intellectual disability, autism or epilepsy. The success of our preliminary array CGH and bioinformatic study allows us to expand the cohort. According to the available literature, this is the first comprehensive array CGH evaluation of a Russian cohort of children with neuropsychiatric disorders and congenital anomalies. PMID:23272938
High-resolution array comparative genomic hybridization analysis of human bronchial and salivary adenoid cystic carcinoma.

PubMed

Bernheim, Alain; Toujani, Saloua; Saulnier, Patrick; Robert, Thomas; Casiraghi, Odile; Validire, Pierre; Temam, Stéphane; Menard, Philippe; Dessen, Philippe; Fouret, Pierre

2008-05-01

Adenoid cystic carcinoma (ACC) is a rare but distinctive tumor. Oligonucleotide array comparative genomic hybridization has been applied for cataloging genomic copy number alterations (CNAs) in 17 frozen salivary or bronchial tumors. Only four whole chromosome CNAs were found, and most cases had 2-4 segmental CNAs. No high level amplification was observed. There were recurrent gains at 7p15.2, 17q21-25, and 22q11-13, and recurrent losses at 1p35, 6q22-25, 8q12-13, 9p21, 12q12-13, and 17p11-13. The minimal region of gain at 7p15.2 contained the HOXA cluster. The minimal common regions of deletions contained the CDKN2A/CDKN2B, TP53, and LIMA1 tumor suppressor genes. The recurrent deletion at 8q12.3-13.1 contained no straightforward tumor suppressor gene, but the MIRN124A2 microRNA gene, whose product regulates MMP2 and CDK6. Among unique CNAs, gains harbored CCND1, KIT/PDGFRA/KDR, MDM2, and JAK2. The CNAs involving CCND1, MDM2, KIT, CDKN2A/2B, and TP53 were validated by FISH and/or multiplex ligation-dependent probe amplification. Although most tumors overexpressed cyclin D1 compared with surrounding glands, the only case to overexpress MDM2 had the corresponding CNA. In conclusion, our report suggests that ACC is characterized by a relatively low level of structural complexity. Array CGH and immunohistochemical data implicate MDM2 as the oncogene targeted at 12q15. The gain at 4q12 warrants further exploration as it contains a cluster of receptor kinase genes (KIT/PDGFRA/KDR), whose products can be responsive to specific therapies.
Progressive but Previously Untreated CLL Patients with Greater Array CGH Complexity Exhibit a Less Durable Response to Chemoimmunotherapy

PubMed Central

Kay, Neil E.; Eckel-Passow, Jeanette E.; Braggio, Esteban; VanWier, Scott; Shanafelt, Tait D.; Van Dyke, Daniel L.; Jelinek, Diane F.; Tschumper, Renee C.; Kipps, Thomas; Byrd, John C.; Fonseca, Rafael

2010-01-01

To better understand the implications of genomic instability and outcome in B-cell CLL, we sought to address genomic complexity as a predictor of chemosensitivity and ultimately clinical outcome in this disease. We employed array-based comparative genomic hybridization (aCGH), using a one-million probe array and identified gains and losses of genetic material in 48 patients treated on a chemoimmunotherapy (CIT) clinical trial. We identified chromosomal gain or loss in ≥6% of the patients on chromosomes 3, 8, 9, 10, 11, 12, 13, 14 and 17. Higher genomic complexity, as a mechanism favoring clonal selection, was associated with shorter progression-free survival and predicted a poor response to treatment. Of interest, CLL cases with loss of p53 surveillance showed more complex genomic features and were found both in patients with a 17p13.1 deletion and in the more favorable genetic subtype characterized by the presence of 13q14.1 deletion. This aCGH study adds information on the association between inferior trial response and increasing genetic complexity as CLL progresses. PMID:21156228
The Minnesota Center for Twin and Family Research Genome-Wide Association Study

PubMed Central

Miller, Michael B.; Basu, Saonli; Cunningham, Julie; Eskin, Eleazar; Malone, Steven M.; Oetting, William S.; Schork, Nicholas; Sul, Jae Hoon; Iacono, William G.; Mcgue, Matt

2012-01-01

As part of the Genes, Environment and Development Initiative (GEDI), the Minnesota Center for Twin and Family Research (MCTFR) undertook a genome-wide association study (GWAS), which we describe here. A total of 8405 research participants, clustered in 4-member families, have been successfully genotyped on 527,829 single nucleotide polymorphism (SNP) markers using Illumina’s Human660W-Quad array. Quality control screening of samples and markers as well as SNP imputation procedures are described. We also describe methods for ancestry control and how the familial clustering of the MCTFR sample can be accounted for in the analysis using a Rapid Feasible Generalized Least Squares algorithm. The rich longitudinal MCTFR assessments provide numerous opportunities for collaboration. PMID:23363460
An experimental loop design for the detection of constitutional chromosomal aberrations by array CGH

PubMed Central

2009-01-01

Background The use of comparative genomic hybridization microarrays for the detection of constitutional chromosomal aberrations is the application of microarray technology coming fastest into routine clinical use. Through genotype-phenotype association, it is also an important technique towards the discovery of disease causing genes and genomewide functional annotation in humans. When using a two-channel microarray of genomic DNA probes for array CGH, the basic setup consists of hybridizing a patient sample against a normal reference sample. Two major disadvantages of this setup are (1) the use of half of the resources to measure a (minimally informative) reference sample and (2) the possibility that deviating signals are caused by benign copy number variation in the "normal" reference instead of a patient aberration. Instead, we apply an experimental loop design that compares three patients in three hybridizations. Results We develop and compare two statistical methods (linear models of log ratios and mixed models of absolute measurements). In an analysis of 27 patients seen at our genetics center, we observed that the linear models of the log ratios are advantageous over the mixed models of the absolute intensities. Conclusion The loop design and the performance of the statistical analysis contribute to the quick adoption of array CGH as a routine diagnostic tool. They lower the detection limit of mosaicisms and improve the assignment of copy number variation for genetic association studies. PMID:19925645
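The loop design described above can be illustrated with a small sketch of a linear model on log ratios. This is not the authors' implementation: the dye orientation (A vs B, B vs C, C vs A), the single-probe setup, and the choice of the minimum-norm least-squares solution are all assumptions made here for illustration. Because the loop design matrix has rank 2, the estimated patient effects are only defined relative to the mean of the three patients.

```python
import numpy as np

# Rows: hybridizations A vs B, B vs C, C vs A (assumed orientation).
# Columns: patients A, B, C. Each row encodes log2(first) - log2(second).
DESIGN = np.array([[1.0, -1.0, 0.0],
                   [0.0, 1.0, -1.0],
                   [-1.0, 0.0, 1.0]])

def patient_effects(log_ratios):
    """Estimate per-patient effects for one probe from three loop log2 ratios.

    The design matrix has rank 2, so np.linalg.lstsq returns the
    minimum-norm solution, whose entries sum to zero: each effect is the
    patient's deviation from the mean of the three patients.
    """
    y = np.asarray(log_ratios, dtype=float)
    effects, *_ = np.linalg.lstsq(DESIGN, y, rcond=None)
    return effects

# Hypothetical probe: patient A carries a single-copy gain (3 copies vs 2),
# patients B and C are normal.
gain = np.log2(3 / 2)
ratios = DESIGN @ np.array([gain, 0.0, 0.0])
effects = patient_effects(ratios)
print(effects)
```

In this toy case patient A stands out against the other two even though no normal reference sample was hybridized, which is the point of the loop design.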
An experimental loop design for the detection of constitutional chromosomal aberrations by array CGH.

PubMed

Allemeersch, Joke; Van Vooren, Steven; Hannes, Femke; De Moor, Bart; Vermeesch, Joris Robert; Moreau, Yves

2009-11-19

The use of comparative genomic hybridization microarrays for the detection of constitutional chromosomal aberrations is the application of microarray technology coming fastest into routine clinical use. Through genotype-phenotype association, it is also an important technique towards the discovery of disease causing genes and genomewide functional annotation in humans. When using a two-channel microarray of genomic DNA probes for array CGH, the basic setup consists of hybridizing a patient sample against a normal reference sample. Two major disadvantages of this setup are (1) the use of half of the resources to measure a (minimally informative) reference sample and (2) the possibility that deviating signals are caused by benign copy number variation in the "normal" reference instead of a patient aberration. Instead, we apply an experimental loop design that compares three patients in three hybridizations. We develop and compare two statistical methods (linear models of log ratios and mixed models of absolute measurements). In an analysis of 27 patients seen at our genetics center, we observed that the linear models of the log ratios are advantageous over the mixed models of the absolute intensities. The loop design and the performance of the statistical analysis contribute to the quick adoption of array CGH as a routine diagnostic tool. They lower the detection limit of mosaicisms and improve the assignment of copy number variation for genetic association studies.
Transcript Profiling of Common Bean (Phaseolus vulgaris L.) Using the GeneChip(R) Soybean Genome Array: Optimizing Analysis by Masking Biased Probes

USDA-ARS's Scientific Manuscript database

Common bean (Phaseolus vulgaris) and soybean (Glycine max) both belong to the Phaseoleae tribe and share significant coding sequence homology. This suggests that the GeneChip(R) Soybean Genome Array (soybean GeneChip) may be used for gene expression studies using common bean. To evaluate the utility...
Copy number variants analysis in a cohort of isolated and syndromic developmental delay/intellectual disability reveals novel genomic disorders, position effects and candidate disease genes.

PubMed

Di Gregorio, E; Riberi, E; Belligni, E F; Biamino, E; Spielmann, M; Ala, U; Calcia, A; Bagnasco, I; Carli, D; Gai, G; Giordano, M; Guala, A; Keller, R; Mandrile, G; Arduino, C; Maffè, A; Naretto, V G; Sirchia, F; Sorasio, L; Ungari, S; Zonta, A; Zacchetti, G; Talarico, F; Pappi, P; Cavalieri, S; Giorgio, E; Mancini, C; Ferrero, M; Brussino, A; Savin, E; Gandione, M; Pelle, A; Giachino, D F; De Marchi, M; Restagno, G; Provero, P; Cirillo Silengo, M; Grosso, E; Buxbaum, J D; Pasini, B; De Rubeis, S; Brusco, A; Ferrero, G B

2017-10-01

Array-comparative genomic hybridization (array-CGH) is a widely used technique to detect copy number variants (CNVs) associated with developmental delay/intellectual disability (DD/ID). The aim of this study was to identify genomic disorders in DD/ID. We performed a comprehensive array-CGH investigation of 1,015 consecutive cases with DD/ID and combined literature mining, genetic evidence, evolutionary constraint scores, and functional information in order to assess the pathogenicity of the CNVs. We identified non-benign CNVs in 29% of patients. Amongst the pathogenic variants (11%), detected with a yield consistent with the literature, we found rare genomic disorders and CNVs spanning known disease genes. We further identified and discussed 51 cases with likely pathogenic CNVs spanning novel candidate genes, including genes encoding synaptic components and/or proteins involved in corticogenesis. Additionally, we identified two deletions spanning potential Topologically Associating Domain (TAD) boundaries probably affecting the regulatory landscape. We show how phenotypic and genetic analyses of array-CGH data allow unraveling complex cases, identifying rare disease genes, and revealing unexpected position effects. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

Whole-genome sequence-based genomic prediction in laying chickens with different genomic relationship matrices to account for genetic architecture.

PubMed

Ni, Guiyan; Cavero, David; Fangmann, Anna; Erbe, Malena; Simianer, Henner

2017-01-16

With the availability of next-generation sequencing technologies, genomic prediction based on whole-genome sequencing (WGS) data is now feasible in animal breeding schemes and was expected to lead to higher predictive ability, since such data may contain all genomic variants including causal mutations. Our objective was to compare prediction ability with high-density (HD) array data and WGS data in a commercial brown layer line with genomic best linear unbiased prediction (GBLUP) models using various approaches to weight single nucleotide polymorphisms (SNPs). A total of 892 chickens from a commercial brown layer line were genotyped with 336 K segregating SNPs (array data) that included 157 K genic SNPs (i.e. SNPs in or around a gene). For these individuals, genome-wide sequence information was imputed based on data from re-sequencing runs of 25 individuals, leading to 5.2 million (M) imputed SNPs (WGS data), including 2.6 M genic SNPs. De-regressed proofs (DRP) for eggshell strength, feed intake and laying rate were used as quasi-phenotypic data in genomic prediction analyses. Four weighting factors for building a trait-specific genomic relationship matrix were investigated: identical weights, -log10(P) from genome-wide association study results, squares of SNP effects from random regression BLUP, and variable selection based weights (known as BLUP|GA). Predictive ability was measured as the correlation between DRP and direct genomic breeding values in five replications of a fivefold cross-validation. Averaged over the three traits, the highest predictive ability (0.366 ± 0.075) was obtained when only genic SNPs from WGS data were used. Predictive abilities with genic SNPs and all SNPs from HD array data were 0.361 ± 0.072 and 0.353 ± 0.074, respectively. 
Prediction with -log10(P) or squares of SNP effects as weighting factors for building a genomic relationship matrix or BLUP|GA did not increase accuracy, compared to that with identical weights, regardless of the SNP set used. Our results show that little or no benefit was gained when using all imputed WGS data to perform genomic prediction compared to using HD array data regardless of the weighting factors tested. However, using only genic SNPs from WGS data had a positive effect on prediction ability.
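As a rough illustration of what a SNP-weighted genomic relationship matrix and the "predictive ability" correlation look like in practice, the sketch below uses one common VanRaden-style weighted form. The exact formula, the function names `weighted_grm` and `predictive_ability`, and the random toy genotypes are assumptions for illustration; the abstract does not give the authors' formula, and this is not their code.

```python
import numpy as np

def weighted_grm(genotypes, weights=None):
    """Build a SNP-weighted genomic relationship matrix.

    genotypes: (n_individuals, n_snps) array of allele counts coded 0/1/2.
    weights:   per-SNP weights (for example -log10(P) values); identical
               weights are used when None.
    Assumed form: G = Z D Z' / sum(2 p (1 - p) d), with Z the genotypes
    centred by twice the allele frequency p.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0           # allele frequency per SNP
    z = m - 2.0 * p                    # centred genotypes
    if weights is None:
        weights = np.ones(m.shape[1])
    d = np.asarray(weights, dtype=float)
    scale = np.sum(2.0 * p * (1.0 - p) * d)
    return (z * d) @ z.T / scale

def predictive_ability(drp, dgv):
    """Pearson correlation between de-regressed proofs and predicted values."""
    return float(np.corrcoef(drp, dgv)[0, 1])

# Toy data: 6 individuals, 50 SNPs (random, purely illustrative).
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(6, 50))
g_identical = weighted_grm(toy_genotypes)
g_weighted = weighted_grm(toy_genotypes, weights=rng.uniform(0.1, 5.0, size=50))
```

In a cross-validation such as the one described, `predictive_ability` would be computed on the held-out fold in each replicate and then averaged.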
ChIP-Chip Identifies SEC23A, CFDP1, and NSD1 as TFII-I Target Genes in Human Neural Crest Progenitor Cells.

PubMed

Makeyev, Aleksandr V; Bayarsaihan, Dashzeveg

2013-05-01

Objectives: GTF2I and GTF2IRD1 genes located in the Williams-Beuren syndrome (WBS) critical region encode TFII-I family transcription factors. The aim of this study was to map genomic sites bound by these proteins across promoter regions of developmental regulators associated with craniofacial development. Design: Chromatin was isolated from human neural crest progenitor cells and the DNA-binding profile was generated using the human RefSeq tiling promoter ChIP-chip arrays. Results: TFII-I transcription factors are recruited to the promoters of SEC23A, CFDP1, and NSD1 previously defined as TFII-I target genes. Moreover, our analysis revealed additional binding elements that contain E-boxes and initiator-like motifs. Conclusions: Genome-wide promoter binding studies revealed SEC23A, CFDP1, and NSD1 linked to craniofacial or dental development as direct TFII-I targets. Developmental regulation of these genes by TFII-I factors could contribute to the WBS-specific facial dysmorphism.
Single step generation of protein arrays from DNA by cell-free expression and in situ immobilisation (PISA method).

PubMed

He, M; Taussig, M J

2001-08-01

We describe a format for production of protein arrays termed 'protein in situ array' (PISA). A PISA is rapidly generated in one step directly from PCR-generated DNA fragments by cell-free protein expression and in situ immobilisation at a surface. The template for expression is DNA encoding individual proteins or domains, which is produced by PCR using primers designed from information in DNA databases. Coupled transcription and translation is carried out on a surface to which the tagged protein adheres as soon as it is synthesised. Because proteins generated by cell-free synthesis are usually soluble and functional, this method can overcome problems of insolubility or degradation associated with bacterial expression of recombinant proteins. Moreover, the use of PCR-generated DNA enables rapid production of proteins or domains based on genome information alone and will be particularly useful where cloned material is not available. Here we show that human single-chain antibody fragments (three domain, V(H)/K form) and an enzyme (luciferase) can be functionally arrayed by the PISA method.
FISH Oracle: a web server for flexible visualization of DNA copy number data in a genomic context.

PubMed

Mader, Malte; Simon, Ronald; Steinbiss, Sascha; Kurtz, Stefan

2011-07-28

The rapidly growing amount of array CGH data requires improved visualization software supporting the process of identifying candidate cancer genes. Optimally, such software should work across multiple microarray platforms, should be able to cope with data from different sources and should be easy to operate. We have developed web-based software, FISH Oracle, to visualize data from multiple array CGH experiments in a genomic context. Its fast visualization engine and advanced web and database technology support highly interactive use. FISH Oracle comes with a convenient data import mechanism, powerful search options for genomic elements (e.g. gene names or karyobands), quick navigation and zooming into interesting regions, and mechanisms to export the visualization into different high quality formats. These features make the software especially suitable for the needs of life scientists. FISH Oracle offers a fast and easy to use visualization tool for array CGH and SNP array data. It allows for the identification of genomic regions representing minimal common changes based on data from one or more experiments. FISH Oracle will be instrumental in identifying candidate oncogenes and tumor suppressor genes based on the frequency and genomic position of DNA copy number changes. The FISH Oracle application and an installed demo web server are available at http://www.zbh.uni-hamburg.de/fishoracle.
FISH Oracle: a web server for flexible visualization of DNA copy number data in a genomic context

PubMed Central

2011-01-01

Background The rapidly growing amount of array CGH data requires improved visualization software supporting the process of identifying candidate cancer genes. Optimally, such software should work across multiple microarray platforms, should be able to cope with data from different sources and should be easy to operate. Results We have developed web-based software, FISH Oracle, to visualize data from multiple array CGH experiments in a genomic context. Its fast visualization engine and advanced web and database technology support highly interactive use. FISH Oracle comes with a convenient data import mechanism, powerful search options for genomic elements (e.g. gene names or karyobands), quick navigation and zooming into interesting regions, and mechanisms to export the visualization into different high quality formats. These features make the software especially suitable for the needs of life scientists. Conclusions FISH Oracle offers a fast and easy to use visualization tool for array CGH and SNP array data. It allows for the identification of genomic regions representing minimal common changes based on data from one or more experiments. FISH Oracle will be instrumental in identifying candidate oncogenes and tumor suppressor genes based on the frequency and genomic position of DNA copy number changes. The FISH Oracle application and an installed demo web server are available at http://www.zbh.uni-hamburg.de/fishoracle. PMID:21884636
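The idea of a "minimal common change" across experiments can be sketched as an interval intersection. The snippet below is only an illustration under simplifying assumptions (one aberrant segment per sample, all on the same chromosome, hypothetical coordinates); it is not FISH Oracle's algorithm, which the abstract does not describe.

```python
def minimal_common_region(segments):
    """Return the (start, end) shared by all segments, or None if there is none.

    segments: iterable of (start, end) base-pair coordinates, one aberrant
    segment per sample, all on the same chromosome.
    """
    segments = list(segments)
    if not segments:
        return None
    start = max(s for s, _ in segments)   # latest start among samples
    end = min(e for _, e in segments)     # earliest end among samples
    return (start, end) if start < end else None

# Hypothetical gain segments from three tumors on one chromosome.
gains = [(100, 500), (200, 600), (150, 450)]
common = minimal_common_region(gains)
disjoint = minimal_common_region([(0, 10), (20, 30)])
```

Genes falling inside the shared interval would then be the natural candidates to inspect first.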
Abundant and Diverse Clustered Regularly Interspaced Short Palindromic Repeat Spacers in Clostridium difficile Strains and Prophages Target Multiple Phage Types within This Pathogen

PubMed Central

Hargreaves, Katherine R.; Flores, Cesar O.; Lawley, Trevor D.

2014-01-01

ABSTRACT Clostridium difficile is an important human-pathogenic bacterium causing antibiotic-associated nosocomial infections worldwide. Mobile genetic elements and bacteriophages have helped shape C. difficile genome evolution. In many bacteria, phage infection may be controlled by a form of bacterial immunity called the clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) system. This uses acquired short nucleotide sequences (spacers) to target homologous sequences (protospacers) in phage genomes. C. difficile carries multiple CRISPR arrays, and in this paper we examine the relationships between the host- and phage-carried elements of the system. We detected multiple matches between spacers and regions in 31 C. difficile phage and prophage genomes. A subset of the spacers was located in prophage-carried CRISPR arrays. The CRISPR spacer profiles generated suggest that related phages would have similar host ranges. Furthermore, we show that C. difficile strains of the same ribotype could either have similar or divergent CRISPR contents. Both synonymous and nonsynonymous mutations in the protospacer sequences were identified, as well as differences in the protospacer adjacent motif (PAM), which could explain how phages escape this system. This paper illustrates how the distribution and diversity of CRISPR spacers in C. difficile, and its prophages, could modulate phage predation for this pathogen and impact upon its evolution and pathogenicity. PMID:25161187
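The spacer-to-protospacer matching described above can be sketched as a search of a phage sequence on both strands. The sequences and the function names below are hypothetical, and the sketch only finds exact matches; real analyses typically use alignment tools that tolerate mismatches and also inspect the adjacent PAM, which is not attempted here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_protospacers(spacer, genome):
    """Return (position, strand) for every exact spacer match in genome.

    Positions are 0-based offsets on the forward strand; '-' hits are
    matches to the reverse complement of the spacer.
    """
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)
    return hits

# Hypothetical spacer and phage fragment, for illustration only.
spacer = "GATTACA"
phage = "CCGATTACAGGTGTAATCTT"
hits = find_protospacers(spacer, phage)
print(hits)
```

Tabulating such hits per spacer and per phage genome gives the kind of spacer profile from which host ranges might be inferred.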
Comparative cytogenetic characterization of primary canine melanocytic lesions using array CGH and fluorescence in situ hybridization

PubMed Central

Poorman, Kelsey; Borst, Luke; Moroff, Scott; Roy, Siddharth; Labelle, Philippe; Motsinger-Reif, Alison

2017-01-01

Melanocytic lesions originating from the oral mucosa or cutaneous epithelium are common in the general dog population, with up to 100,000 diagnoses each year in the USA. Oral melanoma is the most frequent canine neoplasm of the oral cavity, exhibiting a highly aggressive course. Cutaneous melanocytomas occur frequently, but rarely develop into a malignant form. Despite the differential prognosis, it has been assumed that subtypes of melanocytic lesions represent the same disease. To address the relative paucity of information about their genomic status, molecular cytogenetic analysis was performed on the three recognized subtypes of canine melanocytic lesions. Using array comparative genomic hybridization (aCGH) analysis, highly aberrant distinct copy number status across the tumor genome for both of the malignant melanoma subtypes was revealed. The most frequent aberrations included gain of dog chromosome (CFA) 13 and 17 and loss of CFA 22. Melanocytomas possessed fewer genome wide aberrations, yet showed a recurrent gain of CFA 20q15.3–17. A distinctive copy number profile, evident only in oral melanomas, displayed a sigmoidal pattern of copy number loss followed immediately by a gain, around CFA 30q14. Moreover, when assessed by fluorescence in situ hybridization (FISH), copy number aberrations of targeted genes, such as gain of c-MYC (80 % of cases) and loss of CDKN2A (68 % of cases), were observed. This study suggests that in concordance with what is known for human melanomas, canine melanomas of the oral mucosa and cutaneous epithelium are discrete and initiated by different molecular pathways. PMID:25511566
Molecular Methods for the Detection of Mycoplasma and Ureaplasma Infections in Humans

PubMed Central

Waites, Ken B.; Xiao, Li; Paralanov, Vanya; Viscardi, Rose M.; Glass, John I.

2012-01-01

Mycoplasma and Ureaplasma species are well-known human pathogens responsible for a broad array of inflammatory conditions involving the respiratory and urogenital tracts of neonates, children, and adults. Greater attention is being given to these organisms in diagnostic microbiology, largely as a result of improved methods for their laboratory detection, made possible by powerful molecular-based techniques that can be used for primary detection in clinical specimens. For slow-growing species, such as Mycoplasma pneumoniae and Mycoplasma genitalium, molecular-based detection is the only practical means for rapid microbiological diagnosis. Most molecular-based methods used for detection and characterization of conventional bacteria have been applied to these organisms. A complete genome sequence is available for one or more strains of all of the important human pathogens in the Mycoplasma and Ureaplasma genera. Information gained from genome analyses and improvements in efficiency of DNA sequencing are expected to significantly advance the field of molecular detection and genotyping during the next few years. This review provides a summary and critical review of methods suitable for detection and characterization of mycoplasmas and ureaplasmas of humans, with emphasis on molecular genotypic techniques. PMID:22819362
Phylogeographic separation and formation of sexually discrete lineages in a global population of Yersinia pseudotuberculosis

PubMed Central

Seecharran, Tristan; Kalin-Manttari, Laura; Koskela, Katja; Nikkari, Simo; Dickins, Benjamin; Corander, Jukka; Skurnik, Mikael

2017-01-01

Yersinia pseudotuberculosis is a Gram-negative intestinal pathogen of humans and has been responsible for several nationwide gastrointestinal outbreaks. Large-scale population genomic studies have been performed on the other human pathogenic species of the genus Yersinia, Yersinia pestis and Yersinia enterocolitica allowing a high-resolution understanding of the ecology, evolution and dissemination of these pathogens. However, to date no purpose-designed large-scale global population genomic analysis of Y. pseudotuberculosis has been performed. Here we present analyses of the genomes of 134 strains of Y. pseudotuberculosis isolated from around the world, from multiple ecosystems since the 1960s. Our data display a phylogeographic split within the population, with an Asian ancestry and subsequent dispersal of successful clonal lineages into Europe and the rest of the world. These lineages can be differentiated by CRISPR cluster arrays, and we show that the lineages are limited with respect to inter-lineage genetic exchange. This restriction of genetic exchange maintains the discrete lineage structure in the population despite co-existence of lineages for thousands of years in multiple countries. Our data highlights how CRISPR can be informative of the evolutionary trajectory of bacterial lineages, and merits further study across bacteria. PMID:29177091
The universality of nucleosome organization: from yeast to human

NASA Astrophysics Data System (ADS)

Chereji, Razvan

The basic units of DNA packaging are called nucleosomes. Their locations on the chromosomes play an essential role in gene regulation. We study nucleosome positioning in yeast, fly, mouse, and human, and build biophysical models in order to explain the genome-wide nucleosome organization. We show that DNA sequence alone is not able to generate the phased arrays of nucleosomes observed in vivo near the transcription start sites. We discuss simple models which can account for the formation of nucleosome depleted regions and nucleosome phasing at the gene promoters. We show that the same principles apply to different organisms. References: [1] RV Chereji, D Tolkunov, G Locke, AV Morozov - Phys. Rev. E 83, 050903 (2011) [2] RV Chereji, AV Morozov - J. Stat. Phys. 144, 379 (2011) [3] RV Chereji, AV Morozov - Proc. Natl. Acad. Sci. U.S.A. 111, 5236 (2014) [4] RV Chereji, T-W Kan, et al. - Nucleic Acids Res. (2015) doi: 10.1093/nar/gkv978 [5] RV Chereji, AV Morozov - Brief. Funct. Genomics 14, 50 (2015) [6] HA Cole, J Ocampo, JR Iben, RV Chereji, DJ Clark - Nucleic Acids Res. 42, 12512 (2014) [7] D Ganguli, RV Chereji, J Iben, HA Cole, DJ Clark - Genome Res. 24, 1637 (2014)
Development of a dense SNP-based linkage map of an apple rootstock progeny using the Malus Infinium whole genome genotyping array

PubMed Central

2012-01-01

Background A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Results Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the ‘Golden Delicious’ genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the ‘Golden Delicious’ pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence. Conclusions We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. 
The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been assigned erroneous positions on the ‘Golden Delicious’ reference sequence will assist in the continued improvement of the genome sequence assembly for that variety. PMID:22631220
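The percentages and marker density quoted in this abstract can be checked with a few lines of arithmetic. The numbers below are taken from the text; counting the S-locus as one additional mapped marker when computing density is an assumption made here.

```python
# Figures quoted in the abstract.
malus_snps_on_array = 7867
het_one_parent = 1823
het_both_parents = 1007
map_length_cm = 1282.2
snp_markers = 2272
ssr_markers = 306
s_locus = 1                      # assumed to count as one marker
conflicting_positions = 311

def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100.0 * part / whole, 1)

pct_one_parent = percent(het_one_parent, malus_snps_on_array)
pct_both_parents = percent(het_both_parents, malus_snps_on_array)
pct_conflicting = percent(conflicting_positions, snp_markers)
total_markers = snp_markers + ssr_markers + s_locus
density_cm_per_marker = round(map_length_cm / total_markers, 1)

print(pct_one_parent, pct_both_parents, pct_conflicting, density_cm_per_marker)
```

Under that assumption the computed values agree with the 23.2%, 12.8%, 13.7% and 0.5 cM/marker reported above.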
Genomic profiling of human penile carcinoma predicts worse prognosis and survival.

PubMed

Busso-Lopes, Ariane F; Marchi, Fábio A; Kuasne, Hellen; Scapulatempo-Neto, Cristovam; Trindade-Filho, José Carlos S; de Jesus, Carlos Márcio N; Lopes, Ademar; Guimarães, Gustavo C; Rogatto, Silvia R

2015-02-01

The molecular mechanisms underlying penile carcinoma are still poorly understood, and the detection of genetic markers would be of great benefit for these patients. In this study, we assessed the genomic profile with the aim of identifying potential prognostic biomarkers in penile carcinoma. In total, 46 penile carcinoma samples were evaluated for DNA copy-number alterations via array comparative genomic hybridization (aCGH) combined with human papillomavirus (HPV) genotyping. Specific genes were investigated by using qPCR, FISH, and RT-qPCR. Genomic alterations mapped at 3p and 8p were related to worse prognostic features, including advanced T and clinical stage, recurrence and death from the disease. Losses of 3p21.1-p14.3 and gains of 3q25.31-q29 were associated with reduced cancer-specific and disease-free survival. Genomic alterations detected for chromosomes 3 (LAMP3, PPARG, TNFSF10 genes) and 8 (DLC1) were evaluated by qPCR. DLC1 and PPARG losses were associated with poor prognosis characteristics. Losses of DLC1 were an independent risk factor for recurrence on multivariate analysis. The gene-expression analysis showed reduced expression of DLC1 and PPARG and overexpression of LAMP3 and TNFSF10 genes. Chromosome Y losses and MYC gene (8q24) gains were confirmed by FISH. HPV infection was detected in 34.8% of the samples, and 19 differential genomic regions were obtained related to viral status. For the first time, we describe recurrent copy-number alterations and their potential prognostic value in penile carcinomas. We also showed a specific genomic profile according to HPV infection, supporting the hypothesis that penile tumors present distinct etiologies according to virus status. ©2014 American Association for Cancer Research.
Fine mapping of copy number variations on two cattle genome assemblies using high density SNP array

USDA-ARS's Scientific Manuscript database

Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases...
Multigenome analysis identifies a worldwide distributed epidemic Legionella pneumophila clone that emerged within a highly diverse species

PubMed Central

Cazalet, Christel; Jarraud, Sophie; Ghavi-Helm, Yad; Kunst, Frank; Glaser, Philippe; Etienne; Buchrieser, Carmen

2008-01-01

Genomics can provide the basis for understanding the evolution of emerging, lethal human pathogens such as Legionella pneumophila, the causative agent of Legionnaires’ disease. This bacterium replicates within amoebae and persists in the environment as a free-living microbe. Among the many Legionella species described, L. pneumophila is associated with 90% of human disease and within the 15 serogroups (Sg), L. pneumophila Sg1 causes over 84% of Legionnaires’ disease worldwide. Why L. pneumophila Sg1 is so predominant is unknown. Here, we report the first comprehensive screen of the gene content of 217 L. pneumophila and 32 non-L. pneumophila strains isolated from humans and the environment using a Legionella DNA-array. Strikingly, we uncovered a high conservation of virulence- and eukaryotic-like genes, indicating strong environmental selection pressures for their preservation. No specific hybridization profile differentiated clinical and environmental strains or strains of different serogroups. Surprisingly, the gene cluster coding the determinants of the core and the O side-chain synthesis of the lipopolysaccharide (LPS cluster) determining Sg1 was present in diverse genomic backgrounds, strongly implicating the LPS of Sg1 itself as a principal cause of the high prevalence of Sg1 strains in human disease and suggesting that the LPS cluster can be transferred horizontally. Genomic analysis also revealed that L. pneumophila is a genetically diverse species, in part due to horizontal gene transfer of mobile genetic elements among L. pneumophila strains, but also between different Legionella species. However, the genomic background also plays a role in disease causation as demonstrated by the identification of a globally distributed epidemic strain exhibiting the genotype of the sequenced L. pneumophila strain Paris. PMID:18256241
A parallel SNP array study of genomic aberrations associated with mental retardation in patients and general population in Estonia.

PubMed

Männik, Katrin; Parkel, Sven; Palta, Priit; Zilina, Olga; Puusepp, Helen; Esko, Tõnu; Mägi, Reedik; Nõukas, Margit; Veidenberg, Andres; Nelis, Mari; Metspalu, Andres; Remm, Maido; Ounap, Katrin; Kurg, Ants

2011-01-01

The increasing use of whole-genome array screening has revealed the important role of DNA copy-number variations in the pathogenesis of neurodevelopmental disorders and several recurrent genomic disorders have been defined during recent years. However, some variants considered to be pathogenic have also been observed in phenotypically normal individuals. This underlines the importance of further characterization of genomic variants with potentially variable expressivity in both patient and general population cohorts to clarify their phenotypic consequence. In this study whole-genome SNP arrays were used to investigate genomic rearrangements in 77 Estonian families with idiopathic mental retardation. In addition to this family-based approach, phenotype and genotype data from a cohort of 1000 individuals in the general population were used for accurate interpretation of aberrations found in mental retardation patients. Relevant structural aberrations were detected in 18 of the families analyzed (23%). Fifteen of those were in genomic regions where clinical significance has previously been established. In 3 families, 4 novel aberrations associated with intellectual disability were detected in chromosome regions 2p25.1-p24.3, 3p12.1-p11.2, 7p21.2-p21.1 and Xq28. Carriers of imbalances in 15q13.3, 16p11.2 and Xp22.31 were identified among reference individuals, affirming the variable phenotypic consequence of rare variants in some genomic regions considered as pathogenic. Copyright © 2010 Elsevier Masson SAS. All rights reserved.
Genome-wide transcriptional profiling of human glioblastoma cells in response to ITE treatment

PubMed Central

Kang, Bo; Zhou, Yanwen; Zheng, Min; Wang, Ying-Jie

2015-01-01

The ligand-activated transcription factor aryl hydrocarbon receptor (AhR) was recently shown to play a key role in embryogenesis and tumorigenesis (Feng et al. [1], Safe et al. [2]), and 2-(1′H-indole-3′-carbonyl)-thiazole-4-carboxylic acid methyl ester (ITE) (Song et al. [3]) is an endogenous AhR ligand that possesses anti-tumor activity. In order to gain insights into how ITE acts via the AhR in embryogenesis and tumorigenesis, we analyzed the genome-wide transcriptional profiles of the following three groups of cells: the human glioblastoma U87 parental cells, U87 tumor sphere cells treated with vehicle (DMSO) and U87 tumor sphere cells treated with ITE. Here, we provide the details of the sample gathering strategy and show the quality controls and the analyses associated with our gene array data deposited into the Gene Expression Omnibus (GEO) under the accession code GSE67986. PMID:26484269
Genome-wide analysis of alternative splicing during dendritic cell response to a bacterial challenge.

PubMed

Rodrigues, Raquel; Grosso, Ana Rita; Moita, Luís

2013-01-01

The immune system relies on the plasticity of its components to produce appropriate responses to frequent environmental challenges. Dendritic cells (DCs) are critical initiators of innate immunity and orchestrate the later and more specific adaptive immunity. The generation of diversity in transcriptional programs is central for effective immune responses. Alternative splicing is widely considered a key generator of transcriptional and proteomic complexity, but its role has been rarely addressed systematically in immune cells. Here we used splicing-sensitive arrays to assess genome-wide gene- and exon-level expression profiles in human DCs in response to a bacterial challenge. We find widespread alternative splicing events and splicing factor transcriptional signatures induced by an E. coli challenge to human DCs. Alternative splicing acts in concert with transcriptional modulation, but these two mechanisms of gene regulation affect primarily distinct functional gene groups. Alternative splicing is likely to have an important role in DC immunobiology because it affects genes known to be involved in DC development, endocytosis, antigen presentation and cell cycle arrest.
Microbial genome sequencing using optical mapping and Illumina sequencing

USDA-ARS's Scientific Manuscript database

Introduction: Optical mapping is a technique in which strands of genomic DNA are digested with one or more restriction enzymes, and a physical map of the genome constructed from the resulting image. In outline, genomic DNA is extracted from a pure culture, linearly arrayed on a specialized glass sli...
Genetic studies of Prader-Willi patients provide evidence for conservation of genomic architecture in proximal chromosome 15q.

PubMed

Hou, Aihua; Lin, Shuan-Pei; Ho, Shi Yun; Chen, Chi-Fung Jennifer; Lin, Hsiang-Yu; Chen, Yen-Juin; Huang, Chi-Yu; Chiu, Huei-Ching; Chuang, Chih-Kuang; Chen, Ken-Shiung

2011-03-01

Prader-Willi syndrome (PWS) is a neurogenetic disorder associated with recurrent genomic recombination involving low copy repeats (LCRs) located in the human chromosome 15q11-q13. Previous studies of PWS patients from Asia suggested that there is a higher incidence of deletion and lower incidence of maternal uniparental disomy (mUPD) compared to that of Western populations. In this report, we present genetic etiology of 28 PWS patients from Taiwan. Consistent with the genetic etiology findings from Western populations, the type II deletion appears to be the most common deletion subtype. Furthermore, the ratio of the two most common deletion subtypes and the ratio of the maternal heterodisomy to isodisomy cases observed from this study are in agreement with previous findings from Western populations. In addition, we identified and further mapped the deletion breakpoints in two patients with atypical deletions using array CGH (comparative genomic hybridization). Despite the relatively small numbers of patients in each subgroup, our findings suggest that the genomic architecture responsible for the recurrent recombination in PWS is conserved in Taiwanese of the Han Chinese heritage and Western populations, thereby predisposing chromosome 15q11-q13 to a similar risk of rearrangements. © 2010 The Authors Annals of Human Genetics © 2010 Blackwell Publishing Ltd/University College London.
Methylation-Sensitive Amplification Length Polymorphism (MS-AFLP) Microarrays for Epigenetic Analysis of Human Genomes.

PubMed

Alonso, Sergio; Suzuki, Koichi; Yamamoto, Fumiichiro; Perucho, Manuel

2018-01-01

Somatic and, to a lesser extent, germ line epigenetic aberrations are fundamental to carcinogenesis, cancer progression, and tumor phenotype. DNA methylation is the most extensively studied and arguably the best understood epigenetic mechanism that becomes altered in cancer. Both somatic loss of methylation (hypomethylation) and gain of methylation (hypermethylation) are found in the genome of malignant cells. In general, the cancer cell epigenome is globally hypomethylated, while some regions (typically gene-associated CpG islands) become hypermethylated. Given the profound impact that DNA methylation exerts on the transcriptional profile and genomic stability of cancer cells, its characterization is essential to fully understand the complexity of cancer biology, improve tumor classification, and ultimately advance cancer patient management and treatment. A plethora of methods have been devised to analyze and quantify DNA methylation alterations. Several of the early-developed methods relied on the use of methylation-sensitive restriction enzymes, whose activity depends on the methylation status of their recognition sequences. Among these techniques, methylation-sensitive amplification length polymorphism (MS-AFLP) was developed in the early 2000s, and successfully adapted from its original gel electrophoresis fingerprinting format to a microarray format that notably increased its throughput and allowed the quantification of the methylation changes. This array-based platform interrogates over 9500 independent loci putatively amplified by the MS-AFLP technique, corresponding to the NotI sites mapped throughout the human genome.

Simultaneous isolation of high-quality DNA, RNA, miRNA and proteins from tissues for genomic applications

PubMed Central

Peña-Llopis, Samuel; Brugarolas, James

2014-01-01

Genomic technologies have revolutionized our understanding of complex Mendelian diseases and cancer. Solid tumors present several challenges for genomic analyses, such as tumor heterogeneity and tumor contamination with surrounding stroma and infiltrating lymphocytes. We developed a protocol to (i) select tissues of high cellular purity on the basis of histological analyses of immediately flanking sections and (ii) simultaneously extract genomic DNA (gDNA), messenger RNA (mRNA), noncoding RNA (ncRNA; enriched in microRNA (miRNA)) and protein from the same tissues. After tissue selection, about 12–16 extractions of DNA/RNA/protein can be obtained per day. Compared with other similar approaches, this fast and reliable methodology allowed us to identify mutations in tumors with remarkable sensitivity and to perform integrative analyses of whole-genome and exome data sets, DNA copy numbers (by single-nucleotide polymorphism (SNP) arrays), gene expression data (by transcriptome profiling and quantitative PCR (qPCR)) and protein levels (by western blotting and immunohistochemical analysis) from the same samples. Although we focused on renal cell carcinoma, this protocol may be adapted with minor changes to any human or animal tissue to obtain high-quality and high-yield nucleic acids and proteins. PMID:24136348
Gene chips and arrays revealed: a primer on their power and their uses.

PubMed

Watson, S J; Akil, H

1999-03-01

This article provides an overview and general explanation of the rapidly developing area of gene chips and expression array technology. These are methods targeted at allowing the simultaneous study of thousands of genes or messenger RNAs under various physiological and pathological states. Their technical basis grows from the Human Genome Project. Both methods place DNA strands on glass computer chips (or microscope slides). Expression arrays start with complementary DNA (cDNA) clones derived from the EST data base, whereas Gene Chips synthesize oligonucleotides directly on the chip itself. Both are analyzed using image analysis systems, are capable of reading values from two different individuals at any one site, and can yield quantitative data for thousands of genes or mRNAs per slide. These methods promise to revolutionize molecular biology, cell biology, neuroscience and psychiatry. It is likely that this technology will radically open up our ability to study the actions and structure of the multiple genes involved in the complex genetics of brain disorders.
Pervasive polymorphic imprinted methylation in the human placenta

PubMed Central

Hanna, Courtney W.; Peñaherrera, Maria S.; Saadeh, Heba; Andrews, Simon; McFadden, Deborah E.; Kelsey, Gavin; Robinson, Wendy P.

2016-01-01

The maternal and paternal copies of the genome are both required for mammalian development, and this is primarily due to imprinted genes, those that are monoallelically expressed based on parent-of-origin. Typically, this pattern of expression is regulated by differentially methylated regions (DMRs) that are established in the germline and maintained after fertilization. There are a large number of germline DMRs that have not yet been associated with imprinting, and their function in development is unknown. In this study, we developed a genome-wide approach to identify novel imprinted DMRs in the human placenta and investigated the dynamics of these imprinted DMRs during development in somatic and extraembryonic tissues. DNA methylation was evaluated using the Illumina HumanMethylation450 array in 134 human tissue samples, publicly available reduced representation bisulfite sequencing in the human embryo and germ cells, and targeted bisulfite sequencing in term placentas. Forty-three known and 101 novel imprinted DMRs were identified in the human placenta by comparing methylation between diandric and digynic triploid conceptions in addition to female and male gametes. Seventy-two novel DMRs showed a pattern consistent with placental-specific imprinting, and this monoallelic methylation was entirely maternal in origin. Strikingly, these DMRs exhibited polymorphic imprinted methylation between placental samples. These data suggest that imprinting in human development is far more extensive and dynamic than previously reported and that the placenta preferentially maintains maternal germline-derived DNA methylation. PMID:26769960
The human-induced pluripotent stem cell initiative—data resources for cellular genetics

PubMed Central

Streeter, Ian; Harrison, Peter W.; Faulconbridge, Adam; Flicek, Paul; Parkinson, Helen; Clarke, Laura

2017-01-01

The Human Induced Pluripotent Stem Cell Initiative (HipSci) is establishing a large catalogue of human iPSC lines, arguably the most well characterized collection to date. The HipSci portal enables researchers to choose the right cell line for their experiment, and makes HipSci's rich catalogue of assay data easy to discover and reuse. Each cell line has genomic, transcriptomic, proteomic and cellular phenotyping data. Data are deposited in the appropriate EMBL-EBI archives, including the European Nucleotide Archive (ENA), European Genome-phenome Archive (EGA), ArrayExpress and PRoteomics IDEntifications (PRIDE) databases. The project will make 500 cell lines from healthy individuals, and from 150 patients with rare genetic diseases; these will be available through the European Collection of Authenticated Cell Cultures (ECACC). As of August 2016, 238 cell lines are available for purchase. Project data is presented through the HipSci data portal (http://www.hipsci.org/lines) and is downloadable from the associated FTP site (ftp://ftp.hipsci.ebi.ac.uk/vol1/ftp). The data portal presents a summary matrix of the HipSci cell lines, showing available data types. Each line has its own page containing descriptive metadata, quality information, and links to archived assay data. Analysis results are also available in a Track Hub, allowing visualization in the context of public genomic annotations (http://www.hipsci.org/data/trackhubs). PMID:27733501
Differential nuclear scaffold/matrix attachment marks expressed genes.

PubMed

Linnemann, Amelia K; Platts, Adrian E; Krawetz, Stephen A

2009-02-15

It is well established that nuclear architecture plays a key role in poising regions of the genome for transcription. This may be achieved using scaffold/matrix attachment regions (S/MARs) that establish loop domains. However, the relationship between changes in the physical structure of the genome as mediated by attachment to the nuclear scaffold/matrix and gene expression is not clearly understood. To define the role of S/MARs in organizing our genome and to resolve the often contradictory loci-specific studies, we have surveyed the S/MARs in HeLa S3 cells on human chromosomes 14-18 by array comparative genomic hybridization. Comparison of LIS (lithium 3,5-diiodosalicylate) extraction to identify SARs and 2 M NaCl extraction to identify MARs revealed that approximately one-half of the sites were in common. The results presented in this study suggest that SARs 5' of a gene are associated with transcript presence whereas MARs contained within a gene are associated with silenced genes. The varied functions of the S/MARs as revealed by the different extraction methods highlight their unique functional contribution.
Differential nuclear scaffold/matrix attachment marks expressed genes†

PubMed Central

Linnemann, Amelia K.; Platts, Adrian E.; Krawetz, Stephen A.

2009-01-01

It is well established that nuclear architecture plays a key role in poising regions of the genome for transcription. This may be achieved using scaffold/matrix attachment regions (S/MARs) that establish loop domains. However, the relationship between changes in the physical structure of the genome as mediated by attachment to the nuclear scaffold/matrix and gene expression is not clearly understood. To define the role of S/MARs in organizing our genome and to resolve the often contradictory loci-specific studies, we have surveyed the S/MARs in HeLa S3 cells on human chromosomes 14–18 by array comparative genomic hybridization. Comparison of LIS (lithium 3,5-diiodosalicylate) extraction to identify SARs and 2 M NaCl extraction to identify MARs revealed that approximately one-half of the sites were in common. The results presented in this study suggest that SARs 5′ of a gene are associated with transcript presence whereas MARs contained within a gene are associated with silenced genes. The varied functions of the S/MARs as revealed by the different extraction methods highlight their unique functional contribution. PMID:19017725
Mitochondrial DNA repairs double-strand breaks in yeast chromosomes.

PubMed

Ricchetti, M; Fairhead, C; Dujon, B

1999-11-04

The endosymbiotic theory for the origin of eukaryotic cells proposes that genetic information can be transferred from mitochondria to the nucleus of a cell, and genes that are probably of mitochondrial origin have been found in nuclear chromosomes. Occasionally, short or rearranged sequences homologous to mitochondrial DNA are seen in the chromosomes of different organisms including yeast, plants and humans. Here we report a mechanism by which fragments of mitochondrial DNA, in single or tandem array, are transferred to yeast chromosomes under natural conditions during the repair of double-strand breaks in haploid mitotic cells. These repair insertions originate from noncontiguous regions of the mitochondrial genome. Our analysis of the Saccharomyces cerevisiae mitochondrial genome indicates that the yeast nuclear genome does indeed contain several short sequences of mitochondrial origin which are similar in size and composition to those that repair double-strand breaks. These sequences are located predominantly in non-coding regions of the chromosomes, frequently in the vicinity of retrotransposon long terminal repeats, and appear as recent integration events. Thus, colonization of the yeast genome by mitochondrial DNA is an ongoing process.
An integrated semiconductor device enabling non-optical genome sequencing.

PubMed

Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

2011-07-20

The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.
Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

PubMed

Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

2013-01-01

Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.
A comparative genomic hybridization approach to study gene copy number variations among Chinese hamster cell lines.

PubMed

Vishwanathan, Nandita; Bandyopadhyay, Arpan; Fu, Hsu-Yuan; Johnson, Kathryn C; Springer, Nathan M; Hu, Wei-Shou

2017-08-01

Chinese Hamster Ovary (CHO) cells are aneuploid in nature. The genome of recombinant protein producing CHO cell lines continuously undergoes changes in its structure and organization. We analyzed nine cell lines, including parental cell lines, using a comparative genomic hybridization (CGH) array focused on gene-containing regions. The comparison of CGH with copy-number estimates from sequencing data showed good correlation. Hierarchical clustering of the gene copy number variation data from CGH data revealed the lineage relationships between the cell lines. On analyzing the clones of a clonal population, some regions with altered genomic copy number status were identified indicating genomic changes during passaging. A CGH array is thus an effective tool in quantifying genomic alterations in industrial cell lines and can provide insights into the changes in the genomic structure during cell line derivation and long term culture. Biotechnol. Bioeng. 2017;114: 1903-1908. © 2017 Wiley Periodicals, Inc.
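Illustrative note: array CGH data are commonly summarized as the log2 ratio of test signal to reference signal for each probe. The Python sketch below shows that calculation with a simple threshold call. It is a minimal sketch only: the function name, the cut-off values (0.3 and -0.3) and the probe intensities are hypothetical and are not taken from the study above, whose actual analysis pipeline is not described in the abstract.

```python
import math

def log2_ratio_call(test_signal, ref_signal, gain_cutoff=0.3, loss_cutoff=-0.3):
    """Return (log2 ratio, call) for one probe.

    The cut-offs are hypothetical defaults chosen for illustration;
    real analyses tune them to the platform and its noise level.
    """
    if test_signal <= 0 or ref_signal <= 0:
        raise ValueError("signals must be positive")
    ratio = math.log2(test_signal / ref_signal)
    if ratio >= gain_cutoff:
        status = "gain"
    elif ratio <= loss_cutoff:
        status = "loss"
    else:
        status = "neutral"
    return ratio, status

# Hypothetical probe intensities (test, reference), for illustration only.
probes = {
    "probe_A": (200.0, 100.0),
    "probe_B": (50.0, 100.0),
    "probe_C": (105.0, 100.0),
}
calls = {name: log2_ratio_call(t, r) for name, (t, r) in probes.items()}
for name, (ratio, status) in calls.items():
    print(f"{name}: log2 ratio = {ratio:.2f}, call = {status}")
```

A doubling of signal gives a log2 ratio of 1 and a halving gives -1, which is why symmetric cut-offs around zero are a natural starting point.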
Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

PubMed Central

2011-01-01

Background: Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results: We report the integration of the whole genome sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were identified to be either in identity by descent (IBD) or in copy number variation (CNV) with results from SNP array genotyping. Coding insertions and deletions (indels) were found to be enriched for size in multiples of 3 and were located near the N- and C-termini of proteins. For larger indels, a combination of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions: Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found to be instrumental for calculating resequencing SNP accuracy, while SNP detection within CNVs tended to be less reliable. CNV discovery was affected dramatically by platform resolution and coverage biases. The combined data for this study showed that at a moderate level of sequencing coverage, an ensemble of platforms and tools can be applied together to maximize the accurate detection of sequence and structural variants. PMID:22082336
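Illustrative note: the abstract above reports that coding indels were enriched for lengths in multiples of 3, that is, lengths that preserve the reading frame. The Python sketch below shows how such a tally could be made. The function name and the example lengths are hypothetical and are not data from the study.

```python
from collections import Counter

def classify_indels(lengths):
    """Count indels whose length (in bp) is a multiple of 3
    (frame-preserving) versus all others (frameshifting)."""
    counts = Counter()
    for n in lengths:
        if n <= 0:
            raise ValueError("indel length must be a positive integer")
        counts["in_frame" if n % 3 == 0 else "frameshift"] += 1
    return counts

# Hypothetical indel lengths in base pairs, for illustration only.
example_lengths = [1, 3, 3, 6, 2, 9, 4, 12]
result = classify_indels(example_lengths)
print("in-frame:", result["in_frame"], "frameshift:", result["frameshift"])
```

An indel of 3, 6 or 9 bp adds or removes whole codons, whereas any other length shifts the reading frame downstream, which is the usual explanation for why in-frame lengths are tolerated more often in coding sequence.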
Organization of 5S rDNA in species of the fish Leporinus: two different genomic locations are characterized by distinct nontranscribed spacers.

PubMed

Martins, C; Galetti, P M

2001-10-01

To better understand the organization of the 5S rRNA multigene family in the fish genome, the nucleotide sequence and array organization of 5S rDNA were investigated in the genus Leporinus, a representative freshwater fish group of South American fauna. PCR, subgenomic library screening, genomic blotting, fluorescence in situ hybridization, and DNA sequencing were employed in this study. Two arrays of 5S rDNA were identified for all species investigated, one consisting of monomeric repeat units of around 200 bp and another one with monomers of 900 bp. These 5S rDNA arrays were characterized by distinct NTS sequences (designated NTS-I and NTS-II for the 200- and 900-bp monomers, respectively); however, their coding sequences were nearly identical. The 5S rRNA genes were clustered in two chromosome loci, a major one corresponding to the NTS-I sites and a minor one corresponding to the NTS-II sites. The NTS-I sequence was variable among Leporinus spp., whereas the NTS-II was conserved among them and even in the related genus Schizodon. The distinct 5S rDNA arrays might characterize two 5S rRNA gene subfamilies that have been evolving independently in the genome.
Detection and validation of single feature polymorphisms in cowpea (Vigna unguiculata L. Walp) using a soybean genome array.

PubMed

Das, Sayan; Bhat, Prasanna R; Sudhakar, Chinta; Ehlers, Jeffrey D; Wanamaker, Steve; Roberts, Philip A; Cui, Xinping; Close, Timothy J

2008-02-28

Cowpea (Vigna unguiculata L. Walp) is an important food and fodder legume of the semiarid tropics and subtropics worldwide, especially in sub-Saharan Africa. High density genetic linkage maps are needed for marker assisted breeding but are not available for cowpea. A single feature polymorphism (SFP) is a microarray-based marker which can be used for high throughput genotyping and high density mapping. Here we report detection and validation of SFPs in cowpea using a readily available soybean (Glycine max) genome array. Robustified projection pursuit (RPP) was used for statistical analysis using RNA as a surrogate for DNA. Using a 15% outlying score cut-off, 1058 potential SFPs were enumerated between two parents of a recombinant inbred line (RIL) population segregating for several important traits including drought tolerance, Fusarium and brown blotch resistance, grain size and photoperiod sensitivity. Sequencing of 25 putative polymorphism-containing amplicons yielded a SFP probe set validation rate of 68%. We conclude that the Affymetrix soybean genome array is a satisfactory platform for identification of some 1000's of SFPs for cowpea. This study provides an example of extension of genomic resources from a well supported species to an orphan crop. Presumably, other legume systems are similarly tractable to SFP marker development using existing legume array resources.
Time for Genome Editing: Next-Generation Attenuated Malaria Parasites.

PubMed

Singer, Mirko; Frischknecht, Friedrich

2017-03-01

Immunization with malaria parasites that developmentally arrest in or immediately after the liver stage is the only way currently known to confer sterilizing immunity in both humans and rodent models. There are various ways to attenuate parasite development resulting in different timings of arrest, which has a significant impact on vaccination efficiency. To understand what most impacts vaccination efficiency, newly developed gain-of-function methods can now be used to generate a wide array of differently attenuated parasites. The combination of multiple attenuation approaches offers the potential to engineer efficiently attenuated Plasmodium parasites and learn about their fascinating biology at the same time. Here we discuss recent studies and the potential of targeted parasite manipulation using genome editing to develop live attenuated malaria vaccines. Copyright © 2016 Elsevier Ltd. All rights reserved.
CLDN1 expression in cervical cancer cells is related to tumor invasion and metastasis.

PubMed

Zhang, Wei-Na; Li, Wei; Wang, Xiao-Li; Hu, Zheng; Zhu, Da; Ding, Wen-Cheng; Liu, Dan; Li, Ke-Zhen; Ma, Ding; Wang, Hui

2016-12-27

Even though infection with human papillomaviruses (HPV) is very important, it is not the sole cause of cervical cancer. Because it is known that genetic variations that result from HPV infection are probably the most important causes of cervical cancer, we used human whole genome array comparative genomic hybridization to detect the copy number variations of genes in cervical squamous cell carcinoma. The results of the array were validated by PCR, FISH and immunohistochemistry. We find that the copy number and protein expression of claudin-1 (CLDN1) increase with the progression of cervical cancer. The strong positive staining of CLDN1 in the cervical lymph node metastasis group received a significantly higher score than the staining in the group with no lymph node metastasis of cervical cancer tissues. The overexpression of CLDN1 in SiHa cells can increase anti-apoptosis ability and promote invasive ability of these cells accompanied by a decrease in expression of the epithelial marker E-cadherin as well as an increase in the expression of the mesenchymal marker vimentin. CLDN1 induces the epithelial-mesenchymal transition (EMT) through its interaction with SNAI1. Furthermore, we demonstrate that CLDN1 overexpression has significant effects on the growth and metastasis of xenografted tumors in athymic mice. These data suggest that CLDN1 promotes invasion and metastasis in cervical cancer cells via the expression of EMT/invasion-related genes. Therefore, CLDN1 could be a potential therapeutic target for the treatment of cervical cancer.
DNA methylome profiling of maternal peripheral blood and placentas reveal potential fetal DNA markers for non-invasive prenatal testing.

PubMed

Xiang, Yuqian; Zhang, Junyu; Li, Qiaoli; Zhou, Xinyao; Wang, Teng; Xu, Mingqing; Xia, Shihui; Xing, Qinghe; Wang, Lei; He, Lin; Zhao, Xinzhi

2014-09-01

Utilizing epigenetic (DNA methylation) differences to differentiate between maternal peripheral blood (PBL) and fetal (placental) DNA has been a promising strategy for non-invasive prenatal testing (NIPT). However, the differentially methylated regions (DMRs) have yet to be fully ascertained. In the present study, we performed genome-wide comparative methylome analysis between maternal PBL and placental DNA from pregnancies of first trimester by methylated DNA immunoprecipitation-sequencing (MeDIP-Seq) and Infinium HumanMethylation450 BeadChip assays. A total of 36 931 DMRs and 45 804 differentially methylated sites (DMSs) covering the whole genome, exclusive of the Y chromosome, were identified via MeDIP-Seq and Infinium 450k array, respectively, of which 3759 sites in 2188 regions were confirmed by both methods. Not only did we find the previously reported potential fetal DNA markers in our identified DMRs/DMSs but also we verified fully the identified DMRs/DMSs in the validation round by MassARRAY EpiTYPER. The screened potential fetal DNA markers may be used for NIPT on aneuploidies and other chromosomal diseases, such as cri du chat syndrome and velo-cardio-facial syndrome. In addition, these potential markers may have application in the early diagnosis of placental dysfunction, such as pre-eclampsia. © The Author 2014. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
The iSelect 9 K SNP analysis revealed polyploidization induced revolutionary changes and intense human selection causing strong haplotype blocks in wheat.

PubMed

Hao, Chenyang; Wang, Yuquan; Chao, Shiaoman; Li, Tian; Liu, Hongxia; Wang, Lanfen; Zhang, Xueyong

2017-01-30

A Chinese wheat mini core collection was genotyped using the wheat 9 K iSelect SNP array. A total of 2420 and 2396 polymorphic SNPs were detected on the A and the B genome chromosomes, which formed 878 haplotype blocks. There were more blocks in the B genome, but the average block size was significantly (P < 0.05) smaller than that in the A genome. Intense selection (domestication and breeding) had a stronger effect on the A than on the B genome chromosomes. Based on the genetic pedigrees, many blocks can be traced back to a well-known Strampelli cross, which was made one century ago. Furthermore, polyploidization of wheat (both tetraploidization and hexaploidization) induced revolutionary changes in both the A and the B genomes, with a greater increase of gene diversity compared to their diploid ancestors. Modern breeding has dramatically increased diversity in the gene coding regions, though obvious blocks were formed on most of the chromosomes in both tetraploid and hexaploid wheats. Tag-SNP markers identified in this study can be used for marker assisted selection using haplotype blocks as a wheat breeding strategy. This strategy can also be employed to facilitate genome selection in other self-pollinating crop species.
Development and Validation of a High-Density SNP Genotyping Array for African Oil Palm.

PubMed

Kwong, Qi Bin; Teh, Chee Keng; Ong, Ai Ling; Heng, Huey Ying; Lee, Heng Leng; Mohamed, Mohaimi; Low, Joel Zi-Bin; Apparow, Sukganah; Chew, Fook Tim; Mayes, Sean; Kulaveerasingam, Harikrishna; Tammi, Martti; Appleton, David Ross

2016-08-01

High-density single nucleotide polymorphism (SNP) genotyping arrays are powerful tools that can measure the level of genetic polymorphism within a population. To develop a whole-genome SNP array for oil palms, SNP discovery was performed using deep resequencing of eight libraries derived from 132 Elaeis guineensis and Elaeis oleifera palms belonging to 59 origins, resulting in the discovery of >3 million putative SNPs. After SNP filtering, the Illumina OP200K custom array was built with 170 860 successful probes. Phenetic clustering analysis revealed that the array could distinguish between palms of different origins in a way consistent with pedigree records. Genome-wide linkage disequilibrium declined more slowly for the commercial populations (ranging from 120 kb at r² = 0.43 to 146 kb at r² = 0.50) when compared with the semi-wild populations (19.5 kb at r² = 0.22). Genetic fixation mapping comparing the semi-wild and commercial population identified 321 selective sweeps. A genome-wide association study (GWAS) detected a significant peak on chromosome 2 associated with the polygenic component of the shell thickness trait (based on the trait shell-to-fruit; S/F %) in tenera palms. Testing of a genomic selection model on the same trait resulted in good prediction accuracy (r = 0.65) with 42% of the S/F % variation explained. The first high-density SNP genotyping array for oil palm has been developed and shown to be robust for use in genetic studies and with potential for developing early trait prediction to shorten the oil palm breeding cycle. Copyright © 2016 The Author. Published by Elsevier Inc. All rights reserved.
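The linkage disequilibrium values quoted in the abstract above are squared correlations between pairs of loci. As a minimal sketch of how such a value can be computed, the snippet below uses the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)), with D = pAB − pA·pB. The function name `ld_r2` and the phased 0/1 haplotype input are assumptions made for illustration; the abstract does not describe the software the authors used.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic loci.

    `haplotypes` is a list of (a, b) pairs with alleles coded 0 or 1.
    This phased input format is a hypothetical convenience; array
    genotypes would normally need phasing before this step.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("a locus is monomorphic; r^2 is undefined")
    return d * d / denom
```

Averaging such pairwise values within distance bins gives the kind of decay curve from which a "distance at a given r²" figure can be read.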
Development of a 63K SNP Array for Cotton and High-Density Mapping of Intraspecific and Interspecific Populations of Gossypium spp.

PubMed Central

Hulse-Kemp, Amanda M.; Lemm, Jana; Plieske, Joerg; Ashrafi, Hamid; Buyyarapu, Ramesh; Fang, David D.; Frelichowski, James; Giband, Marc; Hague, Steve; Hinze, Lori L.; Kochan, Kelli J.; Riggs, Penny K.; Scheffler, Jodi A.; Udall, Joshua A.; Ulloa, Mauricio; Wang, Shirley S.; Zhu, Qian-Hao; Bag, Sumit K.; Bhardwaj, Archana; Burke, John J.; Byers, Robert L.; Claverie, Michel; Gore, Michael A.; Harker, David B.; Islam, Md S.; Jenkins, Johnie N.; Jones, Don C.; Lacape, Jean-Marc; Llewellyn, Danny J.; Percy, Richard G.; Pepper, Alan E.; Poland, Jesse A.; Mohan Rai, Krishan; Sawant, Samir V.; Singh, Sunil Kumar; Spriggs, Andrew; Taylor, Jen M.; Wang, Fei; Yourstone, Scott M.; Zheng, Xiuting; Lawley, Cindy T.; Ganal, Martin W.; Van Deynze, Allen; Wilson, Iain W.; Stelly, David M.

2015-01-01

High-throughput genotyping arrays provide a standardized resource for plant breeding communities that are useful for a breadth of applications including high-density genetic mapping, genome-wide association studies (GWAS), genomic selection (GS), complex trait dissection, and studying patterns of genomic diversity among cultivars and wild accessions. We have developed the CottonSNP63K, an Illumina Infinium array containing assays for 45,104 putative intraspecific single nucleotide polymorphism (SNP) markers for use within the cultivated cotton species Gossypium hirsutum L. and 17,954 putative interspecific SNP markers for use with crosses of other cotton species with G. hirsutum. The SNPs on the array were developed from 13 different discovery sets that represent a diverse range of G. hirsutum germplasm and five other species: G. barbadense L., G. tomentosum Nuttal × Seemann, G. mustelinum Miers × Watt, G. armourianum Kearny, and G. longicalyx J.B. Hutchinson and Lee. The array was validated with 1,156 samples to generate cluster positions to facilitate automated analysis of 38,822 polymorphic markers. Two high-density genetic maps containing a total of 22,829 SNPs were generated for two F2 mapping populations, one intraspecific and one interspecific, and 3,533 SNP markers were co-occurring in both maps. The produced intraspecific genetic map is the first saturated map that associates into 26 linkage groups corresponding to the number of cotton chromosomes for a cross between two G. hirsutum lines. The linkage maps were shown to have high levels of collinearity to the JGI G. raimondii Ulbrich reference genome sequence. The CottonSNP63K array, cluster file and associated marker sequences constitute a major new resource for the global cotton research community. PMID:25908569
Preimplantation genetic screening for all 24 chromosomes by microarray comparative genomic hybridization significantly increases implantation rates and clinical pregnancy rates in patients undergoing in vitro fertilization with poor prognosis

PubMed Central

Majumdar, Gaurav; Majumdar, Abha; Lall, Meena; Verma, Ishwar C.; Upadhyaya, Kailash C.

2016-01-01

CONTEXT: A majority of human embryos produced in vitro are aneuploid, especially in couples undergoing in vitro fertilization (IVF) with poor prognosis. Preimplantation genetic screening (PGS) for all 24 chromosomes has the potential to select the most euploid embryos for transfer in such cases. AIM: To study the efficacy of PGS for all 24 chromosomes by microarray comparative genomic hybridization (array CGH) in Indian couples undergoing IVF cycles with poor prognosis. SETTINGS AND DESIGN: A retrospective, case–control study was undertaken in an institution-based tertiary care IVF center to compare the clinical outcomes of twenty patients, who underwent 21 PGS cycles with poor prognosis, with 128 non-PGS patients in the control group, with the same inclusion criterion as for the PGS group. MATERIALS AND METHODS: Single cells were obtained by laser-assisted embryo biopsy from day 3 embryos and subsequently analyzed by array CGH for all 24 chromosomes. Once the array CGH results were available on the morning of day 5, only chromosomally normal embryos that had progressed to blastocyst stage were transferred. RESULTS: The implantation rate and clinical pregnancy rate (PR) per transfer were found to be significantly higher in the PGS group than in the control group (63.2% vs. 26.2%, P = 0.001 and 73.3% vs. 36.7%, P = 0.006, respectively), while the multiple PRs sharply declined from 31.9% to 9.1% in the PGS group. CONCLUSIONS: In this pilot study, we have shown that PGS by array CGH can improve the clinical outcome in patients undergoing IVF with poor prognosis. PMID:27382234

Comparison of array comparative genomic hybridization and quantitative real-time PCR-based aneuploidy screening of blastocyst biopsies.

PubMed

Capalbo, Antonio; Treff, Nathan R; Cimadomo, Danilo; Tao, Xin; Upham, Kathleen; Ubaldi, Filippo Maria; Rienzi, Laura; Scott, Richard T

2015-07-01

Comprehensive chromosome screening (CCS) methods are being extensively used to select chromosomally normal embryos in human assisted reproduction. Some concerns related to the stage of analysis and which aneuploidy screening method to use still remain. In this study, the reliability of blastocyst-stage aneuploidy screening and the diagnostic performance of the two most widely used CCS methods (quantitative real-time PCR (qPCR) and array comparative genome hybridization (aCGH)) have been assessed. aCGH aneuploid blastocysts were rebiopsied, blinded, and evaluated by qPCR. Discordant cases were subsequently rebiopsied, blinded, and evaluated by single-nucleotide polymorphism (SNP) array-based CCS. Although 81.7% of embryos showed the same diagnosis when comparing aCGH and qPCR-based CCS, 18.3% (22/120) of embryos gave a discordant result for at least one chromosome. SNP array reanalysis showed that a discordance was reported in ten blastocysts for aCGH, mostly due to false positives, and in four cases for qPCR. The discordant aneuploidy call rate per chromosome was significantly higher for aCGH (5.7%) compared with qPCR (0.6%; P<0.01). To corroborate these findings, 39 embryos were simultaneously biopsied for aCGH and qPCR during blastocyst-stage aneuploidy screening cycles. Thirty-five matched, including all 21 euploid embryos. Blinded SNP analysis on rebiopsies of the four embryos matched qPCR. These findings demonstrate the high reliability of diagnosis performed at the blastocyst stage with the use of different CCS methods. However, the application of aCGH can be expected to result in a higher aneuploidy rate than other contemporary methods of CCS.
Characterization of polyploid wheat genomic diversity using a high-density 90 000 single nucleotide polymorphism array

PubMed Central

Wang, Shichen; Wong, Debbie; Forrest, Kerrie; Allen, Alexandra; Chao, Shiaoman; Huang, Bevan E; Maccaferri, Marco; Salvi, Silvio; Milner, Sara G; Cattivelli, Luigi; Mastrangelo, Anna M; Whan, Alex; Stephen, Stuart; Barker, Gary; Wieseke, Ralf; Plieske, Joerg; International Wheat Genome Sequencing Consortium; Lillemo, Morten; Mather, Diane; Appels, Rudi; Dolferus, Rudy; Brown-Guedira, Gina; Korol, Abraham; Akhunova, Alina R; Feuillet, Catherine; Salse, Jerome; Morgante, Michele; Pozniak, Curtis; Luo, Ming-Cheng; Dvorak, Jan; Morell, Matthew; Dubcovsky, Jorge; Ganal, Martin; Tuberosa, Roberto; Lawley, Cindy; Mikoulitch, Ivan; Cavanagh, Colin; Edwards, Keith J; Hayden, Matthew; Akhunov, Eduard

2014-01-01

High-density single nucleotide polymorphism (SNP) genotyping arrays are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships between individuals in populations and studying marker–trait associations in mapping experiments. We developed a genotyping array including about 90 000 gene-associated SNPs and used it to characterize genetic variation in allohexaploid and allotetraploid wheat populations. The array includes a significant fraction of common genome-wide distributed SNPs that are represented in populations of diverse geographical origin. We used density-based spatial clustering algorithms to enable high-throughput genotype calling in complex data sets obtained for polyploid wheat. We show that these model-free clustering algorithms provide accurate genotype calling in the presence of multiple clusters including clusters with low signal intensity resulting from significant sequence divergence at the target SNP site or gene deletions. Assays that detect low-intensity clusters can provide insight into the distribution of presence–absence variation (PAV) in wheat populations. A total of 46 977 SNPs from the wheat 90K array were genetically mapped using a combination of eight mapping populations. The developed array and cluster identification algorithms provide an opportunity to infer detailed haplotype structure in polyploid wheat and will serve as an invaluable resource for diversity studies and investigating the genetic basis of trait variation in wheat. PMID:24646323
Detection and quantitation of chromosomal mosaicism in human blastocysts using copy number variation sequencing.

PubMed

Ruttanajit, Tida; Chanchamroen, Sujin; Cram, David S; Sawakwongpra, Kritchakorn; Suksalak, Wanwisa; Leng, Xue; Fan, Junmei; Wang, Li; Yao, Yuanqing; Quangkananurug, Wiwat

2016-02-01

Currently, our understanding of the nature and reproductive potential of blastocysts associated with trophectoderm (TE) lineage chromosomal mosaicism is limited. The objective of this study was to first validate copy number variation sequencing (CNV-Seq) for measuring the level of mosaicism and second, examine the nature and level of mosaicism in TE biopsies of patients' blastocysts. TE biopsy samples were analysed by array comparative genomic hybridization (CGH) and CNV-Seq to discriminate between euploid, aneuploid and mosaic blastocysts. Using artificial models of TE mosaicism for five different chromosomes, CNV-Seq accurately and reproducibly quantitated mosaicism at levels of 50% and 20%. In a comparative 24-chromosome study of 49 blastocysts by array CGH and CNV-Seq, 43 blastocysts (87.8%) had a concordant diagnosis and 6 blastocysts (12.2%) were discordant. The discordance was attributed to low to medium levels of chromosomal mosaicism (30-70%) not detected by array CGH. In an expanded study of 399 blastocysts using CNV-Seq as the sole diagnostic method, the proportion of diploid-aneuploid mosaics (34, 8.5%) was significantly higher than aneuploid mosaics (18, 4.5%) (p < 0.02). Mosaicism is a significant chromosomal abnormality associated with the TE lineage of human blastocysts that can be reliably and accurately detected by CNV-Seq. © 2015 John Wiley & Sons, Ltd.
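The percentages in the abstract above follow directly from the reported counts. The short check below recomputes them; the helper name `percent` is an assumption introduced only for this illustration, and the counts are those stated in the abstract.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Comparative study of 49 blastocysts (array CGH vs. CNV-Seq)
concordant = percent(43, 49)
discordant = percent(6, 49)

# Expanded study of 399 blastocysts (CNV-Seq only)
diploid_aneuploid_mosaic = percent(34, 399)
aneuploid_mosaic = percent(18, 399)

print(concordant, discordant, diploid_aneuploid_mosaic, aneuploid_mosaic)
```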
Identifying allelic loss and homozygous deletions in pancreatic cancer without matched normals using high-density single-nucleotide polymorphism arrays.

PubMed

Calhoun, Eric S; Hucl, Tomas; Gallmeier, Eike; West, Kristen M; Arking, Dan E; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Chakravarti, Aravinda; Hruban, Ralph H; Kern, Scott E

2006-08-15

Recent advances in oligonucleotide arrays and whole-genome complexity reduction data analysis now permit the evaluation of tens of thousands of single-nucleotide polymorphisms simultaneously for a genome-wide analysis of allelic status. Using these arrays, we created high-resolution allelotype maps of 26 pancreatic cancer cell lines. The areas of heterozygosity implicitly served to reveal regions of allelic loss. The array-derived maps were verified by a panel of 317 microsatellite markers used in a subset of seven samples, showing a 97.1% concordance between heterozygous calls. Three matched tumor/normal pairs were used to estimate the false-negative and potential false-positive rates for identifying loss of heterozygosity: 3.6 regions (average minimal region of loss, 720,228 bp) and 2.3 regions (average heterozygous gap distance, 4,434,994 bp) per genome, respectively. Genomic fractional allelic loss calculations showed that cumulative levels of allelic loss ranged widely from 17.1% to 79.9% of the haploid genome length. Regional increases in "NoCall" frequencies combined with copy number loss estimates were used to identify 41 homozygous deletions (19 first reports), implicating an additional 13 regions disrupted in pancreatic cancer. Unexpectedly, 23 of these occurred in just two lines (BxPc3 and MiaPaCa2), suggesting the existence of at least two subclasses of chromosomal instability (CIN) patterns, distinguished here by allelic loss and copy number changes (original CIN) and those also highly enriched in the genomic "holes" of homozygous deletions (holey CIN). This study provides previously unavailable high-resolution allelotype and deletion breakpoint maps in widely shared pancreatic cancer cell lines and effectively eliminates the need for matched normal tissue to define informative loci.
Genomic Analysis of Childhood Brain Tumors: Methods for Genome-Wide Discovery and Precision Medicine Become Mainstream.

PubMed

Mack, Stephen C; Northcott, Paul A

2017-07-20

Recent breakthroughs in next-generation sequencing technology and complementary genomic platforms have transformed our capacity to interrogate the molecular landscapes of human cancers, including childhood brain tumors. Numerous high-throughput genomic studies have been reported for the major histologic brain tumor entities diagnosed in children, including interrogations at the level of the genome, epigenome, and transcriptome, many of which have yielded essential new insights into disease biology. The nature of these discoveries has been largely platform dependent, exemplifying the usefulness of applying different genomic and computational strategies, or integrative approaches, to address specific biologic and/or clinical questions. The goal of this article is to summarize the spectrum of molecular profiling methods available for investigating genomic aspects of childhood brain tumors in both the research and the clinical setting. We provide an overview of the main next-generation sequencing and array-based technologies currently being applied in this field and draw from key examples in the recent neuro-oncology literature to illustrate how these genomic approaches have profoundly advanced our understanding of individual tumor entities. Moreover, we discuss the current status of genomic profiling in the clinic and how different platforms are being used to improve patient diagnosis and stratification, as well as to identify actionable targets for informing molecularly guided therapies, especially for patients for whom conventional standard-of-care treatments have failed. Both the demand for genomic testing and the main challenges associated with incorporating genomics into the clinical management of pediatric patients with brain tumors are discussed, as are recommendations for incorporating these assays into future clinical trials.
Design and coverage of high throughput genotyping arrays optimized for individuals of East Asian, African American, and Latino race/ethnicity using imputation and a novel hybrid SNP selection algorithm.

PubMed

Hoffmann, Thomas J; Zhan, Yiping; Kvale, Mark N; Hesselson, Stephanie E; Gollub, Jeremy; Iribarren, Carlos; Lu, Yontao; Mei, Gangwu; Purdy, Matthew M; Quesenberry, Charles; Rowell, Sarah; Shapero, Michael H; Smethurst, David; Somkin, Carol P; Van den Eeden, Stephen K; Walter, Larry; Webster, Teresa; Whitmer, Rachel A; Finn, Andrea; Schaefer, Catherine; Kwok, Pui-Yan; Risch, Neil

2011-12-01

Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies. Copyright © 2011 Elsevier Inc. All rights reserved.
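The selection strategy described in the abstract above can be read as a set-cover style loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: `tags` maps each candidate SNP to the set of target SNPs it tags pairwise (for example above some r² threshold, which is an assumption here), and `imputed_by` is a hypothetical callback returning the targets judged to be covered by imputation given the SNPs chosen so far.

```python
def greedy_pairwise_select(tags, targets, max_picks):
    """Greedy pairwise tagging: repeatedly take the candidate SNP that
    tags the most still-uncovered targets, up to max_picks SNPs."""
    uncovered = set(targets)
    chosen = []
    while uncovered and len(chosen) < max_picks:
        best = max(
            (s for s in sorted(tags) if s not in chosen),
            key=lambda s: len(tags[s] & uncovered),
            default=None,
        )
        if best is None or not tags[best] & uncovered:
            break  # no candidate covers anything new
        chosen.append(best)
        uncovered -= tags[best]
    return chosen, uncovered


def hybrid_select(tags, targets, imputed_by, round_size):
    """Alternate greedy rounds with removal of targets covered by imputation.

    imputed_by(chosen) is a hypothetical stand-in for an imputation
    coverage evaluation; it must return a set of target SNPs.
    """
    remaining = set(targets)
    chosen = []
    while remaining:
        candidates = {s: t for s, t in tags.items() if s not in chosen}
        picked, _ = greedy_pairwise_select(candidates, remaining, round_size)
        if not picked:
            break
        chosen.extend(picked)
        for s in picked:
            remaining -= tags[s]          # covered by pairwise tagging
        remaining -= imputed_by(chosen)   # covered by imputation
    return chosen, remaining
```

In this toy form, the hybrid loop stops spending array capacity on targets as soon as imputation is judged to cover them, which is the efficiency argument the abstract makes.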
Clinical utility of an array comparative genomic hybridization analysis for Williams syndrome.

PubMed

Yagihashi, Tatsuhiko; Torii, Chiharu; Takahashi, Reiko; Omori, Mikimasa; Kosaki, Rika; Yoshihashi, Hiroshi; Ihara, Masahiro; Minagawa-Kawai, Yasuyo; Yamamoto, Junichi; Takahashi, Takao; Kosaki, Kenjiro

2014-11-01

To reveal the relation between intellectual disability and the deleted intervals in Williams syndrome, we performed an array comparative genomic hybridization analysis and standardized developmental testing for 11 patients diagnosed as having Williams syndrome based on fluorescent in situ hybridization testing. One patient had a large 4.2-Mb deletion spanning distally beyond the common 1.5-Mb intervals observed in 10/11 patients. We formulated a linear equation describing the developmental age of the 10 patients with the common deletion; the developmental age of the patient with the 4.2-Mb deletion was significantly below the expectation (developmental age = 0.51 × chronological age). The large deletion may account for the severe intellectual disability; therefore, the use of array comparative genomic hybridization may provide practical information regarding individuals with Williams syndrome. © 2014 Japanese Teratology Society.
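The linear relation reported in the abstract above (developmental age = 0.51 × chronological age) can be applied as follows. This is a minimal sketch: the function names are assumptions, the abstract does not state the age units, and no patient data are reproduced here.

```python
SLOPE = 0.51  # coefficient reported in the abstract for the common-deletion group


def expected_developmental_age(chronological_age):
    """Developmental age predicted by the reported linear relation."""
    return SLOPE * chronological_age


def shortfall(chronological_age, observed_developmental_age):
    """How far an observed developmental age falls below the prediction.

    A positive value means the observation is below expectation.
    """
    return expected_developmental_age(chronological_age) - observed_developmental_age
```

A call such as `shortfall(10, 3.0)` uses purely hypothetical inputs and only illustrates how a patient could be compared against the fitted line.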
Molecular inversion probe assay for allelic quantitation

PubMed Central

Ji, Hanlee; Welch, Katrina

2010-01-01

Molecular inversion probe (MIP) technology has been demonstrated to be a robust platform for large-scale dual genotyping and copy number analysis. Applications in human genomic and genetic studies include the possibility of running dual germline genotyping and combined copy number variation ascertainment. MIPs analyze large numbers of specific genetic target sequences in parallel, relying on interrogation of a barcode tag, rather than direct hybridization of genomic DNA to an array. The MIP approach does not replace, but is complementary to many of the copy number technologies being performed today. Some specific advantages of MIP technology include: Less DNA required (37 ng vs. 250 ng), DNA quality less important, more dynamic range (amplifications detected up to copy number 60), allele specific information “cleaner” (less SNP crosstalk/contamination), and quality of markers better (fewer individual MIPs versus SNPs needed to identify copy number changes). MIPs can be considered a candidate gene (targeted whole genome) approach and can find specific areas of interest that otherwise may be missed with other methods. PMID:19488872
Identification of methylation haplotype blocks aids in deconvolution of heterogeneous tissue samples and tumor tissue-of-origin mapping from plasma DNA.

PubMed

Guo, Shicheng; Diep, Dinh; Plongthongkum, Nongluk; Fung, Ho-Lim; Zhang, Kang; Zhang, Kun

2017-04-01

Adjacent CpG sites in mammalian genomes can be co-methylated owing to the processivity of methyltransferases or demethylases, yet discordant methylation patterns have also been observed, which are related to stochastic or uncoordinated molecular processes. We focused on a systematic search and investigation of regions in the full human genome that show highly coordinated methylation. We defined 147,888 blocks of tightly coupled CpG sites, called methylation haplotype blocks, after analysis of 61 whole-genome bisulfite sequencing data sets and validation with 101 reduced-representation bisulfite sequencing data sets and 637 methylation array data sets. Using a metric called methylation haplotype load, we performed tissue-specific methylation analysis at the block level. Subsets of informative blocks were further identified for deconvolution of heterogeneous samples. Finally, using methylation haplotypes we demonstrated quantitative estimation of tumor load and tissue-of-origin mapping in the circulating cell-free DNA of 59 patients with lung or colorectal cancer.
Effects of soluble and particulate Cr(VI) on genome-wide DNA methylation in human B lymphoblastoid cells.

PubMed

Lou, Jianlin; Wang, Yu; Chen, Junqiang; Ju, Li; Yu, Min; Jiang, Zhaoqiang; Feng, Lingfang; Jin, Lingzhi; Zhang, Xing

2015-10-01

Several previous studies highlighted the potential epigenetic effects of Cr(VI), especially on DNA methylation. However, few studies have compared the effects of soluble and particulate chromate on DNA methylation profiles in vitro. Accordingly, the Illumina Infinium Human Methylation 450K BeadChip array was used to analyze DNA methylation profiles of human B lymphoblastoid cells exposed to potassium dichromate or lead chromate, and cell viability was also studied. Array-based DNA methylation analysis showed that the impact of Cr(VI) on DNA methylation was limited: only about 40 differentially methylated CpG sites, with an overlap of 15 CpG sites, were induced by both potassium dichromate and lead chromate. The results of mRNA expression showed that after Cr(VI) treatment, mRNA expression changes of four genes (TBL1Y, FZD5, IKZF2, and KIAA1949) were consistent with their DNA methylation alteration, but DNA methylation changes of the other six genes did not correlate with mRNA expression. In conclusion, both soluble and particulate Cr(VI) could induce a small number of differentially methylated sites in human B lymphoblastoid cells, and the correlations between DNA methylation changes and mRNA expression varied between genes. Copyright © 2015 Elsevier B.V. All rights reserved.
Genomic profiling of 766 cancer-related genes in archived esophageal normal and carcinoma tissues.

PubMed

Chen, Jing; Guo, Liping; Peiffer, Daniel A; Zhou, Lixin; Chan, Owen Tsan Mo; Bibikova, Marina; Wickham-Garcia, Eliza; Lu, Shih-Hsin; Zhan, Qimin; Wang-Rodriguez, Jessica; Jiang, Wei; Fan, Jian-Bing

2008-05-15

We employed the BeadArray™ technology to perform a genetic analysis in 33 formalin-fixed, paraffin-embedded (FFPE) human esophageal carcinomas, mostly squamous-cell-carcinoma (ESCC), and their adjacent normal tissues. A total of 1,432 single nucleotide polymorphisms (SNPs) derived from 766 cancer-related genes were genotyped with partially degraded genomic DNAs isolated from these samples. This directly targeted genomic profiling identified not only previously reported somatic gene amplifications (e.g., CCND1) and deletions (e.g., CDKN2A and CDKN2B) but also novel genomic aberrations. Among these novel targets, the most frequently deleted genomic regions were chromosome 3p (including tumor suppressor genes FANCD2 and CTNNB1) and chromosome 5 (including tumor suppressor gene APC). The most frequently amplified genomic region was chromosome 3q (containing DVL3, MLF1, ABCC5, BCL6, AGTR1 and known oncogenes TNK2, TNFSF10, FGF12). The chromosome 3p deletion and 3q amplification occurred coincidently in nearly all of the affected cases, suggesting a molecular mechanism for the generation of somatic chromosomal aberrations. We also detected significant differences in germline allele frequency between the esophageal cohort of our study and normal control samples from the International HapMap Project for 10 genes (CSF1, KIAA1804, IL2, PMS2, IRF7, FLT3, NTRK2, MAP3K9, ERBB2 and PRKAR1A), suggesting that they might play roles in esophageal cancer susceptibility and/or development. Taken together, our results demonstrated the utility of the BeadArray technology for high-throughput genetic analysis in FFPE tumor tissues and provided a detailed genetic profiling of cancer-related genes in human esophageal cancer. (c) 2008 Wiley-Liss, Inc.
Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines

PubMed Central

Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

2016-01-01

Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours’ biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription–quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes. PMID:29263807
Comprehensive identification of mutations induced by heavy-ion beam irradiation in Arabidopsis thaliana.

PubMed

Hirano, Tomonari; Kazama, Yusuke; Ishii, Kotaro; Ohbu, Sumie; Shirakawa, Yuki; Abe, Tomoko

2015-04-01

Heavy-ion beams are widely used for mutation breeding and molecular biology. Although the mutagenic effects of heavy-ion beam irradiation have been characterized by sequence analysis of some restricted chromosomal regions or loci, there have been no evaluations at the whole-genome level or of the detailed genomic rearrangements in the mutant genomes. In this study, using array comparative genomic hybridization (array-CGH) and resequencing, we comprehensively characterized the mutations in Arabidopsis thaliana genomes irradiated with Ar or Fe ions. We subsequently used this information to investigate the mutagenic effects of the heavy-ion beams. Array-CGH demonstrated that the average numbers of deleted areas per genome were 1.9 and 3.7 following Ar-ion and Fe-ion irradiation, respectively, with deletion sizes ranging from 149 to 602,180 bp; 81% of the deletions were accompanied by genomic rearrangements. To provide a further detailed analysis, the genomes of the mutants induced by Ar-ion beam irradiation were resequenced, and total mutations, including base substitutions, duplications, in/dels, inversions, and translocations, were detected using three algorithms. All three resequenced mutants had genomic rearrangements. Of the 22 DNA fragments that contributed to the rearrangements, 19 fragments were responsible for the intrachromosomal rearrangements, and multiple rearrangements were formed in the localized regions of the chromosomes. The interchromosomal rearrangements were detected in the multiply rearranged regions. These results indicate that the heavy-ion beams led to clustered DNA damage in the chromosome, and that they have great potential to induce complicated intrachromosomal rearrangements. Heavy-ion beams will prove useful as unique mutagens for plant breeding and the establishment of mutant lines. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Integrative genomics identifies molecular alterations that challenge the linear model of melanoma progression.

PubMed

Rose, Amy E; Poliseno, Laura; Wang, Jinhua; Clark, Michael; Pearlman, Alexander; Wang, Guimin; Vega Y Saenz de Miera, Eleazar C; Medicherla, Ratna; Christos, Paul J; Shapiro, Richard; Pavlick, Anna; Darvishian, Farbod; Zavadil, Jiri; Polsky, David; Hernando, Eva; Ostrer, Harry; Osman, Iman

2011-04-01

Superficial spreading melanoma (SSM) and nodular melanoma (NM) are believed to represent sequential phases of linear progression from radial to vertical growth. Several lines of clinical, pathologic, and epidemiologic evidence suggest, however, that SSM and NM might be the result of independent pathways of tumor development. We utilized an integrative genomic approach that combines single nucleotide polymorphism array (6.0; Affymetrix) with gene expression array (U133A 2.0; Affymetrix) to examine molecular differences between SSM and NM. Pathway analysis of the most differentially expressed genes between SSM and NM (N = 114) revealed significant differences related to metabolic processes. We identified 8 genes (DIS3, FGFR1OP, G3BP2, GALNT7, MTAP, SEC23IP, USO1, and ZNF668) in which NM/SSM-specific copy number alterations correlated with differential gene expression (P < 0.05; Spearman's rank). SSM-specific genomic deletions in G3BP2, MTAP, and SEC23IP were independently verified in two external data sets. Forced overexpression of metabolism-related gene MTAP (methylthioadenosine phosphorylase) in SSM resulted in reduced cell growth. The differential expression of another metabolic-related gene, aldehyde dehydrogenase 7A1 (ALDH7A1), was validated at the protein level by using tissue microarrays of human melanoma. In addition, we show that the decreased ALDH7A1 expression in SSM may be the result of epigenetic modifications. Our data reveal recurrent genomic deletions in SSM not present in NM, which challenge the linear model of melanoma progression. Furthermore, our data suggest a role for altered regulation of metabolism-related genes as a possible cause of the different clinical behavior of SSM and NM.
Integrative genomics identifies molecular alterations that challenge the linear model of melanoma progression

PubMed Central

Rose, Amy E.; Poliseno, Laura; Wang, Jinhua; Clark, Michael; Pearlman, Alexander; Wang, Guimin; Vega y Saenz de Miera, Eleazar C.; Medicherla, Ratna; Christos, Paul J.; Shapiro, Richard; Pavlick, Anna; Darvishian, Farbod; Zavadil, Jiri; Polsky, David; Hernando, Eva; Ostrer, Harry; Osman, Iman

2011-01-01

Superficial spreading melanoma (SSM) and nodular melanoma (NM) are believed to represent sequential phases of linear progression from radial to vertical growth. Several lines of clinical, pathological and epidemiologic evidence suggest, however, that SSM and NM might be the result of independent pathways of tumor development. We utilized an integrative genomic approach that combines single nucleotide polymorphism array (SNP 6.0, Affymetrix) with gene expression array (U133A 2.0, Affymetrix) to examine molecular differences between SSM and NM. Pathway analysis of the most differentially expressed genes between SSM and NM (N=114) revealed significant differences related to metabolic processes. We identified 8 genes (DIS3, FGFR1OP, G3BP2, GALNT7, MTAP, SEC23IP, USO1, ZNF668) in which NM/SSM-specific copy number alterations correlated with differential gene expression (P<0.05, Spearman’s rank). SSM-specific genomic deletions in G3BP2, MTAP, and SEC23IP were independently verified in two external data sets. Forced overexpression of metabolism-related gene methylthioadenosine phosphorylase (MTAP) in SSM resulted in reduced cell growth. The differential expression of another metabolic related gene, aldehyde dehydrogenase 7A1 (ALDH7A1), was validated at the protein level using tissue microarrays of human melanoma. In addition, we show that the decreased ALDH7A1 expression in SSM may be the result of epigenetic modifications. Our data reveal recurrent genomic deletions in SSM not present in NM, which challenge the linear model of melanoma progression. Furthermore, our data suggest a role for altered regulation of metabolism-related genes as a possible cause of the different clinical behavior of SSM and NM. PMID:21343389
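The abstract above reports that copy number alterations correlated with differential gene expression under Spearman's rank test. As a rough illustration of what that statistic measures, the following is a minimal sketch that computes Spearman's rho as the Pearson correlation of ranks. The data values and the function names `rank` and `spearman` are hypothetical and are not taken from the study; the P-value calculation is omitted.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical data for one gene across six tumors (not from the study):
# log2 copy-number ratios and expression values.
copy_number = [-0.9, -0.7, -0.1, 0.0, 0.1, 0.3]
expression = [4.1, 4.5, 6.0, 6.2, 5.9, 6.8]
rho = spearman(copy_number, expression)
print(f"Spearman rho = {rho:.3f}")
```

Because the statistic depends only on ranks, it is insensitive to the differing scales of copy-number ratios and expression intensities, which is one plausible reason for choosing it in an integrative analysis of this kind.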
Genomic alterations identified by array comparative genomic hybridization as prognostic markers in tamoxifen-treated estrogen receptor-positive breast cancer

PubMed Central

Han, Wonshik; Han, Mi-Ryung; Kang, Jason Jongho; Bae, Ji-Yeon; Lee, Ji Hyun; Bae, Young Ju; Lee, Jeong Eon; Shin, Hyuk-Jae; Hwang, Ki-Tae; Hwang, Sung-Eun; Kim, Sung-Won; Noh, Dong-Young

2006-01-01

Background: A considerable proportion of estrogen receptor (ER)-positive breast cancer recurs despite tamoxifen treatment, which is a serious problem commonly encountered in clinical practice. We tried to find novel prognostic markers in this subtype of breast cancer. Methods: We performed array comparative genomic hybridization (CGH) with 1,440 human bacterial artificial chromosome (BAC) clones to assess copy number changes in 28 fresh-frozen ER-positive breast cancer tissues. All of the patients included had received at least 1 year of tamoxifen treatment. Nine patients had distant recurrence within 5 years (Recurrence group) of diagnosis and 19 patients were alive without disease at least 5 years after diagnosis (Non-recurrence group). Results: Potential prognostic variables were comparable between the two groups. In an unsupervised clustering analysis, samples from each group were well separated. The most common regions of gain in all samples were 1q32.1, 17q23.3, 8q24.11, 17q12-q21.1, and 8p11.21, and the most common regions of loss were 6q14.1-q16.3, 11q21-q24.3, and 13q13.2-q14.3, as called by CGH-Explorer software. The average frequency of copy number changes was similar between the two groups. The most significant chromosomal alterations found more often in the Recurrence group using two different statistical methods were loss of 11p15.5-p15.4, 1p36.33, 11q13.1, and 11p11.2 (adjusted p values <0.001). In subgroup analysis according to lymph node status, loss of 11p15 and 1p36 were found more often in Recurrence group with borderline significance within the lymph node positive patients (adjusted p = 0.052). Conclusion: Our array CGH analysis with BAC clones could detect various genomic alterations in ER-positive breast cancers, and Recurrence group samples showed a significantly different pattern of DNA copy number changes than did Non-recurrence group samples. PMID:16608533
A new normalizing algorithm for BAC CGH arrays with quality control metrics.

PubMed

Miecznikowski, Jeffrey C; Gaile, Daniel P; Liu, Song; Shepherd, Lori; Nowak, Norma

2011-01-01

The main focus in pin-tip (or print-tip) microarray analysis is determining which probes, genes, or oligonucleotides are differentially expressed. Specifically in array comparative genomic hybridization (aCGH) experiments, researchers search for chromosomal imbalances in the genome. To model this data, scientists apply statistical methods to the structure of the experiment and assume that the data consist of the signal plus random noise. In this paper we propose "SmoothArray", a new method to preprocess comparative genomic hybridization (CGH) bacterial artificial chromosome (BAC) arrays and we show the effects on a cancer dataset. As part of our R software package "aCGHplus," this freely available algorithm removes the variation due to the intensity effects, pin/print-tip, the spatial location on the microarray chip, and the relative location from the well plate. Removal of this variation improves the downstream analysis and subsequent inferences made on the data. Further, we present measures to evaluate the quality of the dataset according to the arrayer pins, 384-well plates, plate rows, and plate columns. We compare our method against competing methods using several metrics to measure the biological signal. With this novel normalization algorithm and quality control measures, the user can improve their inferences on datasets and pinpoint problems that may arise in their BAC aCGH technology.
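To make the idea of removing pin/print-tip variation concrete, here is a deliberately simplified sketch. It is not the SmoothArray algorithm and does not use the aCGHplus package; it only shows per-group median centering, a common baseline in which each clone's log ratio is shifted by the median of the clones printed by the same pin. The data, the pin labels, and the function name `center_by_group` are hypothetical.

```python
from collections import defaultdict
from statistics import median


def center_by_group(log_ratios, groups):
    """Subtract each group's median log ratio from the values in that group."""
    by_group = defaultdict(list)
    for value, group in zip(log_ratios, groups):
        by_group[group].append(value)
    # One offset per group (here: per print-tip).
    offsets = {group: median(values) for group, values in by_group.items()}
    return [value - offsets[group] for value, group in zip(log_ratios, groups)]


# Hypothetical log2 ratios for six BAC clones printed by two pins.
log_ratios = [0.30, 0.25, 0.35, -0.10, -0.20, -0.15]
pins = ["pin1", "pin1", "pin1", "pin2", "pin2", "pin2"]
normalized = center_by_group(log_ratios, pins)
print(normalized)
```

The same function could be reused with plate, plate-row, or plate-column labels as the grouping variable, which mirrors the sources of variation the abstract lists. A real method would also need to model intensity-dependent and spatial effects, which median centering alone does not address.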
Analyses of Genotypes and Phenotypes of Ten Chinese Patients with Wolf-Hirschhorn Syndrome by Multiplex Ligation-dependent Probe Amplification and Array Comparative Genomic Hybridization

PubMed Central

Yang, Wen-Xu; Pan, Hong; Li, Lin; Wu, Hai-Rong; Wang, Song-Tao; Bao, Xin-Hua; Jiang, Yu-Wu; Qi, Yu

2016-01-01

Background: Wolf-Hirschhorn syndrome (WHS) is a contiguous gene syndrome that is typically caused by a deletion of the distal portion of the short arm of chromosome 4. However, there are few reports about the features of Chinese WHS patients. This study aimed to characterize the clinical and molecular cytogenetic features of Chinese WHS patients using the combination of multiplex ligation-dependent probe amplification (MLPA) and array comparative genomic hybridization (array CGH). Methods: Clinical information was collected from ten patients with WHS. Genomic DNA was extracted from the peripheral blood of the patients. The deletions were analyzed by MLPA and array CGH. Results: All patients exhibited the core clinical symptoms of WHS, including severe growth delay, a Greek warrior helmet facial appearance, differing degrees of intellectual disability, and epilepsy or electroencephalogram anomalies. The 4p deletions ranged from 2.62 Mb to 17.25 Mb in size and included LETM1, WHSC1, and FGFR3. Conclusions: The combined use of MLPA and array CGH is an effective and specific means to diagnose WHS and allows for the precise identification of the breakpoints and sizes of deletions. The deletion of genes in the WHS candidate region is closely correlated with the core WHS phenotype. PMID:26960370
Yeast Sub1 and human PC4 are G-quadruplex binding proteins that suppress genome instability at co-transcriptionally formed G4 DNA.

PubMed

Lopez, Christopher R; Singh, Shivani; Hambarde, Shashank; Griffin, Wezley C; Gao, Jun; Chib, Shubeena; Yu, Yang; Ira, Grzegorz; Raney, Kevin D; Kim, Nayun

2017-06-02

G-quadruplex or G4 DNA is a non-B secondary DNA structure consisting of a stacked array of guanine-quartets that can disrupt critical cellular functions such as replication and transcription. When sequences that can adopt non-B structures including G4 DNA are located within actively transcribed genes, the reshaping of DNA topology necessary for the transcription process stimulates secondary structure formation, thereby amplifying the potential for genome instability. Using a reporter assay designed to study G4-induced recombination in the context of an actively transcribed locus in Saccharomyces cerevisiae, we tested whether co-transcriptional activator Sub1, recently identified as a G4-binding factor, contributes to genome maintenance at G4-forming sequences. Our data indicate that, upon Sub1-disruption, genome instability linked to co-transcriptionally formed G4 DNA in Top1-deficient cells is significantly augmented and that its highly conserved DNA binding domain or the human homolog PC4 is sufficient to suppress G4-associated genome instability. We also show that Sub1 interacts specifically with co-transcriptionally formed G4 DNA in vivo and that yeast cells become highly sensitive to G4-stabilizing chemical ligands upon loss of Sub1. Finally, we demonstrate the physical and genetic interaction of Sub1 with the G4-resolving helicase Pif1, suggesting a possible mechanism by which Sub1 suppresses instability at G4 DNA. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

Evaluating imputation algorithms for low-depth genotyping-by-sequencing (GBS) data

USDA-ARS's Scientific Manuscript database

Well-powered genomic studies require genome-wide marker coverage across many individuals. For non-model species with few genomic resources, high-throughput sequencing (HTS) methods, such as Genotyping-By-Sequencing (GBS), offer an inexpensive alternative to array-based genotyping. Although affordabl...
GLINT: a user-friendly toolset for the analysis of high-throughput DNA-methylation array data.

PubMed

Rahmani, Elior; Yedidim, Reut; Shenhav, Liat; Schweiger, Regev; Weissbrod, Omer; Zaitlen, Noah; Halperin, Eran

2017-06-15

GLINT is a user-friendly command-line toolset for fast analysis of genome-wide DNA methylation data generated using the Illumina human methylation arrays. GLINT, which does not require any programming proficiency, allows easy execution of an Epigenome-Wide Association Study analysis pipeline under different models while accounting for known confounders in methylation data. GLINT is command-line software, freely available at https://github.com/cozygene/glint/releases. It requires Python 2.7 and several freely available Python packages. Further information and documentation as well as a quick start tutorial are available at http://glint-epigenetics.readthedocs.io. Contact: elior.rahmani@gmail.com or ehalperin@cs.ucla.edu. © The Author 2017. Published by Oxford University Press. All rights reserved.
Development of a 690K SNP array in catfish and its application for genetic mapping and validation of the reference genome sequence

USDA-ARS's Scientific Manuscript database

Single nucleotide polymorphisms (SNPs) are capable of providing the highest level of genome coverage for genomic and genetic analysis because of their abundance and relatively even distribution in the genome. Such a capacity, however, cannot be achieved without an efficient genotyping platform such ...
Optimization of oligonucleotide arrays and RNA amplification protocols for analysis of transcript structure and alternative splicing.

PubMed

Castle, John; Garrett-Engele, Phil; Armour, Christopher D; Duenwald, Sven J; Loerch, Patrick M; Meyer, Michael R; Schadt, Eric E; Stoughton, Roland; Parrish, Mark L; Shoemaker, Daniel D; Johnson, Jason M

2003-01-01

Microarrays offer a high-resolution means for monitoring pre-mRNA splicing on a genomic scale. We have developed a novel, unbiased amplification protocol that permits labeling of entire transcripts. Also, hybridization conditions, probe characteristics, and analysis algorithms were optimized for detection of exons, exon-intron edges, and exon junctions. These optimized protocols can be used to detect small variations and isoform mixtures, map the tissue specificity of known human alternative isoforms, and provide a robust, scalable platform for high-throughput discovery of alternative splicing.
Optimization of oligonucleotide arrays and RNA amplification protocols for analysis of transcript structure and alternative splicing

PubMed Central

Castle, John; Garrett-Engele, Phil; Armour, Christopher D; Duenwald, Sven J; Loerch, Patrick M; Meyer, Michael R; Schadt, Eric E; Stoughton, Roland; Parrish, Mark L; Shoemaker, Daniel D; Johnson, Jason M

2003-01-01

Microarrays offer a high-resolution means for monitoring pre-mRNA splicing on a genomic scale. We have developed a novel, unbiased amplification protocol that permits labeling of entire transcripts. Also, hybridization conditions, probe characteristics, and analysis algorithms were optimized for detection of exons, exon-intron edges, and exon junctions. These optimized protocols can be used to detect small variations and isoform mixtures, map the tissue specificity of known human alternative isoforms, and provide a robust, scalable platform for high-throughput discovery of alternative splicing. PMID:14519201
The Pathogen-Host Interactions database (PHI-base): additions and future developments

PubMed Central

Urban, Martin; Pant, Rashmi; Raghunath, Arathi; Irvine, Alistair G.; Pedro, Helder; Hammond-Kosack, Kim E.

2015-01-01

Rapidly evolving pathogens cause a diverse array of diseases and epidemics that threaten crop yield, food security as well as human, animal and ecosystem health. To combat infection greater comparative knowledge is required on the pathogenic process in multiple species. The Pathogen-Host Interactions database (PHI-base) catalogues experimentally verified pathogenicity, virulence and effector genes from bacterial, fungal and protist pathogens. Mutant phenotypes are associated with gene information. The included pathogens infect a wide range of hosts including humans, animals, plants, insects, fish and other fungi. The current version, PHI-base 3.6, available at http://www.phi-base.org, stores information on 2875 genes, 4102 interactions, 110 host species, 160 pathogenic species (103 plant, 3 fungal and 54 animal infecting species) and 181 diseases drawn from 1243 references. Phenotypic and gene function information has been obtained by manual curation of the peer-reviewed literature. A controlled vocabulary consisting of nine high-level phenotype terms permits comparisons and data analysis across the taxonomic space. PHI-base phenotypes were mapped via their associated gene information to reference genomes available in Ensembl Genomes. Virulence genes and hotspots can be visualized directly in genome browsers. Future plans for PHI-base include development of tools facilitating community-led curation and inclusion of the corresponding host target(s). PMID:25414340
Lesion complexity drives age related cancer susceptibility in human mammary epithelial cells

DOE PAGES

Sridharan, Deepa M.; Enerio, Shiena; Stampfer, Martha M.; ...

2017-02-28

Exposures to various DNA damaging agents can deregulate a wide array of critical mechanisms that maintain genome integrity. It is unclear how these processes are impacted by one's age at the time of exposure and the complexity of the DNA lesion. To clarify this, we employed radiation as a tool to generate simple and complex lesions in normal primary human mammary epithelial cells derived from women of various ages. We hypothesized that genomic instability in the progeny of older cells exposed to complex damages will be exacerbated by age-associated deterioration in function and accentuate age-related cancer predisposition. Centrosome aberrations and changes in stem cell numbers were examined to assess cancer susceptibility. Our data show that the frequency of centrosome aberrations proportionately increases with age following complex damage causing exposures. However, a dose-dependent increase in stem cell numbers was independent of both age and the nature of the insult. Phospho-protein signatures provide mechanistic clues to signaling networks implicated in these effects. Together these studies suggest that complex damage can threaten the genome stability of the stem cell population in older people. Propagation of this instability is subject to influence by the microenvironment and will ultimately define cancer risk in the older population.
Safety paradigm: genetic evaluation of therapeutic grade human embryonic stem cells.

PubMed

Stephenson, Emma; Ogilvie, Caroline Mackie; Patel, Heema; Cornwell, Glenda; Jacquet, Laureen; Kadeva, Neli; Braude, Peter; Ilic, Dusko

2010-12-06

The use of stem cells for regenerative medicine has captured the imagination of the public, with media attention contributing to rising expectations of clinical benefits. Human embryonic stem cells (hESCs) are the best model for capital investment in stem cell therapy and there is a clear need for their robust genetic characterization before scaling-up cell expansion for that purpose. We have to be certain that the genome of the starting material is stable and normal, but the limited resolution of conventional karyotyping is unable to give us such assurance. Advanced molecular cytogenetic technologies such as array comparative genomic hybridization for identifying chromosomal imbalances, and single nucleotide polymorphism analysis for identifying ethnic background and loss of heterozygosity should be introduced as obligatory diagnostic tests for each newly derived hESC line before it is deposited in national stem cell banks. If this new quality standard becomes a requirement, as we are proposing here, it would facilitate and accelerate the banking process, since end-users would be able to select the most appropriate line for their particular application, thus improving efficiency and streamlining the route to manufacturing therapeutics. The pharmaceutical industry, which may use hESC-derived cells for drug screening, should not ignore their genomic profile as this may risk misinterpretation of results and significant waste of resources.
Bilateral Wilms tumor with TP53-related anaplasia.

PubMed

Popov, Sergey D; Vujanic, Gordan M; Sebire, Neil J; Chagtai, Tasnim; Williams, Richard; Vaidya, Sucheta; Pritchard-Jones, Kathy

2013-01-01

Wilms tumor (WT) with diffuse anaplasia has an unfavorable prognosis and is often (>70%) associated with mutations in the TP53 gene. Although most WTs are unilateral, 5-10% are bilateral, and they are almost always present with nephrogenic rests. The latter are considered a precursor of WT. Two cases of bilateral WTs with nephroblastomatosis, in which anaplastic changes were detected over a period of time, were analyzed using clinical, radiological, histopathological, and molecular-genetic data. TP53 was analyzed by direct sequencing of its full coding sequence and intron-exon boundaries in 11 fragments. DNA was extracted from paraffin-embedded or frozen specimens. High-resolution genomic copy number profiling was carried out by UCL Genomics on the Affymetrix Human Mapping 250K Nsp or Genome-Wide Human SNP Array 6.0 platform. Both cases demonstrated a strong association between the appearance of anaplastic clones and TP53 mutations. Synchronous ganglioneuroma was diagnosed in one case. Our cases are unique as they represent a long disease history and demonstrate the difficulties in managing rare cases of bilateral WT with anaplasia. These cases also emphasize the practical importance of modern molecular-genetic techniques and their clinical application. Moreover, they highlight the issue of the adequate sampling needed in order to gather comprehensive, efficient, and sufficient information about genetic events in a single tumor.
Lesion complexity drives age related cancer susceptibility in human mammary epithelial cells

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sridharan, Deepa M.; Enerio, Shiena; Stampfer, Martha M.

Exposures to various DNA damaging agents can deregulate a wide array of critical mechanisms that maintain genome integrity. It is unclear how these processes are impacted by one's age at the time of exposure and the complexity of the DNA lesion. To clarify this, we employed radiation as a tool to generate simple and complex lesions in normal primary human mammary epithelial cells derived from women of various ages. We hypothesized that genomic instability in the progeny of older cells exposed to complex damages will be exacerbated by age-associated deterioration in function and accentuate age-related cancer predisposition. Centrosome aberrations and changes in stem cell numbers were examined to assess cancer susceptibility. Our data show that the frequency of centrosome aberrations proportionately increases with age following complex damage causing exposures. However, a dose-dependent increase in stem cell numbers was independent of both age and the nature of the insult. Phospho-protein signatures provide mechanistic clues to signaling networks implicated in these effects. Together these studies suggest that complex damage can threaten the genome stability of the stem cell population in older people. Propagation of this instability is subject to influence by the microenvironment and will ultimately define cancer risk in the older population.
An integrated genomics analysis of epigenetic subtypes in human breast tumors links DNA methylation patterns to chromatin states in normal mammary cells.

PubMed

Holm, Karolina; Staaf, Johan; Lauss, Martin; Aine, Mattias; Lindgren, David; Bendahl, Pär-Ola; Vallon-Christersson, Johan; Barkardottir, Rosa Bjork; Höglund, Mattias; Borg, Åke; Jönsson, Göran; Ringnér, Markus

2016-02-29

Aberrant DNA methylation is frequently observed in breast cancer. However, the relationship between methylation patterns and the heterogeneity of breast cancer has not been comprehensively characterized. Whole-genome DNA methylation analysis using Illumina Infinium HumanMethylation450 BeadChip arrays was performed on 188 human breast tumors. Unsupervised bootstrap consensus clustering was performed to identify DNA methylation epigenetic subgroups (epitypes). The Cancer Genome Atlas data, including methylation profiles of 669 human breast tumors, was used for validation. The identified epitypes were characterized by integration with publicly available genome-wide data, including gene expression levels, DNA copy numbers, whole-exome sequencing data, and chromatin states. We identified seven breast cancer epitypes. One epitype was distinctly associated with basal-like tumors and with BRCA1 mutations, one epitype contained a subset of ERBB2-amplified tumors characterized by multiple additional amplifications and the most complex genomes, and one epitype displayed a methylation profile similar to normal epithelial cells. Luminal tumors were stratified into the remaining four epitypes, with differences in promoter hypermethylation, global hypomethylation, proliferative rates, and genomic instability. Specific hyper- and hypomethylation across the basal-like epitype was rare. However, we observed that the candidate genomic instability drivers BRCA1 and HORMAD1 displayed aberrant methylation linked to gene expression levels in some basal-like tumors. Hypomethylation in luminal tumors was associated with DNA repeats and subtelomeric regions. We observed two dominant patterns of aberrant methylation in breast cancer. One pattern, constitutively methylated in both basal-like and luminal breast cancer, was linked to genes with promoters in a Polycomb-repressed state in normal epithelial cells and displayed no correlation with gene expression levels. The second pattern correlated with gene expression levels and was associated with methylation in luminal tumors and genes with active promoters in normal epithelial cells. Our results suggest that hypermethylation patterns across basal-like breast cancer may have limited influence on tumor progression and instead reflect the repressed chromatin state of the tissue of origin. On the contrary, hypermethylation patterns specific to luminal breast cancer influence gene expression, may contribute to tumor progression, and may present an actionable epigenetic alteration in a subset of luminal breast cancers.
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration

PubMed Central

Thorvaldsdóttir, Helga; Robinson, James T.; Mesirov, Jill P.

2013-01-01

Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today’s sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license. PMID:22517427
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

PubMed

Thorvaldsdóttir, Helga; Robinson, James T; Mesirov, Jill P

2013-03-01

Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today's sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license.
Automated array-based genomic profiling in chronic lymphocytic leukemia: Development of a clinical tool and discovery of recurrent genomic alterations

PubMed Central

Schwaenen, Carsten; Nessling, Michelle; Wessendorf, Swen; Salvi, Tatjana; Wrobel, Gunnar; Radlwimmer, Bernhard; Kestler, Hans A.; Haslinger, Christian; Stilgenbauer, Stephan; Döhner, Hartmut; Bentz, Martin; Lichter, Peter

2004-01-01

B cell chronic lymphocytic leukemia (B-CLL) is characterized by a highly variable clinical course. Recurrent chromosomal imbalances provide significant prognostic markers. Risk-adapted therapy based on genomic alterations has become an option that is currently being tested in clinical trials. To supply a robust tool for such large scale studies, we developed a comprehensive DNA microarray dedicated to the automated analysis of recurrent genomic imbalances in B-CLL by array-based comparative genomic hybridization (matrix–CGH). Validation of this chip in a series of 106 B-CLL cases revealed a high specificity and sensitivity that fulfils the criteria for application in clinical oncology. This chip is immediately applicable within clinical B-CLL treatment trials that evaluate whether B-CLL cases with distinct chromosomal abnormalities should be treated with chemotherapy of different intensities and/or stem cell transplantation. Through the control set of DNA fragments equally distributed over the genome, recurrent genomic imbalances were discovered: trisomy of chromosome 19 and gain of the MYCN oncogene correlating with an elevation of MYCN mRNA expression. PMID:14730057
Creation of a Human Secretome: A Novel Composite Library of Human Secreted Proteins: Validation Using Ovarian Cancer Gene Expression Data and a Virtual Secretome Array.

PubMed

Vathipadiekal, Vinod; Wang, Victoria; Wei, Wei; Waldron, Levi; Drapkin, Ronny; Gillette, Michael; Skates, Steven; Birrer, Michael

2015-11-01

To generate a comprehensive "Secretome" of proteins potentially found in the blood and derive a virtual Affymetrix array. To validate the utility of this database for the discovery of novel serum-based biomarkers using ovarian cancer transcriptomic data. The secretome was constructed by aggregating the data from databases of known secreted proteins, transmembrane or membrane proteins, signal peptides, G-protein coupled receptors, or proteins existing in the extracellular region, and the virtual array was generated by mapping them to Affymetrix probeset identifiers. Whole-genome microarray data from ovarian cancer, normal ovarian surface epithelium, and fallopian tube epithelium were used to identify transcripts upregulated in ovarian cancer. We established the secretome from eight public databases and a virtual array consisting of 16,521 Affymetrix U133 Plus 2.0 probesets. Using ovarian cancer transcriptomic data, we identified candidate blood-based biomarkers for ovarian cancer and performed bioinformatic validation by demonstrating rediscovery of known biomarkers including CA125 and HE4. Two novel top biomarkers (FGF18 and GPR172A) were validated in serum samples from an independent patient cohort. We present the secretome, comprising the most comprehensive resource available for protein products that are potentially found in the blood. The associated virtual array can be used to translate gene-expression data into cancer biomarker discovery. A list of blood-based biomarkers for ovarian cancer detection is reported and includes CA125 and HE4. FGF18 and GPR172A were identified and validated by ELISA as being differentially expressed in the serum of ovarian cancer patients compared with controls. ©2015 American Association for Cancer Research.
Comparative genomics of wild type yeast strains unveils important genome diversity

PubMed Central

Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

2008-01-01

Background: Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results: In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion: We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlight the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome. PMID:18983662
Function of the CRISPR-Cas System of the Human Pathogen Clostridium difficile

PubMed Central

Boudry, Pierre; Semenova, Ekaterina; Monot, Marc; Datsenko, Kirill A.; Lopatina, Anna; Sekulovic, Ognjen; Ospina-Bedoya, Maicol; Fortier, Louis-Charles; Severinov, Konstantin; Dupuy, Bruno

2015-01-01

Clostridium difficile is the most frequent cause of nosocomial diarrhea worldwide. As an enteropathogen, C. difficile must be exposed to multiple exogenous genetic elements in bacteriophage-rich gut communities. CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) systems allow bacteria to adapt to foreign genetic invaders. Our recent data revealed active expression and processing of CRISPR RNAs from multiple type I-B CRISPR arrays in C. difficile reference strain 630. Here, we demonstrate active expression of CRISPR arrays in strain R20291, an epidemic C. difficile strain. Through genome sequencing and host range analysis of several new C. difficile phages and plasmid conjugation experiments, we provide evidence of a defensive function of the CRISPR-Cas system in both C. difficile strains. We further demonstrate that C. difficile Cas proteins are capable of interference in a heterologous host, Escherichia coli. These data set the stage for mechanistic and physiological analyses of CRISPR-Cas-mediated interactions of this important global human pathogen with its genetic parasites. PMID:26330515
Integration of Murine and Human Studies for Mapping Periodontitis Susceptibility.

PubMed

Nashef, A; Qabaja, R; Salaymeh, Y; Botzman, M; Munz, M; Dommisch, H; Krone, B; Hoffmann, P; Wellmann, J; Laudes, M; Berger, K; Kocher, T; Loos, B; van der Velde, N; Uitterlinden, A G; de Groot, L C P G M; Franke, A; Offenbacher, S; Lieb, W; Divaris, K; Mott, R; Gat-Viks, I; Wiess, E; Schaefer, A; Iraqi, F A; Haddad, Y H

2018-05-01

Periodontitis is one of the most common inflammatory human diseases with a strong genetic component. Due to the limited sample size of available periodontitis cohorts and the underlying trait heterogeneity, genome-wide association studies (GWASs) of chronic periodontitis (CP) have largely been unsuccessful in identifying common susceptibility factors. A combination of quantitative trait loci (QTL) mapping in mice with association studies in humans has the potential to discover novel risk loci. To this end, we assessed alveolar bone loss in response to experimental periodontal infection in 25 lines (286 mice) from the Collaborative Cross (CC) mouse population using micro-computed tomography (µCT) analysis. The orthologous human chromosomal regions of the significant QTL were analyzed for association using imputed genotype data (OmniExpress BeadChip arrays) derived from case-control samples of aggressive periodontitis (AgP; 896 cases, 7,104 controls) and chronic periodontitis (CP; 2,746 cases, 1,864 controls) of northwest European and European American descent, respectively. In the mouse genome, QTL mapping revealed 2 significant loci (-log P = 5.3; false discovery rate = 0.06) on chromosomes 1 (Perio3) and 14 (Perio4). The mapping resolution ranged from ~1.5 to 3 Mb. Perio3 overlaps with a previously reported QTL associated with residual bone volume in F2 cross and includes the murine gene Ccdc121. Its human orthologue previously showed a nominally significant association with CP in humans. Use of variation data from the genomes of the CC founder strains further refined the QTL and suggested 7 candidate genes (CAPN8, DUSP23, PCDH17, SNORA17, PCDH9, LECT1, and LECT2). We found no evidence of association of these candidates with the human orthologues. In conclusion, the CC populations enabled mapping of confined QTL that confer susceptibility to alveolar bone loss in mice; larger human phenotype-genotype samples and additional expression data from gingival tissues are likely required to identify true positive signals.
16q24.1 microdeletion in a premature newborn: usefulness of array-based comparative genomic hybridization in persistent pulmonary hypertension of the newborn.

PubMed

Zufferey, Flore; Martinet, Danielle; Osterheld, Maria-Chiara; Niel-Bütschi, Florence; Giannoni, Eric; Schmutz, Nathalie Besuchet; Xia, Zhilian; Beckmann, Jacques S; Shaw-Smith, Charles; Stankiewicz, Pawel; Langston, Claire; Fellmann, Florence

2011-11-01

Report of a 16q24.1 deletion in a premature newborn, demonstrating the usefulness of array-based comparative genomic hybridization in persistent pulmonary hypertension of the newborn and multiple congenital malformations. Descriptive case report. Genetic department and neonatal intensive care unit of a tertiary care children's hospital. None. We report the case of a preterm male infant, born at 26 wks of gestation. A cardiac malformation and bilateral hydronephrosis were diagnosed at 19 wks of gestation. Karyotype analysis was normal, and a 22q11.2 microdeletion was excluded by fluorescence in situ hybridization analysis. A cesarean section was performed due to fetal distress. The patient developed persistent pulmonary hypertension unresponsive to mechanical ventilation and nitric oxide treatment and expired at 16 hrs of life. An autopsy revealed partial atrioventricular canal malformation and showed bilateral dilation of the renal pelvocaliceal system with bilateral ureteral stenosis and annular pancreas. Array-based comparative genomic hybridization analysis (Agilent oligoNT 44K, Agilent Technologies, Santa Clara, CA) showed an interstitial microdeletion encompassing the forkhead box gene cluster in 16q24.1. Review of the pulmonary microscopic examination showed the characteristic features of alveolar capillary dysplasia with misalignment of pulmonary veins. Some features were less prominent due to the gestational age. Our review of the literature shows that alveolar capillary dysplasia with misalignment of pulmonary veins is rare but probably underreported. Prematurity is not a usual presentation, and histologic features are difficult to interpret. In our case, array-based comparative genomic hybridization revealed a 16q24.1 deletion, leading to the final diagnosis of alveolar capillary dysplasia with misalignment of pulmonary veins. 
It emphasizes the usefulness of array-based comparative genomic hybridization analysis as a diagnostic tool with implications for both prognosis and management decisions in newborns with refractory persistent pulmonary hypertension and multiple congenital malformations.
Quantitative genome-wide methylation analysis of high-grade non-muscle invasive bladder cancer

PubMed Central

Kitchen, Mark O.; Bryan, Richard T.; Emes, Richard D.; Glossop, John R.; Luscombe, Christopher; Cheng, K. K.; Zeegers, Maurice P.; James, Nicholas D.; Devall, Adam J.; Mein, Charles A.; Gommersall, Lyndon; Fryer, Anthony A.; Farrell, William E.

2016-01-01

High-grade non-muscle invasive bladder cancer (HG-NMIBC) is a clinically unpredictable disease with greater risks of recurrence and progression relative to its low-intermediate-grade counterparts. The molecular events, including those affecting the epigenome, that characterize this disease entity in the context of tumor development, recurrence, and progression, are incompletely understood. We therefore interrogated genome-wide DNA methylation using HumanMethylation450 BeadChip arrays in 21 primary HG-NMIBC tumors relative to normal bladder controls. Using strict inclusion-exclusion criteria we identified 1,057 hypermethylated CpGs within gene promoter-associated CpG islands, representing 256 genes. We validated the array data by bisulphite pyrosequencing and examined 25 array-identified candidate genes in an independent cohort of 30 HG-NMIBC and 18 low-intermediate-grade NMIBC. These analyses revealed significantly higher methylation frequencies in high-grade tumors relative to low-intermediate-grade tumors for the ATP5G2, IRX1 and VAX2 genes (P<0.05), and similarly significant increases in mean levels of methylation in high-grade tumors for the ATP5G2, VAX2, INSRR, PRDM14, VSX1, TFAP2b, PRRX1, and HIST1H4F genes (P<0.05). Although inappropriate promoter methylation was not invariantly associated with reduced transcript expression, a significant association was apparent for the ARHGEF4, PON3, STAT5a, and VAX2 gene transcripts (P<0.05). Herein, we present the first genome-wide DNA methylation analysis in a unique HG-NMIBC cohort, showing extensive and discrete methylation changes relative to normal bladder and low-intermediate-grade tumors. The genes we identified hold significant potential as targets for novel therapeutic intervention, either alone or in combination with more conventional therapeutic options, in the treatment of this clinically unpredictable disease. PMID:26929985

Performance of genotype imputation for low frequency and rare variants from the 1000 genomes.

PubMed

Zheng, Hou-Feng; Rong, Jing-Jing; Liu, Ming; Han, Fang; Zhang, Xing-Wei; Richards, J Brent; Wang, Li

2015-01-01

Genotype imputation is now routinely applied in genome-wide association studies (GWAS) and meta-analyses. However, most imputations have been run using HapMap samples as reference, and imputation of low frequency and rare variants (minor allele frequency (MAF) < 5%) has not been systematically assessed. With the emergence of next-generation sequencing, large reference panels (such as the 1000 Genomes panel) are available to facilitate imputation of these variants. Therefore, in order to estimate the performance of low frequency and rare variant imputation, we imputed 153 individuals, each of whom had 3 different genotype array data sets (317k, 610k and 1 million SNPs), to three different reference panels: the 1000 Genomes pilot March 2010 release (1KGpilot), the 1000 Genomes interim August 2010 release (1KGinterim), and the 1000 Genomes phase1 November 2010 and May 2011 release (1KGphase1), using IMPUTE version 2. The differences between these three releases of the 1000 Genomes data are the sample size, ancestry diversity, number of variants and their frequency spectrum. We found that both reference panel and GWAS chip density affect the imputation of low frequency and rare variants. 1KGphase1 outperformed the other 2 panels, with a higher concordance rate, a higher proportion of well-imputed variants (info>0.4) and a higher mean info score in each MAF bin. Similarly, the 1M chip array outperformed 610K and 317K. However, for very rare variants (MAF ≤ 0.3%), only 0-1% of the variants were well imputed. We conclude that the imputation of low frequency and rare variants improves with larger reference panels and higher density of genome-wide genotyping arrays. Yet, despite a large reference panel size and dense genotyping density, very rare variants remain difficult to impute.
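The per-bin summary described in this abstract (proportion of well-imputed variants and mean info score in each MAF bin) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the bin edges, the function name `summarize_by_maf_bin`, and the input layout of (MAF, info) pairs are assumptions made for the example; only the info>0.4 cut-off comes from the abstract.

```python
from bisect import bisect_left

# Hypothetical upper edges of the MAF bins (as fractions), for illustration only.
MAF_BIN_EDGES = [0.003, 0.01, 0.05]
INFO_THRESHOLD = 0.4  # "well-imputed" cut-off quoted in the abstract


def summarize_by_maf_bin(variants, edges=MAF_BIN_EDGES, threshold=INFO_THRESHOLD):
    """Return {bin_index: (n_variants, fraction_well_imputed, mean_info)}.

    `variants` is an iterable of (maf, info) pairs. Bin i holds variants with
    edges[i-1] < maf <= edges[i]; variants above the last edge are skipped.
    """
    sums = {}
    for maf, info in variants:
        i = bisect_left(edges, maf)
        if i == len(edges):
            continue  # common variant, outside the low-frequency range
        n, well, total = sums.get(i, (0, 0, 0.0))
        sums[i] = (n + 1, well + (info > threshold), total + info)
    return {i: (n, well / n, total / n) for i, (n, well, total) in sums.items()}
```

Running the same summary once per reference panel or per array density would give the kind of side-by-side comparison the abstract reports.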
FULL-GENOME ANALYSIS OF ALTERNATIVE SPLICING IN MOUSE LIVER AFTER HEPATOTOXICANT EXPOSURE

EPA Science Inventory

Alternative splicing plays a role in determining gene function and protein diversity. We have employed whole genome exon profiling using Affymetrix Mouse Exon 1.0 ST arrays to understand the significance of alternative splicing on a genome-wide scale in response to multiple toxic...
Microarray-Based Analysis of Subnanogram Quantities of Microbial Community DNAs by Using Whole-Community Genome Amplification

PubMed Central

Wu, Liyou; Liu, Xueduan; Schadt, Christopher W.; Zhou, Jizhong

2006-01-01

Microarray technology provides the opportunity to identify thousands of microbial genes or populations simultaneously, but low microbial biomass often prevents application of this technology to many natural microbial communities. We developed a whole-community genome amplification-assisted microarray detection approach based on multiple displacement amplification. The representativeness of amplification was evaluated using several types of microarrays and quantitative indexes. Representative detection of individual genes or genomes was obtained with 1 to 100 ng DNA from individual or mixed genomes, in equal or unequal abundance, and with 1 to 500 ng community DNAs from groundwater. Lower concentrations of DNA (as low as 10 fg) could be detected, but the lower template concentrations affected the representativeness of amplification. Robust quantitative detection was also observed by significant linear relationships between signal intensities and initial DNA concentrations ranging from (i) 0.04 to 125 ng (r² = 0.65 to 0.99) for DNA from pure cultures as detected by whole-genome open reading frame arrays, (ii) 0.1 to 1,000 ng (r² = 0.91) for genomic DNA using community genome arrays, and (iii) 0.01 to 250 ng (r² = 0.96 to 0.98) for community DNAs from ethanol-amended groundwater using 50-mer functional gene arrays. This method allowed us to investigate the oligotrophic microbial communities in groundwater contaminated with uranium and other metals. The results indicated that microorganisms containing genes involved in contaminant degradation and immobilization are present in these communities, that their spatial distribution is heterogeneous, and that microbial diversity is greatly reduced in the highly contaminated environment. PMID:16820490
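The coefficients of determination quoted in this abstract summarize how well a straight line relates signal intensity to initial DNA concentration. A minimal sketch of that calculation is given below, assuming paired lists of concentrations and intensities; the function name `r_squared` is hypothetical, and whether the authors fitted raw or log-transformed values is not stated in the abstract.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    For a simple least-squares line y = a + b*x this equals the
    coefficient of determination of the fit.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined when either variable is constant")
    return (sxy * sxy) / (sxx * syy)
```

A value near 1 indicates that intensity tracks concentration almost linearly over the tested range; lower values, such as the 0.65 at the bottom of the first range, indicate a noisier relationship.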
The human-induced pluripotent stem cell initiative-data resources for cellular genetics.

PubMed

Streeter, Ian; Harrison, Peter W; Faulconbridge, Adam; Flicek, Paul; Parkinson, Helen; Clarke, Laura

2017-01-04

The Human Induced Pluripotent Stem Cell Initiative (HipSci) is establishing a large catalogue of human iPSC lines, arguably the most well characterized collection to date. The HipSci portal enables researchers to choose the right cell line for their experiment, and makes HipSci's rich catalogue of assay data easy to discover and reuse. Each cell line has genomic, transcriptomic, proteomic and cellular phenotyping data. Data are deposited in the appropriate EMBL-EBI archives, including the European Nucleotide Archive (ENA), European Genome-phenome Archive (EGA), ArrayExpress and PRoteomics IDEntifications (PRIDE) databases. The project will make 500 cell lines from healthy individuals, and from 150 patients with rare genetic diseases; these will be available through the European Collection of Authenticated Cell Cultures (ECACC). As of August 2016, 238 cell lines are available for purchase. Project data is presented through the HipSci data portal (http://www.hipsci.org/lines) and is downloadable from the associated FTP site (ftp://ftp.hipsci.ebi.ac.uk/vol1/ftp). The data portal presents a summary matrix of the HipSci cell lines, showing available data types. Each line has its own page containing descriptive metadata, quality information, and links to archived assay data. Analysis results are also available in a Track Hub, allowing visualization in the context of public genomic annotations (http://www.hipsci.org/data/trackhubs). © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
'FloraArray' for screening of specific DNA probes representing the characteristics of a certain microbial community.

PubMed

Yokoi, Takahide; Kaku, Yoshiko; Suzuki, Hiroyuki; Ohta, Masayuki; Ikuta, Hajime; Isaka, Kazuichi; Sumino, Tatsuo; Wagatsuma, Masako

2007-08-01

To investigate uncharacterized microbial communities, a custom DNA microarray named 'FloraArray' was developed for screening specific probes that would represent the characteristics of a microbial community. The array was prepared by spotting 2000 plasmid DNAs from a genomic shotgun library of a sludge sample on a DNA microarray. By comparative hybridization of the array with two different samples of genomic DNA, one from the activated sludge and the other from a nonactivated sludge sample of an anaerobic ammonium oxidation (anammox) bacterial community, specific spots were visualized as a definite fluctuating profile in an MA (differential intensity ratio vs. spot intensity) plot. About 300 spots of the array accounted for the candidate probes to represent anammox reaction of the activated sludge. After sequence analysis of the probes and examination of the results of blastn searches against the reported anammox reference sequence, complete matches were found for 161 probes (58.3%) and >90% matches were found for 242 probes (87.1%). These results demonstrate that 'FloraArray' could be a useful tool for screening specific DNA molecules of unknown microbial communities.
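The MA plot mentioned in this abstract places each spot at M (the log ratio of the two sample intensities) against A (the mean log intensity). A minimal sketch using the standard definitions follows; the function names and the |M| cut-off of 1.0 are assumptions for illustration, since the abstract does not state how the roughly 300 candidate spots were selected.

```python
import math


def ma_values(intensity_a, intensity_b):
    """Return (M, A) for one spot from two positive intensities.

    M = log2(a) - log2(b)          (differential intensity ratio)
    A = (log2(a) + log2(b)) / 2    (average spot intensity)
    """
    log_a = math.log2(intensity_a)
    log_b = math.log2(intensity_b)
    return log_a - log_b, 0.5 * (log_a + log_b)


def candidate_spots(spots, min_abs_m=1.0):
    """Return the ids of spots whose |M| is at least `min_abs_m`.

    `spots` maps a spot id to (intensity_in_sample_a, intensity_in_sample_b).
    The threshold is a hypothetical choice, not the one used in the study.
    """
    return [
        spot_id
        for spot_id, (a, b) in spots.items()
        if abs(ma_values(a, b)[0]) >= min_abs_m
    ]
```

Spots far from M = 0 are the ones whose hybridization signal differs most between the two sludge samples, which is the property the screening relies on.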
aCGH Local Copy Number Aberrations Associated with Overall Copy Number Genomic Instability in Colorectal Cancer: Coordinate Involvement of the Regions Including BCR and ABL

PubMed Central

Bartos, Jeremy D.; Gaile, Daniel P.; McQuaid, Devin E.; Conroy, Jeffrey M.; Darbary, Huferesh; Nowak, Norma J.; Block, Annemarie; Petrelli, Nicholas J.; Mittelman, Arnold; Stoler, Daniel L.; Anderson, Garth R.

2007-01-01

In order to identify small regions of the genome whose specific copy number alteration is associated with high genomic instability in the form of overall genome-wide copy number aberrations, we have analyzed array-based comparative genomic hybridization (aCGH) data from 33 sporadic colorectal carcinomas. Copy number changes of a small number of specific regions were significantly correlated with elevated overall amplifications and deletions scattered throughout the entire genome. One significant region at 9q34 includes the c-ABL gene. Another region spanning 22q11–13 includes the breakpoint cluster region (BCR) of the Philadelphia chromosome. Coordinate 22q11–13 alterations were observed in nine of eleven tumors with the 9q34 alteration. Additional regions on 1q and 14q were associated with overall genome-wide copy number changes, while copy number aberrations on chromosome 7p, 7q, and 13q21.1–31.3 were found associated with this instability only in tumors from patients with a smoking history. Our analysis demonstrates there are a small number of regions of the genome where gain or loss is commonly associated with a tumor’s overall level of copy number aberrations. Our finding that BCR and ABL are located within two of the instability-associated regions, and that these two regions are involved coordinately, suggests a system akin to the BCR-ABL translocation of CML may be involved in genomic instability in about one-third of human colorectal carcinomas. PMID:17196995
The 1000 Genomes Project: new opportunities for research and social challenges

PubMed Central

2010-01-01

The 1000 Genomes Project, an international collaboration, is sequencing the whole genome of approximately 2,000 individuals from different worldwide populations. The central goal of this project is to describe most of the genetic variation that occurs at a population frequency greater than 1%. The results of this project will allow scientists to identify genetic variation at an unprecedented degree of resolution and will also help improve the imputation methods for determining unobserved genetic variants that are not represented on current genotyping arrays. By identifying novel or rare functional genetic variants, researchers will be able to pinpoint disease-causing genes in genomic regions initially identified by association studies. This level of detailed sequence information will also improve our knowledge of the evolutionary processes and the genomic patterns that have shaped the human species as we know it today. The new data will also lay the foundation for future clinical applications, such as prediction of disease susceptibility and drug response. However, the forthcoming availability of whole genome sequences at affordable prices will raise ethical concerns and pose potential threats to individual privacy. Nevertheless, we believe that these potential risks are outweighed by the benefits in terms of diagnosis and research, so long as rigorous safeguards are kept in place through legislation that prevents discrimination on the basis of the results of genetic testing. PMID:20193048
Digestive tumor bank protocol: from surgical specimens to genomic studies of digestive cancers.

PubMed

Popescu, I; Stroescu, C; Dumitrascu, T; Herlea, V; Paslaru, Liliana; Lazar, V; Boissin, H; Taieb, J; Horeanga, Ionela

2006-01-01

Cancer is a complex polygenic and multifactorial disease, resulting from successive dynamic changes in the genome of somatic cells and from the accumulation of molecular alterations in both tumour cells and host cells. For the majority of cancers, including many malignancies of the gastrointestinal tract, our current means of diagnosis and treatment of the tumors are grossly insufficient. In recent years the development of several gene expression profiling methods such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE) and DNA arrays, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complete cascade of molecular events leading to tumor development and progression. Given the central role played by surgeons in the current management of patients with solid cancers, it is of paramount importance for them to know the principles underlying these laboratory tools in order to critically assess the results originating from this biotechnology. We describe in this article the scientific partnership between Fundeni Clinical Institute Bucharest, Romania and RNtech Company, Paris, France for the development of a center of biological resources (Biobank) as well as the standardized protocol of working with the biological samples, the ongoing projects and the future perspectives.
Krüppel-like factors: three fingers in control.

PubMed

Swamynathan, Shivalingappa K

2010-04-01

Krüppel-like factors (KLFs), members of the zinc-finger family of transcription factors capable of binding GC-rich sequences, have emerged as critical regulators of important functions all over the body. They are characterised by a highly conserved C-terminal DNA-binding motif containing three C2H2 zinc-finger domains, with variable N-terminal regulatory domains. Currently, there are 17 KLFs annotated in the human genome. In spite of their structural similarity to one another, the genes encoding different KLFs are scattered all over the genome. By virtue of their ability to activate and/or repress the expression of a large number of genes, KLFs regulate a diverse array of developmental events and cellular processes, such as erythropoiesis, cardiac remodelling, adipogenesis, maintenance of stem cells, epithelial barrier formation, control of cell proliferation and neoplasia, flow-mediated endothelial gene expression, skeletal and smooth muscle development, gluconeogenesis, monocyte activation, intestinal and conjunctival goblet cell development, retinal neuronal regeneration and neonatal lung development. Characteristic features, nomenclature, evolution and functional diversities of the human KLFs are reviewed here.
Differential expression of THOC1 and ALY mRNP biogenesis/export factors in human cancers.

PubMed

Domínguez-Sánchez, María S; Sáez, Carmen; Japón, Miguel A; Aguilera, Andrés; Luna, Rosa

2011-02-17

One key step in gene expression is the biogenesis of mRNA ribonucleoparticle complexes (mRNPs). Formation of the mRNP requires the participation of a number of conserved factors such as the THO complex. THO interacts physically and functionally with the Sub2/UAP56 RNA-dependent ATPase, and the Yra1/REF1/ALY RNA-binding protein linking transcription, mRNA export and genome integrity. Given the link between genome instability and cancer, we have performed a comparative analysis of the expression patterns of THOC1, a THO complex subunit, and ALY in tumor samples. The mRNA levels were measured by quantitative real-time PCR and hybridization of a tumor tissue cDNA array; and the protein levels and distribution by immunostaining of a custom tissue array containing a set of paraffin-embedded samples of different tumor and normal tissues followed by statistical analysis. We show that the expression of two mRNP factors, THOC1 and ALY, is altered in several tumor tissues. THOC1 mRNA and protein levels are up-regulated in ovarian and lung tumors and down-regulated in those of testis and skin, whereas ALY is altered in a wide variety of tumors. In contrast to THOC1, ALY protein is highly detected in normal proliferative cells, but poorly in high-grade cancers. These results suggest a differential connection between tumorigenesis and the expression levels of human THO and ALY. This study opens the possibility of defining mRNP biogenesis factors as putative players in cell proliferation that could contribute to tumor development.
SEXCMD: Development and validation of sex marker sequences for whole-exome/genome and RNA sequencing.

PubMed

Jeong, Seongmun; Kim, Jiwoong; Park, Won; Jeon, Hongmin; Kim, Namshin

2017-01-01

Over the last decade, a large number of nucleotide sequences have been generated by next-generation sequencing technologies and deposited to public databases. However, most of these datasets do not specify the sex of individuals sampled because researchers typically ignore or hide this information. Male and female genomes in many species have distinctive sex chromosomes, XX/XY and ZW/ZZ, and expression levels of many sex-related genes differ between the sexes. Herein, we describe how to develop sex marker sequences from syntenic regions of sex chromosomes and use them to quickly identify the sex of individuals being analyzed. Array-based technologies routinely use either known sex markers or the B-allele frequency of X or Z chromosomes to deduce the sex of an individual. The same strategy has been used with whole-exome/genome sequence data; however, all reads must be aligned onto a reference genome to determine the B-allele frequency of the X or Z chromosomes. SEXCMD is a pipeline that can extract sex marker sequences from reference sex chromosomes and rapidly identify the sex of individuals from whole-exome/genome and RNA sequencing after training with a known dataset through a simple machine learning approach. The pipeline counts total numbers of hits from sex-specific marker sequences and identifies the sex of the individuals sampled based on the fact that XX/ZZ samples do not have Y or W chromosome hits. We have successfully validated our pipeline with mammalian (Homo sapiens; XY) and avian (Gallus gallus; ZW) genomes. Typical calculation time when applying SEXCMD to human whole-exome or RNA sequencing datasets is a few minutes, and analyzing human whole-genome datasets takes about 10 minutes. Another important application of SEXCMD is as a quality control measure to avoid mixing samples before bioinformatics analysis. SEXCMD comprises simple Python and R scripts and is freely available at https://github.com/lovemun/SEXCMD.
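The counting idea described in this abstract (tally hits to sex-specific marker sequences, then call XX when Y hits are essentially absent) can be sketched in a few lines. This is not the SEXCMD code: exact substring matching stands in for whatever read mapping the pipeline performs, the function names are hypothetical, and the `min_y_fraction` threshold is an invented placeholder for the value the real tool learns from a training dataset.

```python
def count_marker_hits(reads, markers):
    """Count reads that contain at least one of the marker sequences."""
    return sum(1 for read in reads if any(marker in read for marker in markers))


def infer_sex(reads, x_markers, y_markers, min_y_fraction=0.05):
    """Classify an XY-system sample as "XX", "XY" or "undetermined".

    The call rests on the observation that XX samples should produce
    (almost) no hits to Y-specific markers. `min_y_fraction` is a
    hypothetical cut-off on y_hits / (x_hits + y_hits).
    """
    x_hits = count_marker_hits(reads, x_markers)
    y_hits = count_marker_hits(reads, y_markers)
    total = x_hits + y_hits
    if total == 0:
        return "undetermined"  # no marker evidence either way
    return "XY" if y_hits / total >= min_y_fraction else "XX"
```

For a ZW system the same logic applies with W markers in place of Y markers and the labels swapped accordingly. Using a fraction rather than requiring zero hits leaves room for a small number of spurious matches.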
FOXM1 upregulation is an early event in human squamous cell carcinoma and it is enhanced by nicotine during malignant transformation.

PubMed

Gemenetzidis, Emilios; Bose, Amrita; Riaz, Adeel M; Chaplin, Tracy; Young, Bryan D; Ali, Muhammad; Sugden, David; Thurlow, Johanna K; Cheong, Sok-Ching; Teo, Soo-Hwang; Wan, Hong; Waseem, Ahmad; Parkinson, Eric K; Fortune, Farida; Teh, Muy-Teck

2009-01-01

Cancer associated with smoking and drinking remains a serious health problem worldwide. The survival of patients is very poor due to the lack of effective early biomarkers. FOXM1 overexpression is linked to the majority of human cancers but its mechanism remains unclear in head and neck squamous cell carcinoma (HNSCC). FOXM1 mRNA and protein expression was investigated in four independent cohorts (total 75 patients) consisting of normal, premalignant and HNSCC tissues and cells using quantitative PCR (qPCR), expression microarray, immunohistochemistry and immunocytochemistry. Effect of putative oral carcinogens on FOXM1 transcriptional activity was dose-dependently assayed and confirmed using a FOXM1-specific luciferase reporter system, qPCR, immunoblotting and short-hairpin RNA interference. Genome-wide single nucleotide polymorphism (SNP) array was used to 'trace' the genomic instability signature pattern in 8 clonal lines of FOXM1-induced malignant human oral keratinocytes. Furthermore, acute FOXM1 upregulation in primary oral keratinocytes directly induced genomic instability. We have shown for the first time that overexpression of FOXM1 precedes HNSCC malignancy. Screening putative carcinogens in human oral keratinocytes surprisingly showed that nicotine, which is not perceived to be a human carcinogen, directly induced FOXM1 mRNA, protein stabilisation and transcriptional activity at concentrations relevant to tobacco chewers. Importantly, nicotine also augmented FOXM1-induced transformation of human oral keratinocytes. A centrosomal protein CEP55 and a DNA helicase/putative stem cell marker HELLS, both located within a consensus locus (10q23), were found to be novel targets of FOXM1 and their expression correlated tightly with HNSCC progression. This study warns of the potential co-carcinogenic effect of nicotine in tobacco replacement therapies.
We hypothesise that aberrant upregulation of FOXM1 may be inducing genomic instability through a program of malignant transformation involving the activation of CEP55 and HELLS which may facilitate aberrant mitosis and epigenetic modifications. Our finding that FOXM1 is upregulated early during oral cancer progression renders FOXM1 an attractive diagnostic biomarker for early cancer detection and its candidate mechanistic targets, CEP55 and HELLS, as indicators of malignant conversion and progression.
missMethyl: an R package for analyzing data from Illumina's HumanMethylation450 platform.

PubMed

Phipson, Belinda; Maksimovic, Jovana; Oshlack, Alicia

2016-01-15

DNA methylation is one of the most commonly studied epigenetic modifications due to its role in both disease and development. The Illumina HumanMethylation450 BeadChip is a cost-effective way to profile >450 000 CpGs across the human genome, making it a popular platform for profiling DNA methylation. Here we introduce missMethyl, an R package with a suite of tools for performing normalization, removal of unwanted variation in differential methylation analysis, differential variability testing and gene set analysis for the 450K array. missMethyl is an R package available from the Bioconductor project at www.bioconductor.org.
Customized Oligonucleotide Array-Based Comparative Genomic Hybridization as a Clinical Assay for Genomic Profiling of Chronic Lymphocytic Leukemia

PubMed Central

Sargent, Rachel; Jones, Dan; Abruzzo, Lynne V.; Yao, Hui; Bonderover, Jaime; Cisneros, Marissa; Wierda, William G.; Keating, Michael J.; Luthra, Rajyalakshmi

2009-01-01

Chromosome gains and losses used for risk stratification in chronic lymphocytic leukemia (CLL) are commonly assessed by multiprobe fluorescence in situ hybridization (FISH) studies. We designed and validated a customized array-comparative genomic hybridization (aCGH) platform as a clinical assay for CLL genomic profiling. A 60-mer, 44,000-probe oligonucleotide array with a 50-kb average spatial resolution was augmented with high-density probe tiling at loci that are frequently aberrant in CLL. Aberrations identified by aCGH were compared with those identified by a FISH panel, including locus-specific probes to ATM (11q22.3), the centromeric region of chromosome 12 (12p11.1–q11), D13S319 (13q14.3), LAMP1 (13q34), and TP53 (17p13.1). In 100 CLL samples, aCGH/FISH concordance was seen for 89% of FISH-called aberrations at the ATM (n = 18), D13S319 (n = 42), LAMP1 (n = 12), and TP53 (n = 22) loci and for chromosome 12 (n = 14). Eighty-four percent of FISH/aCGH discordant calls were in samples either at or below the limit of aCGH sensitivity (10% to 25% FISH aberration-containing cells). Therefore, aCGH profiling is a feasible routine clinical test with comparable results to multiprobe FISH studies; however, it may be less sensitive than FISH in cases with low-level aberrations. Further, a customized array design can provide comprehensive genomic profiling with additional accuracy in both identifying and defining the extent of small aberrations at target loci. PMID:19074592
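The sensitivity limit mentioned in this abstract can be made concrete with a short calculation. The sketch below is an illustration under simplifying assumptions that are not taken from the paper: a diploid reference, a single-copy gain or loss, and no noise or probe-specific bias. The function name is hypothetical.

```python
import math


def expected_log2_ratio(aberrant_fraction, aberrant_copies, normal_copies=2):
    """Expected aCGH log2 ratio for a mixed sample against a diploid reference."""
    mean_copies = (aberrant_fraction * aberrant_copies
                   + (1 - aberrant_fraction) * normal_copies)
    return math.log2(mean_copies / normal_copies)


# Compare a single-copy loss and gain at different aberrant-cell fractions.
for fraction in (0.10, 0.25, 1.00):
    loss = expected_log2_ratio(fraction, aberrant_copies=1)
    gain = expected_log2_ratio(fraction, aberrant_copies=3)
    print(f"{fraction:.0%} aberrant cells: loss {loss:+.3f}, gain {gain:+.3f}")
```

Under these assumptions the expected signal shrinks toward zero as the aberrant-cell fraction falls, which is one way to see why samples with 10% to 25% aberrant cells sit near the detection limit of an array while remaining countable by FISH.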
Genome-wide comparison of paired fresh frozen and formalin-fixed paraffin-embedded gliomas by custom BAC and oligonucleotide array comparative genomic hybridization: facilitating analysis of archival gliomas

PubMed Central

Mohapatra, Gayatry; Engler, David A.; Starbuck, Kristen D.; Kim, James C.; Bernay, Derek C.; Scangas, George A.; Rousseau, Audrey; Batchelor, Tracy T.; Betensky, Rebecca A.; Louis, David N.

2010-01-01

Molecular genetic analysis of cancer is rapidly evolving as a result of improvement in genomic technologies and the growing applicability of such analyses to clinical oncology. Array based comparative genomic hybridization (aCGH) is a powerful tool for detecting DNA copy number alterations (CNA), particularly in solid tumors, and has been applied to the study of malignant gliomas. In the clinical setting, however, gliomas are often sampled by small biopsies and thus formalin-fixed paraffin-embedded (FFPE) blocks are often the only tissue available for genetic analysis, especially for rare types of gliomas. Moreover, the biological basis for the marked intratumoral heterogeneity in gliomas is most readily addressed in FFPE material. Therefore, for gliomas, the ability to use DNA from FFPE tissue is essential for both clinical and research applications. In this study, we have constructed a custom bacterial artificial chromosome (BAC) array and show excellent sensitivity and specificity for detecting CNAs in a panel of paired frozen and FFPE glioma samples. Our study demonstrates a high concordance rate between CNAs detected in FFPE compared to frozen DNA. We have also developed a method of labeling DNA from FFPE tissue that allows efficient hybridization to oligonucleotide arrays. This labeling technique was applied to a panel of biphasic anaplastic oligoastrocytomas (AOA) to identify genetic changes unique to each component. Together, results from these studies suggest that BAC and oligonucleotide aCGH are sensitive tools for detecting CNAs in FFPE DNA, and can enable genome-wide analysis of rare, small and/or histologically heterogeneous gliomas. PMID:21080181
Analysis of X chromosome genomic DNA sequence copy number variation associated with premature ovarian failure (POF)

PubMed Central

Quilter, C.R.; Karcanias, A.C.; Bagga, M.R.; Duncan, S.; Murray, A.; Conway, G.S.; Sargent, C.A.; Affara, N.A.

2013-01-01

BACKGROUND Premature ovarian failure (POF) is a heterogeneous disease defined as amenorrhoea for >6 months before age 40, with an FSH serum level >40 mIU/ml (menopausal levels). While there is a strong genetic association with POF, familial studies have also indicated that idiopathic POF may also be genetically linked. Conventional cytogenetic analyses have identified regions of the X chromosome that are strongly associated with ovarian function, as well as several POF candidate genes. Cryptic chromosome abnormalities that have been missed might be detected by array comparative genomic hybridization. METHODS In this study, samples from 42 idiopathic POF patients were subjected to a complete end-to-end X/Y chromosome tiling path array to achieve a detailed copy number variation (CNV) analysis of X chromosome involvement in POF. The arrays also contained a 1 Mb autosomal tiling path as a reference control. Quantitative PCR for selected genes contained within the CNVs was used to confirm the majority of the changes detected. The expression pattern of some of these genes in human tissue RNA was examined by reverse transcription (RT)–PCR. RESULTS A number of CNVs were identified on both Xp and Xq, with several being shared among the POF cases. Some CNVs fall within known polymorphic CNV regions, and others span previously identified POF candidate regions and genes. CONCLUSIONS The new data reported in this study reveal further discrete X chromosome intervals not previously associated with the disease and therefore implicate new clusters of candidate genes. Further studies will be required to elucidate their involvement in POF. PMID:20570974
Qualitative assessment of gene expression in affymetrix genechip arrays

NASA Astrophysics Data System (ADS)

Nagarajan, Radhakrishnan; Upreti, Meenakshi

2007-01-01

Affymetrix Genechip microarrays are used widely to determine the simultaneous expression of genes in a given biological paradigm. Probes on the Genechip array are atomic entities which by definition are randomly distributed across the array and in turn govern the gene expression. In the present study, we make several interesting observations. We show that there is considerable correlation between the probe intensities across the array, which defies the independence assumption. While the mechanism behind such correlations is unclear, we show that scaling behavior and the profiles of perfect match (PM) as well as mismatch (MM) probes are similar and immune to background subtraction. We believe that the observed correlations are possibly an outcome of inherent non-stationarities or patchiness in the array devoid of biological significance. This is demonstrated by inspecting their scaling behavior and profiles of the PM and MM probe intensities obtained from publicly available Genechip arrays from three eukaryotic genomes, namely: Drosophila melanogaster (fruit fly), Homo sapiens (humans) and Mus musculus (house mouse) across distinct biological paradigms and across laboratories, with and without background subtraction. The fluctuation functions were estimated using detrended fluctuation analysis (DFA) with fourth-order polynomial detrending. The results presented in this study provide new insights into correlation signatures of PM and MM probe intensities and suggest the choice of DFA as a tool for qualitative assessment of Affymetrix Genechip microarrays prior to their analysis. A more detailed investigation is necessary in order to understand the source of these correlations.
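For readers unfamiliar with DFA, the following is a minimal standard-library sketch of the generic procedure with polynomial detrending. It is not the authors' code; the function names, the window sizes, and the synthetic input series are assumptions, and how probe intensities are ordered into a series is not specified here.

```python
import math
import random


def polyfit_residuals(y, order):
    """Least-squares polynomial fit of y against its index; returns residuals."""
    n = len(y)
    # Rescale the index to [-1, 1] to keep the normal equations well conditioned.
    x = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
    m = order + 1
    # Normal equations (A^T A) c = A^T y for the basis 1, x, ..., x^order.
    ata = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    aty = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(m)]
    # Gaussian elimination with partial pivoting, then back substitution.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, m):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, m):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    coef = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(ata[r][c] * coef[c] for c in range(r + 1, m))
        coef[r] = (aty[r] - tail) / ata[r][r]
    return [yi - sum(coef[k] * xi ** k for k in range(m)) for xi, yi in zip(x, y)]


def dfa_fluctuation(series, window, order=4):
    """Fluctuation function F(window) after polynomial detrending of each window."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for value in series:
        total += value - mean  # cumulative sum of the mean-subtracted series
        profile.append(total)
    squared = []
    for start in range(0, len(profile) - window + 1, window):
        segment = profile[start:start + window]
        squared.extend(r * r for r in polyfit_residuals(segment, order))
    return math.sqrt(sum(squared) / len(squared))


def dfa_exponent(series, windows, order=4):
    """Slope of log F(n) versus log n, by ordinary linear regression."""
    xs = [math.log(w) for w in windows]
    ys = [math.log(dfa_fluctuation(series, w, order)) for w in windows]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))


# Synthetic uncorrelated noise as a stand-in for an intensity series.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(2048)]
alpha = dfa_exponent(noise, windows=[16, 32, 64, 128])
print(f"estimated scaling exponent: {alpha:.2f}")
```

For uncorrelated data the exponent is expected to be roughly 0.5, so values well above that in real probe intensities would point to the kind of correlation the abstract describes.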
Next Generation Sequencing at the University of Chicago Genomics Core

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faber, Pieter

2013-04-24

The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.
Base resolution methylome profiling: considerations in platform selection, data preprocessing and analysis

PubMed Central

Sun, Zhifu; Cunningham, Julie; Slager, Susan; Kocher, Jean-Pierre

2015-01-01

Bisulfite treatment-based methylation microarray (mainly Illumina 450K Infinium array) and next-generation sequencing (reduced representation bisulfite sequencing, Agilent SureSelect Human Methyl-Seq, NimbleGen SeqCap Epi CpGiant or whole-genome bisulfite sequencing) are commonly used for base resolution DNA methylome research. Although multiple tools and methods have been developed and used for the data preprocessing and analysis, confusion remains for these platforms, including how and whether the 450k array should be normalized; which platform should be used to better fit researchers’ needs; and which statistical models would be more appropriate for differential methylation analysis. This review presents the commonly used platforms and compares the pros and cons of each in methylome profiling. We then discuss approaches to study design, data normalization, bias correction and model selection for differentially methylated individual CpGs and regions. PMID:26366945
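As background for the model-selection question raised in this abstract, array methylation levels are usually summarized either as beta values or as M-values. The sketch below shows the two common definitions; it is not drawn from the review itself, the function names are hypothetical, and the offsets (100 for beta, 1 for M) are widely used defaults that are assumed here.

```python
import math


def beta_value(methylated, unmethylated, offset=100.0):
    """Beta value: share of methylated signal, bounded between 0 and 1."""
    return methylated / (methylated + unmethylated + offset)


def m_value(methylated, unmethylated, offset=1.0):
    """M-value: log2 ratio of methylated to unmethylated intensity."""
    return math.log2((methylated + offset) / (unmethylated + offset))


def beta_to_m(beta):
    """Convert a beta value (0 < beta < 1) to the logit-scale M-value."""
    return math.log2(beta / (1.0 - beta))
```

Beta values are easier to read as a proportion, while M-values are unbounded and often behave better in linear models, which is one reason the choice of scale matters for differential methylation analysis.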
Development and mapping of DArT markers within the Festuca - Lolium complex

PubMed Central

Kopecký, David; Bartoš, Jan; Lukaszewski, Adam J; Baird, James H; Černoch, Vladimír; Kölliker, Roland; Rognli, Odd Arne; Blois, Helene; Caig, Vanessa; Lübberstedt, Thomas; Studer, Bruno; Shaw, Paul; Doležel, Jaroslav; Kilian, Andrzej

2009-01-01

Background Grasses are among the most important and widely cultivated plants on Earth. They provide high quality fodder for livestock, are used for turf and amenity purposes, and play a fundamental role in environment protection. Among cultivated grasses, species within the Festuca-Lolium complex predominate, especially in temperate regions. To facilitate high-throughput genome profiling and genetic mapping within the complex, we have developed a Diversity Arrays Technology (DArT) array for five grass species: F. pratensis, F. arundinacea, F. glaucescens, L. perenne and L. multiflorum. Results The DArTFest array contains 7680 probes derived from methyl-filtered genomic representations. In a first marker discovery experiment performed on 40 genotypes from each species (with the exception of F. glaucescens for which only 7 genotypes were used), we identified 3884 polymorphic markers. The number of DArT markers identified in every single genotype varied from 821 to 1852. To test the usefulness of DArTFest array for physical mapping, DArT markers were assigned to each of the seven chromosomes of F. pratensis using single chromosome substitution lines while recombinants of F. pratensis chromosome 3 were used to allocate the markers to seven chromosome bins. Conclusion The resources developed in this project will facilitate the development of genetic maps in Festuca and Lolium, the analysis on genetic diversity, and the monitoring of the genomic constitution of the Festuca × Lolium hybrids. They will also enable marker-assisted selection for multiple traits or for specific genome regions. PMID:19832973

Array-CGH Analysis in a Cohort of Phenotypically Well-Characterized Individuals with "Essential" Autism Spectrum Disorders

ERIC Educational Resources Information Center

Napoli, Eleonora; Russo, Serena; Casula, Laura; Alesi, Viola; Amendola, Filomena Alessandra; Angioni, Adriano; Novelli, Antonio; Valeri, Giovanni; Menghini, Deny; Vicari, Stefano

2018-01-01

Copy-number variants (CNVs) are associated with susceptibility to autism spectrum disorder (ASD). To detect the presence of CNVs, we conducted an array-comparative genomic hybridization (array-CGH) analysis in 133 children with "essential" ASD phenotype. Genetic analyses documented that 12 children had causative CNVs (C-CNVs), 29…
PTGBase: an integrated database to study tandem duplicated genes in plants.

PubMed

Yu, Jingyin; Ke, Tao; Tehrim, Sadia; Sun, Fengming; Liao, Boshou; Hua, Wei

2015-01-01

Tandem duplication is a wide-spread phenomenon in plant genomes and plays significant roles in evolution and adaptation to changing environments. Tandem duplicated genes related to certain functions will lead to the expansion of gene families and bring increase of gene dosage in the form of gene cluster arrays. Many tandem duplication events have been studied in plant genomes; yet, there is a surprising shortage of efforts to systematically present the integration of large amounts of information about publicly deposited tandem duplicated gene data across the plant kingdom. To address this shortcoming, we developed the first plant tandem duplicated genes database, PTGBase. It delivers the most comprehensive resource available to date, spanning 39 plant genomes, including model species and newly sequenced species alike. Across these genomes, 54 130 tandem duplicated gene clusters (129 652 genes) are presented in the database. Each tandem array, as well as its member genes, is characterized in complete detail. Tandem duplicated genes in PTGBase can be explored through browsing or searching by identifiers or keywords of functional annotation and sequence similarity. Users can download tandem duplicated gene arrays easily to any scale, up to the complete annotation data set for an entire plant genome. PTGBase will be updated regularly with newly sequenced plant species as they become available.
A ddRAD Based Linkage Map of the Cultivated Strawberry, Fragaria ×ananassa

PubMed Central

Davik, Jahn; Sargent, Daniel James; Brurberg, May Bente; Lien, Sigbjørn; Kent, Matthew; Alsheikh, Muath

2015-01-01

The cultivated strawberry (Fragaria ×ananassa Duch.) is an allo-octoploid considered difficult to disentangle genetically due to its four relatively similar sub-genomic chromosome sets. This has been alleviated by the recent release of the strawberry IStraw90 whole genome genotyping array. However, array resolution relies on the genotypes used in the array construction and may be of limited general use. SNP detection based on reduced genomic sequencing approaches has the potential of providing better coverage in cases where the studied genotypes are only distantly related to the SNP array’s construction foundation. Here we have used double digest restriction-associated DNA sequencing (ddRAD) to identify SNPs in a 145 seedling F1 hybrid population raised from the cross between the cultivars Sonata (♀) and Babette (♂). A linkage map was constructed containing 907 markers, which spanned 1,581.5 cM across 31 linkage groups representing the 28 chromosomes of the species. Comparing the physical span of the SNP markers with the F. vesca genome sequence, the linkage groups resolved covered 79% of the estimated 830 Mb of the F. ×ananassa genome. Here, we have developed the first linkage map for F. ×ananassa using ddRAD and show that this technique and other related techniques are useful tools for linkage map development and downstream genetic studies in the octoploid strawberry. PMID:26398886
Genome-Wide siRNA-Based Functional Genomics of Pigmentation Identifies Novel Genes and Pathways That Impact Melanogenesis in Human Cells

PubMed Central

Bodemann, Brian; Petersen, Sean; Aruri, Jayavani; Koshy, Shiney; Richardson, Zachary; Le, Lu Q.; Krasieva, Tatiana; Roth, Michael G.; Farmer, Pat; White, Michael A.

2008-01-01

Melanin protects the skin and eyes from the harmful effects of UV irradiation, protects neural cells from toxic insults, and is required for sound conduction in the inner ear. Aberrant regulation of melanogenesis underlies skin disorders (melasma and vitiligo), neurologic disorders (Parkinson's disease), auditory disorders (Waardenburg's syndrome), and ophthalmologic disorders (age-related macular degeneration). Much of the core synthetic machinery driving melanin production has been identified; however, the spectrum of gene products participating in melanogenesis in different physiological niches is poorly understood. Functional genomics based on RNA-mediated interference (RNAi) provides the opportunity to derive unbiased comprehensive collections of pharmaceutically tractable single gene targets supporting melanin production. In this study, we have combined a high-throughput, cell-based, one-well/one-gene screening platform with a genome-wide arrayed synthetic library of chemically synthesized, small interfering RNAs to identify novel biological pathways that govern melanin biogenesis in human melanocytes. Ninety-two novel genes that support pigment production were identified with a low false discovery rate. Secondary validation and preliminary mechanistic studies identified a large panel of targets that converge on tyrosinase expression and stability. Small molecule inhibition of a family of gene products in this class was sufficient to impair chronic tyrosinase expression in pigmented melanoma cells and UV-induced tyrosinase expression in primary melanocytes. Isolation of molecular machinery known to support autophagosome biosynthesis from this screen, together with in vitro and in vivo validation, exposed a close functional relationship between melanogenesis and autophagy.
In summary, these studies illustrate the power of RNAi-based functional genomics to identify novel genes, pathways, and pharmacologic agents that impact a biological phenotype and operate outside of preconceived mechanistic relationships. PMID:19057677
Establishment of a patient-derived orthotopic osteosarcoma mouse model.

PubMed

Blattmann, Claudia; Thiemann, Markus; Stenzinger, Albrecht; Roth, Eva K; Dittmar, Anne; Witt, Hendrik; Lehner, Burkhard; Renker, Eva; Jugold, Manfred; Eichwald, Viktoria; Weichert, Wilko; Huber, Peter E; Kulozik, Andreas E

2015-04-30

Osteosarcoma (OS) is the most common pediatric primary malignant bone tumor. As the prognosis for patients following standard treatment did not improve for almost three decades, functional preclinical models that closely reflect important clinical cancer characteristics are urgently needed to develop and evaluate new treatment strategies. The objective of this study was to establish an orthotopic xenotransplanted mouse model using patient-derived tumor tissue. Fresh tumor tissue from an adolescent female patient with osteosarcoma after relapse was surgically xenografted into the right tibia of 6 immunodeficient BALB/c Nu/Nu mice as well as cultured in medium. Tumor growth was serially assessed by palpation and with magnetic resonance imaging (MRI). In parallel, a primary cell line of the same tumor was established. Histology and high-resolution array-based comparative genomic hybridization (aCGH) were used to investigate both phenotypic and genotypic characteristics of different passages of human xenografts and the cell line compared to the tissue of origin. A primary OS cell line and a primary patient-derived orthotopic xenotransplanted mouse model were established. MRI analyses and histopathology demonstrated an identical architecture in the primary tumor and in the xenografts. Array-CGH analyses of the cell line and all xenografts showed highly comparable patterns of genomic progression. So far, three further primary patient-derived orthotopic xenotransplanted mouse models could be established. We report the first orthotopic OS mouse model generated by transplantation of tumor fragments directly harvested from the patient. This model represents the morphologic and genomic identity of the primary tumor and provides a preclinical platform to evaluate new treatment strategies in OS.
Whole genome comparative studies between chicken and turkey and their implications for avian genome evolution

PubMed Central

Griffin, Darren K; Robertson, Lindsay B; Tempest, Helen G; Vignal, Alain; Fillon, Valérie; Crooijmans, Richard PMA; Groenen, Martien AM; Deryusheva, Svetlana; Gaginskaya, Elena; Carré, Wilfrid; Waddington, David; Talbot, Richard; Völker, Martin; Masabanda, Julio S; Burt, Dave W

2008-01-01

Background Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the development of detailed genomic information in other birds including turkey, an agriculturally important species where mapping has hitherto focused on linkage with limited physical information. No molecular study has yet examined conservation of avian microchromosomes, nor differences in copy number variants (CNVs) between birds. Results We present a detailed comparative cytogenetic map between chicken and turkey based on reciprocal chromosome painting and mapping of 338 chicken BACs to turkey metaphases. Two inter-chromosomal changes (both involving centromeres) and three pericentric inversions have been identified between chicken and turkey; and array CGH identified 16 inter-specific CNVs. Conclusion This is the first study to combine the modalities of zoo-FISH and array CGH between different avian species. The first insight into the conservation of microchromosomes, the first comparative cytogenetic map of any bird and the first appraisal of CNVs between birds is provided. Results suggest that avian genomes have remained relatively stable during evolution compared to mammalian equivalents. PMID:18410676
Detection of human papillomaviruses by polymerase chain reaction and ligation reaction on universal microarray.

PubMed

Ritari, Jarmo; Hultman, Jenni; Fingerroos, Rita; Tarkkanen, Jussi; Pullat, Janne; Paulin, Lars; Kivi, Niina; Auvinen, Petri; Auvinen, Eeva

2012-01-01

Sensitive and specific detection of human papillomaviruses (HPV) in cervical samples is a useful tool for the early diagnosis of epithelial neoplasia and anogenital lesions. Recent studies support the feasibility of HPV DNA testing instead of cytology (Pap smear) as a primary test in population screening for cervical cancer. This is likely to be an option in the near future in many countries, and it would increase the efficiency of screening for cervical abnormalities. We present here a microarray test for the detection and typing of the 15 most important high-risk HPV types and two low-risk types. The method is based on type-specific multiplex PCR amplification of the L1 viral genomic region followed by ligation detection reaction where two specific ssDNA probes, one containing a fluorescent label and the other a flanking ZipCode sequence, are joined by enzymatic ligation in the presence of the correct HPV PCR product. Human beta-globin is amplified in the same reaction to control for sample quality and adequacy. The genotyping capacity of our approach was evaluated against Linear Array test using cervical samples collected in transport medium. Altogether 14 out of 15 valid samples (93%) gave concordant results between our test and Linear Array. One sample was HPV56 positive in our test and high-risk positive in Hybrid Capture 2 but remained negative in Linear Array. The preliminary results suggest that our test has accurate multiple HPV genotyping capability with the additional advantages of generic detection format, and potential for high-throughput screening.
CRISPR Diversity and Microevolution in Clostridium difficile

PubMed Central

Andersen, Joakim M.; Shoup, Madelyn; Robinson, Cathy; Britton, Robert; Olsen, Katharina E.P.; Barrangou, Rodolphe

2016-01-01

Virulent strains of Clostridium difficile have become a global health problem associated with morbidity and mortality. Traditional typing methods do not provide ideal resolution to track outbreak strains, ascertain genetic diversity between isolates, or monitor the phylogeny of this species on a global basis. Here, we investigate the occurrence and diversity of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (cas) in C. difficile to assess the potential of CRISPR-based phylogeny and high-resolution genotyping. A single Type-IB CRISPR-Cas system was identified in 217 analyzed genomes with cas gene clusters present at conserved chromosomal locations, suggesting vertical evolution of the system, assessing a total of 1,865 CRISPR arrays. The CRISPR arrays, markedly enriched (8.5 arrays/genome) compared with other species, occur both at conserved and variable locations across strains, and thus provide a basis for typing based on locus occurrence and spacer polymorphism. Clustering of strains by array composition correlated with sequence type (ST) analysis. Spacer content and polymorphism within conserved CRISPR arrays revealed phylogenetic relationship across clades and within ST. Spacer polymorphisms of conserved arrays were instrumental for differentiating closely related strains, e.g., ST1/RT027/B1 strains and pathogenicity locus encoding ST3/RT001 strains. CRISPR spacers showed sequence similarity to phage sequences, which is consistent with the native role of CRISPR-Cas as adaptive immune systems in bacteria. Overall, CRISPR-Cas sequences constitute a valuable basis for genotyping of C. difficile isolates, provide insights into the micro-evolutionary events that occur between closely related strains, and reflect the evolutionary trajectory of these genomes. PMID:27576538
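One simple way to compare strains by spacer content, in the spirit of the typing described in this abstract, is to score shared spacers with a Jaccard index. The sketch below is an illustration only: the strain names and spacer identifiers are hypothetical, and the study's own clustering method may differ.

```python
def jaccard(spacers_a, spacers_b):
    """Jaccard similarity between two collections of spacer identifiers."""
    a, b = set(spacers_a), set(spacers_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def closest_pairs(arrays):
    """Rank strain pairs by shared spacer content, most similar first."""
    names = sorted(arrays)
    pairs = [(jaccard(arrays[p], arrays[q]), p, q)
             for i, p in enumerate(names) for q in names[i + 1:]]
    return sorted(pairs, key=lambda item: (-item[0], item[1], item[2]))


# Hypothetical strains and spacer identifiers.
arrays = {
    "strain_A": ["s1", "s2", "s3", "s4"],
    "strain_B": ["s1", "s2", "s3", "s5"],
    "strain_C": ["s7", "s8", "s3"],
}
ranked = closest_pairs(arrays)
```

A set-based score ignores spacer order; since new spacers are generally added at one end of an array, an order-aware comparison would carry more information about recent divergence.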
Novel origins of copy number variation in the dog genome

PubMed Central

2012-01-01

Background Copy number variants (CNVs) account for substantial variation between genomes and are a major source of normal and pathogenic phenotypic differences. The dog is an ideal model to investigate mutational mechanisms that generate CNVs as its genome lacks a functional ortholog of the PRDM9 gene implicated in recombination and CNV formation in humans. Here we comprehensively assay CNVs using high-density array comparative genomic hybridization in 50 dogs from 17 dog breeds and 3 gray wolves. Results We use a stringent new method to identify a total of 430 high-confidence CNV loci, which range in size from 9 kb to 1.6 Mb and span 26.4 Mb, or 1.08%, of the assayed dog genome, overlapping 413 annotated genes. Of CNVs observed in each breed, 98% are also observed in multiple breeds. CNVs predicted to disrupt gene function are significantly less common than expected by chance. We identify a significant overrepresentation of peaks of GC content, previously shown to be enriched in dog recombination hotspots, in the vicinity of CNV breakpoints. Conclusions A number of the CNVs identified by this study are candidates for generating breed-specific phenotypes. Purifying selection seems to be a major factor shaping structural variation in the dog genome, suggesting that many CNVs are deleterious. Localized peaks of GC content appear to be novel sites of CNV formation in the dog genome by non-allelic homologous recombination, potentially activated by the loss of PRDM9. These sequence features may have driven genome instability and chromosomal rearrangements throughout canid evolution. PMID:22916802
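The link between local GC content and CNV breakpoints reported in this abstract rests on scanning the genome for GC-rich windows. A minimal sketch of such a scan follows; the window size, step, and threshold are hypothetical parameters, and the study's own definition of a GC peak is not reproduced here.

```python
def gc_content(seq):
    """Fraction of G or C among called bases (A, C, G, T); 0.0 if none."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def gc_windows(seq, window, step):
    """Return (start, GC fraction) for each full window along the sequence."""
    return [(start, gc_content(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


def gc_peaks(seq, window, step, threshold):
    """Start positions of windows whose GC fraction reaches the threshold."""
    return [start for start, gc in gc_windows(seq, window, step)
            if gc >= threshold]
```

Testing for enrichment would then compare how often such windows fall near breakpoints against a background expectation, for example from randomly placed intervals.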
GST-PRIME: an algorithm for genome-wide primer design.

PubMed

Leister, Dario; Varotto, Claudio

2007-01-01

The profiling of mRNA expression based on DNA arrays has become a powerful tool to study genome-wide transcription of genes in a number of organisms. GST-PRIME is a software package created to facilitate large-scale primer design for the amplification of probes to be immobilized on arrays for transcriptome analyses, even though it can be also applied in low-throughput approaches. GST-PRIME allows highly efficient, direct amplification of gene-sequence tags (GSTs) from genomic DNA (gDNA), starting from annotated genome or transcript sequences. GST-PRIME provides a customer-friendly platform for automatic primer design, and despite the relative simplicity of the algorithm, experimental tests in the model plant species Arabidopsis thaliana confirmed the reliability of the software. This chapter describes the algorithm used for primer design, its input and output files, and the installation of the standalone package and its use.
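To give a sense of what automated primer design involves, here is a deliberately simple sketch. It is not the GST-PRIME algorithm, whose details are given in the chapter itself; the function names, primer length, and melting-temperature range are assumptions, and the Wallace rule used here is only a rough estimate for short oligonucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def pick_primer(template, length=18, tm_range=(50, 60)):
    """Return (start, primer) for the first window whose Tm is in range."""
    for start in range(len(template) - length + 1):
        candidate = template[start:start + length]
        if tm_range[0] <= wallace_tm(candidate) <= tm_range[1]:
            return start, candidate
    return None


def design_pair(target, length=18, tm_range=(50, 60)):
    """Pick a forward primer from the target and a reverse primer from its
    reverse complement; returns None if either search fails."""
    forward = pick_primer(target, length, tm_range)
    reverse = pick_primer(reverse_complement(target), length, tm_range)
    if forward is None or reverse is None:
        return None
    return forward[1], reverse[1]
```

A production tool would also check specificity against the rest of the genome, primer-dimer formation, and amplicon size, none of which this sketch attempts.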
Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives.

PubMed

Zhao, Min; Wang, Qingguo; Wang, Quan; Jia, Peilin; Zhao, Zhongming

2013-01-01

Copy number variation (CNV) is a prevalent form of critical genetic variation that leads to an abnormal number of copies of large genomic regions in a cell. Microarray-based comparative genome hybridization (arrayCGH) or genotyping arrays have been standard technologies to detect large regions subject to copy number changes in genomes until most recently high-resolution sequence data can be analyzed by next-generation sequencing (NGS). During the last several years, NGS-based analysis has been widely applied to identify CNVs in both healthy and diseased individuals. Correspondingly, the strong demand for NGS-based CNV analyses has fuelled development of numerous computational methods and tools for CNV detection. In this article, we review the recent advances in computational methods pertaining to CNV detection using whole genome and whole exome sequencing data. Additionally, we discuss their strengths and weaknesses and suggest directions for future development.
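As an illustration of the read-depth idea behind many NGS-based CNV callers, the sketch below compares binned read counts from a sample against a reference and labels each bin. This is a minimal sketch and not any specific tool from the review; the function names, the pseudocount, and the gain/loss thresholds are assumptions chosen for illustration only.

```python
import math

def log2_ratios(sample_depth, reference_depth, pseudocount=1.0):
    """Per-bin log2(sample/reference) after scaling each track by its total depth.

    The pseudocount (an assumed value) avoids division by zero in empty bins.
    """
    s_total = sum(sample_depth)
    r_total = sum(reference_depth)
    ratios = []
    for s, r in zip(sample_depth, reference_depth):
        s_norm = (s + pseudocount) / s_total
        r_norm = (r + pseudocount) / r_total
        ratios.append(math.log2(s_norm / r_norm))
    return ratios

def call_bins(ratios, gain=0.4, loss=-0.6):
    """Label each bin as gain, loss, or neutral; thresholds are illustrative."""
    labels = []
    for x in ratios:
        if x >= gain:
            labels.append("gain")
        elif x <= loss:
            labels.append("loss")
        else:
            labels.append("neutral")
    return labels

# Toy data: bin 2 has half the expected depth, bin 4 has 1.5x.
sample = [100, 100, 50, 100, 150, 100]
reference = [100, 100, 100, 100, 100, 100]
print(call_bins(log2_ratios(sample, reference)))
```

Real callers add GC-content correction, mappability filtering, and segmentation across adjacent bins, none of which is shown here.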
Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives

PubMed Central

2013-01-01

Copy number variation (CNV) is a prevalent form of critical genetic variation that leads to an abnormal number of copies of large genomic regions in a cell. Microarray-based comparative genome hybridization (arrayCGH) or genotyping arrays have been standard technologies to detect large regions subject to copy number changes in genomes until most recently high-resolution sequence data can be analyzed by next-generation sequencing (NGS). During the last several years, NGS-based analysis has been widely applied to identify CNVs in both healthy and diseased individuals. Correspondingly, the strong demand for NGS-based CNV analyses has fuelled development of numerous computational methods and tools for CNV detection. In this article, we review the recent advances in computational methods pertaining to CNV detection using whole genome and whole exome sequencing data. Additionally, we discuss their strengths and weaknesses and suggest directions for future development. PMID:24564169
A powerful tool for genome analysis in maize: development and evaluation of the high density 600 k SNP genotyping array.

PubMed

Unterseer, Sandra; Bauer, Eva; Haberer, Georg; Seidel, Michael; Knaak, Carsten; Ouzunova, Milena; Meitinger, Thomas; Strom, Tim M; Fries, Ruedi; Pausch, Hubert; Bertani, Christofer; Davassi, Alessandro; Mayer, Klaus Fx; Schön, Chris-Carolin

2014-09-29

High density genotyping data are indispensable for genomic analyses of complex traits in animal and crop species. Maize is one of the most important crop plants worldwide, however a high density SNP genotyping array for analysis of its large and highly dynamic genome was not available so far. We developed a high density maize SNP array composed of 616,201 variants (SNPs and small indels). Initially, 57 M variants were discovered by sequencing 30 representative temperate maize lines and then stringently filtered for sequence quality scores and predicted conversion performance on the array resulting in the selection of 1.2 M polymorphic variants assayed on two screening arrays. To identify high-confidence variants, 285 DNA samples from a broad genetic diversity panel of worldwide maize lines including the samples used for sequencing, important founder lines for European maize breeding, hybrids, and proprietary samples with European, US, semi-tropical, and tropical origin were used for experimental validation. We selected 616 k variants according to their performance during validation, support of genotype calls through sequencing data, and physical distribution for further analysis and for the design of the commercially available Affymetrix® Axiom® Maize Genotyping Array. This array is composed of 609,442 SNPs and 6,759 indels. Among these are 116,224 variants in coding regions and 45,655 SNPs of the Illumina® MaizeSNP50 BeadChip for study comparison. In a subset of 45,974 variants, apart from the target SNP additional off-target variants are detected, which show only a minor bias towards intermediate allele frequencies. We performed principal coordinate and admixture analyses to determine the ability of the array to detect and resolve population structure and investigated the extent of LD within a worldwide validation panel. 
The high density Affymetrix® Axiom® Maize Genotyping Array is optimized for European and American temperate maize and was developed based on a diverse sample panel by applying stringent quality filter criteria to ensure its suitability for a broad range of applications. With 600 k variants it is the largest currently publicly available genotyping array in crop species.
Open access resources for genome-wide association mapping in rice

PubMed Central

McCouch, Susan R.; Wright, Mark H.; Tung, Chih-Wei; Maron, Lyza G.; McNally, Kenneth L.; Fitzgerald, Melissa; Singh, Namrata; DeClerck, Genevieve; Agosto-Perez, Francisco; Korniliev, Pavel; Greenberg, Anthony J.; Naredo, Ma. Elizabeth B.; Mercado, Sheila Mae Q.; Harrington, Sandra E.; Shi, Yuxin; Branchini, Darcy A.; Kuser-Falcão, Paula R.; Leung, Hei; Ebana, Kowaru; Yano, Masahiro; Eizenga, Georgia; McClung, Anna; Mezey, Jason

2016-01-01

Increasing food production is essential to meet the demands of a growing human population, with its rising income levels and nutritional expectations. To address the demand, plant breeders seek new sources of genetic variation to enhance the productivity, sustainability and resilience of crop varieties. Here we launch a high-resolution, open-access research platform to facilitate genome-wide association mapping in rice, a staple food crop. The platform provides an immortal collection of diverse germplasm, a high-density single-nucleotide polymorphism data set tailored for gene discovery, well-documented analytical strategies, and a suite of bioinformatics resources to facilitate biological interpretation. Using grain length, we demonstrate the power and resolution of our new high-density rice array, the accompanying genotypic data set, and an expanded diversity panel for detecting major and minor effect QTLs and subpopulation-specific alleles, with immediate implications for rice improvement. PMID:26842267
Ergothioneine biosynthetic methyltransferase EgtD reveals the structural basis of aromatic amino acid betaine biosynthesis.

PubMed

Vit, Allegra; Misson, Laëtitia; Blankenfeldt, Wulf; Seebeck, Florian P

2015-01-02

Ergothioneine is an N-α-trimethyl-2-thiohistidine derivative that occurs in human, plant, fungal, and bacterial cells. Biosynthesis of this redox-active betaine starts with trimethylation of the α-amino group of histidine. The three consecutive methyl transfers are catalyzed by the S-adenosylmethionine-dependent methyltransferase EgtD. Three crystal structures of this enzyme in the absence and in the presence of N-α-dimethylhistidine and S-adenosylhomocysteine implicate a preorganized array of hydrophilic interactions as the determinants for substrate specificity and apparent processivity. We identified two active site mutations that change the substrate specificity of EgtD 10^7-fold and transform the histidine-methyltransferase into a proficient tryptophan-methyltransferase. Finally, a genomic search for EgtD homologues in fungal genomes revealed tyrosine and tryptophan trimethylation activity as a frequent trait in ascomycetous and basidiomycetous fungi. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Does the evolutionary conservation of microsatellite loci imply function?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shriver, M.D.; Deka, R.; Ferrell, R.E.

Microsatellites are highly polymorphic tandem arrays of short (1-6 bp) sequence motifs which have been found widely distributed in the genomes of all eukaryotes. We have analyzed allele frequency data on 16 microsatellite loci typed in the great apes (human, chimp, orangutan, and gorilla). The majority of these loci (13) were isolated from human genomic libraries; three were cloned from chimpanzee genomic DNA. Most of these loci are not only present in all ape species, but are polymorphic with comparable levels of heterozygosity and have alleles which overlap in size. The extent of divergence of allele frequencies among these four species was studied using the stepwise-weighted genetic distance (Dsw), which was previously shown to conform to linearity with evolutionary time since divergence for loci where mutations exist in a stepwise fashion. The phylogenetic tree of the great apes constructed from this distance matrix was consistent with the expected topology, with a high bootstrap confidence (82%) for the human/chimp clade. However, the allele frequency distributions of these species are 10 times more similar to each other than expected when they were calibrated with a conservative estimate of the time since separation of humans and the apes. These results are in agreement with sequence-based surveys of microsatellites which have demonstrated that they are highly (90%) conserved over short periods of evolutionary time (< 10 million years) and moderately (30%) conserved over long periods of evolutionary time (> 60-80 million years). This evolutionary conservation has prompted some authors to speculate that there are functional constraints on microsatellite loci. In contrast, the presence of directional bias of mutations with constraints and/or selection against aberrant sized alleles can explain these results.
Enhanced Functional Genomic Screening Identifies Novel Mediators of Dual Leucine Zipper Kinase-Dependent Injury Signaling in Neurons.

PubMed

Welsbie, Derek S; Mitchell, Katherine L; Jaskula-Ranga, Vinod; Sluch, Valentin M; Yang, Zhiyong; Kim, Jessica; Buehler, Eugen; Patel, Amit; Martin, Scott E; Zhang, Ping-Wu; Ge, Yan; Duan, Yukan; Fuller, John; Kim, Byung-Jin; Hamed, Eman; Chamling, Xitiz; Lei, Lei; Fraser, Iain D C; Ronai, Ze'ev A; Berlinicke, Cynthia A; Zack, Donald J

2017-06-21

Dual leucine zipper kinase (DLK) has been implicated in cell death signaling secondary to axonal damage in retinal ganglion cells (RGCs) and other neurons. To better understand the pathway through which DLK acts, we developed enhanced functional genomic screens in primary RGCs, including use of arrayed, whole-genome, small interfering RNA libraries. Explaining why DLK inhibition is only partially protective, we identify leucine zipper kinase (LZK) as cooperating with DLK to activate downstream signaling and cell death in RGCs, including in a mouse model of optic nerve injury, and show that the same pathway is active in human stem cell-derived RGCs. Moreover, we identify four transcription factors, JUN, activating transcription factor 2 (ATF2), myocyte-specific enhancer factor 2A (MEF2A), and SRY-Box 11 (SOX11), as being the major downstream mediators through which DLK/LZK activation leads to RGC cell death. Increased understanding of the DLK pathway has implications for understanding and treating neurodegenerative diseases. Copyright © 2017 Elsevier Inc. All rights reserved.
The yeast Pif1 helicase prevents genomic instability caused by G-quadruplex-forming CEB1 sequences in vivo.

PubMed

Ribeyre, Cyril; Lopes, Judith; Boulé, Jean-Baptiste; Piazza, Aurèle; Guédin, Aurore; Zakian, Virginia A; Mergny, Jean-Louis; Nicolas, Alain

2009-05-01

In budding yeast, the Pif1 DNA helicase is involved in the maintenance of both nuclear and mitochondrial genomes, but its role in these processes is still poorly understood. Here, we provide evidence for a new Pif1 function by demonstrating that its absence promotes genetic instability of alleles of the G-rich human minisatellite CEB1 inserted in the Saccharomyces cerevisiae genome, but not of other tandem repeats. Inactivation of other DNA helicases, including Sgs1, had no effect on CEB1 stability. In vitro, we show that CEB1 repeats formed stable G-quadruplex (G4) secondary structures and the Pif1 protein unwinds these structures more efficiently than regular B-DNA. Finally, synthetic CEB1 arrays in which we mutated the potential G4-forming sequences were no longer destabilized in pif1Δ cells. Hence, we conclude that CEB1 instability in pif1Δ cells depends on the potential to form G-quadruplex structures, suggesting that Pif1 could play a role in the metabolism of G4-forming sequences.
Atopic dermatitis in West Highland white terriers is associated with a 1.3-Mb region on CFA 17.

PubMed

Roque, Joana B; O'Leary, Caroline A; Duffy, David L; Kyaw-Tanner, Myat; Gharahkhani, Puya; Vogelnest, Linda; Mason, Kenneth; Shipstone, Michael; Latter, Melanie

2012-03-01

Canine atopic dermatitis (AD) is an allergic inflammatory skin disease that shares similarities with AD in humans. Canine AD is likely to be an inherited disease in dogs and is common in West Highland white terriers (WHWTs). We performed a genome-wide association study using the Affymetrix Canine SNP V2 array consisting of over 42,800 single nucleotide polymorphisms, on 35 atopic and 25 non-atopic WHWTs. A gene-dropping simulation method, using SIB-PAIR, identified a projected 1.3 Mb area of association (genome-wide P = 6 × 10^-5 to P = 7 × 10^-4) on CFA 17. Nineteen genes on CFA 17, including 1 potential candidate gene (PTPN22), were located less than 0.5 Mb from the interval of association identified on the genome-wide association analysis. Four haplotypes within this locus were differently distributed between cases and controls in this population of dogs. These findings suggest that a major locus for canine AD in WHWTs may be located on, or in close proximity to an area on CFA 17.
Microarray Technology for the Diagnosis of Fetal Chromosomal Aberrations: Which Platform Should We Use?

PubMed Central

Karampetsou, Evangelia; Morrogh, Deborah; Chitty, Lyn

2014-01-01

The advantage of microarray (array) over conventional karyotype for the diagnosis of fetal pathogenic chromosomal anomalies has prompted the use of microarrays in prenatal diagnostics. In this review we compare the performance of different array platforms (BAC, oligonucleotide CGH, SNP) and designs (targeted; whole genome; whole genome and targeted; custom) and discuss their advantages and disadvantages in relation to prenatal testing. We also discuss the factors to consider when implementing a microarray testing service for the diagnosis of fetal chromosomal aberrations. PMID:26237396

Accuracy and training population design for genomic selection in elite North American oats

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) is a method to estimate the breeding values of individuals by using markers throughout the genome. We evaluated the accuracies of GS using data from five traits on 446 oat lines genotyped with 1005 Diversity Array Technology (DArT) markers and two GS methods (RR-BLUP and Bayes...
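To make the RR-BLUP idea concrete, the following minimal sketch estimates marker effects by ridge regression and predicts breeding values for new lines. It relies on the standard equivalence between RR-BLUP and ridge regression with penalty lambda equal to the ratio of residual to marker-effect variance; the data, the function names, and the lambda value are invented for illustration and are not taken from the oat study.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def rrblup_fit(X, y, lam):
    """Ridge estimate of marker effects: (Xc'Xc + lam*I)^-1 Xc'(y - mean(y))."""
    n, p = len(X), len(X[0])
    mu = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]  # centered markers
    A = [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(Xc[i][a] * (y[i] - mu) for i in range(n)) for a in range(p)]
    return mu, means, solve(A, rhs)

def rrblup_predict(X_new, mu, means, beta):
    """Predicted value = overall mean + sum of centered marker codes * effects."""
    return [mu + sum((row[j] - means[j]) * beta[j] for j in range(len(beta)))
            for row in X_new]

# Toy example: 6 lines, 2 markers coded -1/0/1, trait = 10 + 2*m1 - 1*m2.
X = [[-1, 0], [0, 1], [1, 1], [1, -1], [0, -1], [-1, 0]]
y = [8.0, 9.0, 11.0, 13.0, 11.0, 8.0]
mu, means, beta = rrblup_fit(X, y, lam=4.0)
print(beta, rrblup_predict([[1, 0]], mu, means, beta))
```

With lam = 0 the toy data return the true effects (2, -1); a positive lam shrinks them toward zero, which is what stabilizes estimates when markers greatly outnumber phenotyped lines.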
Novel promoters and coding first exons in DLG2 linked to developmental disorders and intellectual disability.

PubMed

Reggiani, Claudio; Coppens, Sandra; Sekhara, Tayeb; Dimov, Ivan; Pichon, Bruno; Lufin, Nicolas; Addor, Marie-Claude; Belligni, Elga Fabia; Digilio, Maria Cristina; Faletra, Flavio; Ferrero, Giovanni Battista; Gerard, Marion; Isidor, Bertrand; Joss, Shelagh; Niel-Bütschi, Florence; Perrone, Maria Dolores; Petit, Florence; Renieri, Alessandra; Romana, Serge; Topa, Alexandra; Vermeesch, Joris Robert; Lenaerts, Tom; Casimir, Georges; Abramowicz, Marc; Bontempi, Gianluca; Vilain, Catheline; Deconinck, Nicolas; Smits, Guillaume

2017-07-19

Tissue-specific integrative omics has the potential to reveal new genic elements important for developmental disorders. Two pediatric patients with global developmental delay and intellectual disability phenotype underwent array-CGH genetic testing, both showing a partial deletion of the DLG2 gene. From independent human and murine omics datasets, we combined copy number variations, histone modifications, developmental tissue-specific regulation, and protein data to explore the molecular mechanism at play. Integrating genomics, transcriptomics, and epigenomics data, we describe two novel DLG2 promoters and coding first exons expressed in human fetal brain. Their murine conservation and protein-level evidence allowed us to produce new DLG2 gene models for human and mouse. These new genic elements are deleted in 90% of 29 patients (public and in-house) showing partial deletion of the DLG2 gene. The patients' clinical characteristics expand the neurodevelopmental phenotypic spectrum linked to DLG2 gene disruption to cognitive and behavioral categories. While protein-coding genes are regarded as well known, our work shows that integration of multiple omics datasets can unveil novel coding elements. From a clinical perspective, our work demonstrates that two new DLG2 promoters and exons are crucial for the neurodevelopmental phenotypes associated with this gene. In addition, our work brings evidence for the lack of cross-annotation in human versus mouse reference genomes and nucleotide versus protein databases.
A functional gene array for detection of bacterial virulence elements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jaing, C

2007-11-01

We report our development of the first of a series of microarrays designed to detect pathogens with known mechanisms of virulence and antibiotic resistance. By targeting virulence gene families as well as genes unique to specific biothreat agents, these arrays will provide important data about the pathogenic potential and drug resistance profiles of unknown organisms in environmental samples. To validate our approach, we developed a first generation array targeting genes from Escherichia coli strains K12 and CFT073, Enterococcus faecalis and Staphylococcus aureus. We determined optimal probe design parameters for microorganism detection and discrimination, measured the required target concentration, and assessed tolerance for mismatches between probe and target sequences. Mismatch tolerance is a priority for this application, due to DNA sequence variability among members of gene families. Arrays were created using the NimbleGen Maskless Array Synthesizer at Lawrence Livermore National Laboratory. Purified genomic DNA from combinations of one or more of the four target organisms, pure cultures of four related organisms, and environmental aerosol samples with spiked-in genomic DNA were hybridized to the arrays. Based on the success of this prototype, we plan to design further arrays in this series, with the goal of detecting all known virulence and antibiotic resistance gene families in a greatly expanded set of organisms.
Assessing copy number from exome sequencing and exome array CGH based on CNV spectrum in a large clinical cohort.

PubMed

Retterer, Kyle; Scuffins, Julie; Schmidt, Daniel; Lewis, Rachel; Pineda-Alvarez, Daniel; Stafford, Amanda; Schmidt, Lindsay; Warren, Stephanie; Gibellini, Federica; Kondakova, Anastasia; Blair, Amanda; Bale, Sherri; Matyakhina, Ludmila; Meck, Jeanne; Aradhya, Swaroop; Haverfield, Eden

2015-08-01

Detection of copy-number variation (CNV) is important for investigating many genetic disorders. Testing a large clinical cohort by array comparative genomic hybridization provides a deep perspective on the spectrum of pathogenic CNV. In this context, we describe a bioinformatics approach to extract CNV information from whole-exome sequencing and demonstrate its utility in clinical testing. Exon-focused arrays and whole-genome chromosomal microarray analysis were used to test 14,228 and 14,000 individuals, respectively. Based on these results, we developed an algorithm to detect deletions/duplications in whole-exome sequencing data and a novel whole-exome array. In the exon array cohort, we observed a positive detection rate of 2.4% (25 duplications, 318 deletions), of which 39% involved one or two exons. Chromosomal microarray analysis identified 3,345 CNVs affecting single genes (18%). We demonstrate that our whole-exome sequencing algorithm resolves CNVs of three or more exons. These results demonstrate the clinical utility of single-exon resolution in CNV assays. Our whole-exome sequencing algorithm approaches this resolution but is complemented by a whole-exome array to unambiguously identify intragenic CNVs and single-exon changes. These data illustrate the next advancements in CNV analysis through whole-exome sequencing and whole-exome array. Genet Med 17(8), 623-629.
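The figures quoted for the exon array cohort can be cross-checked with a few lines of arithmetic. The sketch below assumes that each of the 343 reported events corresponds to one positive individual among the 14,228 tested, and that the 18% figure is the share of single-gene CNVs among all CNVs found by chromosomal microarray; both readings are assumptions, since the abstract does not spell them out.

```python
def detection_rate(positive_findings, individuals_tested):
    """Positive findings as a percentage of individuals tested."""
    return 100.0 * positive_findings / individuals_tested

duplications, deletions = 25, 318
exon_array_cohort = 14_228

positives = duplications + deletions                 # total positive findings
rate = detection_rate(positives, exon_array_cohort)  # percent of cohort
small_events = round(0.39 * positives)               # one- or two-exon events

# If 3,345 single-gene CNVs make up 18% of all CNVs, the implied total is:
implied_total_cma_cnvs = round(3_345 / 0.18)

print(positives, round(rate, 2), small_events, implied_total_cma_cnvs)
```

Under these assumptions the counts reproduce the stated 2.4% detection rate.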
Parent-of-origin specific allelic associations among 106 genomic loci for age at menarche

PubMed Central

Thompson, Deborah J; Ferreira, Teresa; He, Chunyan; Chasman, Daniel I; Esko, Tõnu; Thorleifsson, Gudmar; Albrecht, Eva; Ang, Wei Q; Corre, Tanguy; Cousminer, Diana L; Feenstra, Bjarke; Franceschini, Nora; Ganna, Andrea; Johnson, Andrew D; Kjellqvist, Sanela; Lunetta, Kathryn L; McMahon, George; Nolte, Ilja M; Paternoster, Lavinia; Porcu, Eleonora; Smith, Albert V; Stolk, Lisette; Teumer, Alexander; Tšernikova, Natalia; Tikkanen, Emmi; Ulivi, Sheila; Wagner, Erin K; Amin, Najaf; Bierut, Laura J; Byrne, Enda M; Hottenga, Jouke-Jan; Koller, Daniel L; Mangino, Massimo; Pers, Tune H; Yerges-Armstrong, Laura M; Zhao, Jing Hua; Andrulis, Irene L; Anton-Culver, Hoda; Atsma, Femke; Bandinelli, Stefania; Beckmann, Matthias W; Benitez, Javier; Blomqvist, Carl; Bojesen, Stig E; Bolla, Manjeet K; Bonanni, Bernardo; Brauch, Hiltrud; Brenner, Hermann; Buring, Julie E; Chang-Claude, Jenny; Chanock, Stephen; Chen, Jinhui; Chenevix-Trench, Georgia; Collée, J. Margriet; Couch, Fergus J; Couper, David; Coveillo, Andrea D; Cox, Angela; Czene, Kamila; D’adamo, Adamo Pio; Smith, George Davey; De Vivo, Immaculata; Demerath, Ellen W; Dennis, Joe; Devilee, Peter; Dieffenbach, Aida K; Dunning, Alison M; Eiriksdottir, Gudny; Eriksson, Johan G; Fasching, Peter A; Ferrucci, Luigi; Flesch-Janys, Dieter; Flyger, Henrik; Foroud, Tatiana; Franke, Lude; Garcia, Melissa E; García-Closas, Montserrat; Geller, Frank; de Geus, Eco EJ; Giles, Graham G; Gudbjartsson, Daniel F; Gudnason, Vilmundur; Guénel, Pascal; Guo, Suiqun; Hall, Per; Hamann, Ute; Haring, Robin; Hartman, Catharina A; Heath, Andrew C; Hofman, Albert; Hooning, Maartje J; Hopper, John L; Hu, Frank B; Hunter, David J; Karasik, David; Kiel, Douglas P; Knight, Julia A; Kosma, Veli-Matti; Kutalik, Zoltan; Lai, Sandra; Lambrechts, Diether; Lindblom, Annika; Mägi, Reedik; Magnusson, Patrik K; Mannermaa, Arto; Martin, Nicholas G; Masson, Gisli; McArdle, Patrick F; McArdle, Wendy L; Melbye, Mads; Michailidou, Kyriaki; Mihailov, Evelin; Milani, 
Lili; Milne, Roger L; Nevanlinna, Heli; Neven, Patrick; Nohr, Ellen A; Oldehinkel, Albertine J; Oostra, Ben A; Palotie, Aarno; Peacock, Munro; Pedersen, Nancy L; Peterlongo, Paolo; Peto, Julian; Pharoah, Paul DP; Postma, Dirkje S; Pouta, Anneli; Pylkäs, Katri; Radice, Paolo; Ring, Susan; Rivadeneira, Fernando; Robino, Antonietta; Rose, Lynda M; Rudolph, Anja; Salomaa, Veikko; Sanna, Serena; Schlessinger, David; Schmidt, Marjanka K; Southey, Mellissa C; Sovio, Ulla; Stampfer, Meir J; Stöckl, Doris; Storniolo, Anna M; Timpson, Nicholas J; Tyrer, Jonathan; Visser, Jenny A; Vollenweider, Peter; Völzke, Henry; Waeber, Gerard; Waldenberger, Melanie; Wallaschofski, Henri; Wang, Qin; Willemsen, Gonneke; Winqvist, Robert; Wolffenbuttel, Bruce HR; Wright, Margaret J; Boomsma, Dorret I; Econs, Michael J; Khaw, Kay-Tee; Loos, Ruth JF; McCarthy, Mark I; Montgomery, Grant W; Rice, John P; Streeten, Elizabeth A; Thorsteinsdottir, Unnur; van Duijn, Cornelia M; Alizadeh, Behrooz Z; Bergmann, Sven; Boerwinkle, Eric; Boyd, Heather A; Crisponi, Laura; Gasparini, Paolo; Gieger, Christian; Harris, Tamara B; Ingelsson, Erik; Järvelin, Marjo-Riitta; Kraft, Peter; Lawlor, Debbie; Metspalu, Andres; Pennell, Craig E; Ridker, Paul M; Snieder, Harold; Sørensen, Thorkild IA; Spector, Tim D; Strachan, David P; Uitterlinden, André G; Wareham, Nicholas J; Widen, Elisabeth; Zygmunt, Marek; Murray, Anna; Easton, Douglas F

2014-01-01

Age at menarche is a marker of timing of puberty in females. It varies widely between individuals, is a heritable trait and is associated with risks for obesity, type 2 diabetes, cardiovascular disease, breast cancer and all-cause mortality [1]. Studies of rare human disorders of puberty and animal models point to a complex hypothalamic-pituitary-hormonal regulation [2,3], but the mechanisms that determine pubertal timing and underlie its links to disease risk remain unclear. Here, using genome-wide and custom-genotyping arrays in up to 182,416 women of European descent from 57 studies, we found robust evidence (P < 5 × 10^-8) for 123 signals at 106 genomic loci associated with age at menarche. Many loci were associated with other pubertal traits in both sexes, and there was substantial overlap with genes implicated in body mass index and various diseases, including rare disorders of puberty. Menarche signals were enriched in imprinted regions, with three loci (DLK1/WDR25, MKRN3/MAGEL2 and KCNK9) demonstrating parent-of-origin specific associations concordant with known parental expression patterns. Pathway analyses implicated nuclear hormone receptors, particularly retinoic acid and gamma-aminobutyric acid-B2 receptor signaling, among novel mechanisms that regulate pubertal timing in humans. Our findings suggest a genetic architecture involving at least hundreds of common variants in the coordinated timing of the pubertal transition. PMID:25231870
The Pathogen-Host Interactions database (PHI-base): additions and future developments.

PubMed

Urban, Martin; Pant, Rashmi; Raghunath, Arathi; Irvine, Alistair G; Pedro, Helder; Hammond-Kosack, Kim E

2015-01-01

Rapidly evolving pathogens cause a diverse array of diseases and epidemics that threaten crop yield, food security as well as human, animal and ecosystem health. To combat infection greater comparative knowledge is required on the pathogenic process in multiple species. The Pathogen-Host Interactions database (PHI-base) catalogues experimentally verified pathogenicity, virulence and effector genes from bacterial, fungal and protist pathogens. Mutant phenotypes are associated with gene information. The included pathogens infect a wide range of hosts including humans, animals, plants, insects, fish and other fungi. The current version, PHI-base 3.6, available at http://www.phi-base.org, stores information on 2875 genes, 4102 interactions, 110 host species, 160 pathogenic species (103 plant, 3 fungal and 54 animal infecting species) and 181 diseases drawn from 1243 references. Phenotypic and gene function information has been obtained by manual curation of the peer-reviewed literature. A controlled vocabulary consisting of nine high-level phenotype terms permits comparisons and data analysis across the taxonomic space. PHI-base phenotypes were mapped via their associated gene information to reference genomes available in Ensembl Genomes. Virulence genes and hotspots can be visualized directly in genome browsers. Future plans for PHI-base include development of tools facilitating community-led curation and inclusion of the corresponding host target(s). © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche.

PubMed

Perry, John Rb; Day, Felix; Elks, Cathy E; Sulem, Patrick; Thompson, Deborah J; Ferreira, Teresa; He, Chunyan; Chasman, Daniel I; Esko, Tõnu; Thorleifsson, Gudmar; Albrecht, Eva; Ang, Wei Q; Corre, Tanguy; Cousminer, Diana L; Feenstra, Bjarke; Franceschini, Nora; Ganna, Andrea; Johnson, Andrew D; Kjellqvist, Sanela; Lunetta, Kathryn L; McMahon, George; Nolte, Ilja M; Paternoster, Lavinia; Porcu, Eleonora; Smith, Albert V; Stolk, Lisette; Teumer, Alexander; Tšernikova, Natalia; Tikkanen, Emmi; Ulivi, Sheila; Wagner, Erin K; Amin, Najaf; Bierut, Laura J; Byrne, Enda M; Hottenga, Jouke-Jan; Koller, Daniel L; Mangino, Massimo; Pers, Tune H; Yerges-Armstrong, Laura M; Zhao, Jing Hua; Andrulis, Irene L; Anton-Culver, Hoda; Atsma, Femke; Bandinelli, Stefania; Beckmann, Matthias W; Benitez, Javier; Blomqvist, Carl; Bojesen, Stig E; Bolla, Manjeet K; Bonanni, Bernardo; Brauch, Hiltrud; Brenner, Hermann; Buring, Julie E; Chang-Claude, Jenny; Chanock, Stephen; Chen, Jinhui; Chenevix-Trench, Georgia; Collée, J Margriet; Couch, Fergus J; Couper, David; Coveillo, Andrea D; Cox, Angela; Czene, Kamila; D'adamo, Adamo Pio; Smith, George Davey; De Vivo, Immaculata; Demerath, Ellen W; Dennis, Joe; Devilee, Peter; Dieffenbach, Aida K; Dunning, Alison M; Eiriksdottir, Gudny; Eriksson, Johan G; Fasching, Peter A; Ferrucci, Luigi; Flesch-Janys, Dieter; Flyger, Henrik; Foroud, Tatiana; Franke, Lude; Garcia, Melissa E; García-Closas, Montserrat; Geller, Frank; de Geus, Eco Ej; Giles, Graham G; Gudbjartsson, Daniel F; Gudnason, Vilmundur; Guénel, Pascal; Guo, Suiqun; Hall, Per; Hamann, Ute; Haring, Robin; Hartman, Catharina A; Heath, Andrew C; Hofman, Albert; Hooning, Maartje J; Hopper, John L; Hu, Frank B; Hunter, David J; Karasik, David; Kiel, Douglas P; Knight, Julia A; Kosma, Veli-Matti; Kutalik, Zoltan; Lai, Sandra; Lambrechts, Diether; Lindblom, Annika; Mägi, Reedik; Magnusson, Patrik K; Mannermaa, Arto; Martin, Nicholas G; Masson, Gisli; McArdle, Patrick F; McArdle, Wendy L; Melbye, 
Mads; Michailidou, Kyriaki; Mihailov, Evelin; Milani, Lili; Milne, Roger L; Nevanlinna, Heli; Neven, Patrick; Nohr, Ellen A; Oldehinkel, Albertine J; Oostra, Ben A; Palotie, Aarno; Peacock, Munro; Pedersen, Nancy L; Peterlongo, Paolo; Peto, Julian; Pharoah, Paul Dp; Postma, Dirkje S; Pouta, Anneli; Pylkäs, Katri; Radice, Paolo; Ring, Susan; Rivadeneira, Fernando; Robino, Antonietta; Rose, Lynda M; Rudolph, Anja; Salomaa, Veikko; Sanna, Serena; Schlessinger, David; Schmidt, Marjanka K; Southey, Mellissa C; Sovio, Ulla; Stampfer, Meir J; Stöckl, Doris; Storniolo, Anna M; Timpson, Nicholas J; Tyrer, Jonathan; Visser, Jenny A; Vollenweider, Peter; Völzke, Henry; Waeber, Gerard; Waldenberger, Melanie; Wallaschofski, Henri; Wang, Qin; Willemsen, Gonneke; Winqvist, Robert; Wolffenbuttel, Bruce Hr; Wright, Margaret J; Boomsma, Dorret I; Econs, Michael J; Khaw, Kay-Tee; Loos, Ruth Jf; McCarthy, Mark I; Montgomery, Grant W; Rice, John P; Streeten, Elizabeth A; Thorsteinsdottir, Unnur; van Duijn, Cornelia M; Alizadeh, Behrooz Z; Bergmann, Sven; Boerwinkle, Eric; Boyd, Heather A; Crisponi, Laura; Gasparini, Paolo; Gieger, Christian; Harris, Tamara B; Ingelsson, Erik; Järvelin, Marjo-Riitta; Kraft, Peter; Lawlor, Debbie; Metspalu, Andres; Pennell, Craig E; Ridker, Paul M; Snieder, Harold; Sørensen, Thorkild Ia; Spector, Tim D; Strachan, David P; Uitterlinden, André G; Wareham, Nicholas J; Widen, Elisabeth; Zygmunt, Marek; Murray, Anna; Easton, Douglas F; Stefansson, Kari; Murabito, Joanne M; Ong, Ken K

2014-10-02

Age at menarche is a marker of timing of puberty in females. It varies widely between individuals, is a heritable trait and is associated with risks for obesity, type 2 diabetes, cardiovascular disease, breast cancer and all-cause mortality. Studies of rare human disorders of puberty and animal models point to a complex hypothalamic-pituitary-hormonal regulation, but the mechanisms that determine pubertal timing and underlie its links to disease risk remain unclear. Here, using genome-wide and custom-genotyping arrays in up to 182,416 women of European descent from 57 studies, we found robust evidence (P < 5 × 10^-8) for 123 signals at 106 genomic loci associated with age at menarche. Many loci were associated with other pubertal traits in both sexes, and there was substantial overlap with genes implicated in body mass index and various diseases, including rare disorders of puberty. Menarche signals were enriched in imprinted regions, with three loci (DLK1-WDR25, MKRN3-MAGEL2 and KCNK9) demonstrating parent-of-origin-specific associations concordant with known parental expression patterns. Pathway analyses implicated nuclear hormone receptors, particularly retinoic acid and γ-aminobutyric acid-B2 receptor signalling, among novel mechanisms that regulate pubertal timing in humans. Our findings suggest a genetic architecture involving at least hundreds of common variants in the coordinated timing of the pubertal transition.
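The significance cut-off of 5 × 10^-8 used in this record is the conventional genome-wide level, which matches a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent tests. The abstract does not say how the threshold was derived, so the sketch below is only an illustration of that convention; the function names and the example P values are hypothetical.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def is_genome_wide_significant(p_value, alpha=0.05, n_tests=1_000_000):
    """True if p_value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)

threshold = bonferroni_threshold(0.05, 1_000_000)
print(math.isclose(threshold, 5e-8))       # threshold equals 5e-8 up to rounding
print(is_genome_wide_significant(3e-9))    # hypothetical strong signal
print(is_genome_wide_significant(1e-6))    # hypothetical suggestive signal
```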
Hormone escape is associated with genomic instability in a human prostate cancer model.

PubMed

Legrier, Marie-Emmanuelle; Guyader, Charlotte; Céraline, Jocelyn; Dutrillaux, Bernard; Oudard, Stéphane; Poupon, Marie-France; Auger, Nathalie

2009-03-01

Lack of hormone dependency in prostate cancers is an irreversible event that occurs through generation of genomic instability induced by androgen deprivation. Indeed, the cytogenetic profile of hormone-dependent (HD) prostate cancer remains stable as long as it receives a hormone supply, whereas the profile of hormone-independent (HID) variants acquires new and various alterations. This is demonstrated here using an HD xenografted model of a human prostate cancer, PAC120, transplanted for 11 years into male nude mice, and 4 HID variants obtained by surgical castration. Cytogenetic analysis, done by karyotype, FISH, CGH and array-CGH, shows that PAC120 at early passage presents numerous chromosomal alterations. Very few additional alterations were found between the 5th and 47th passages, indicating the stability of the parental tumor. HID variants largely maintained the core of chromosomal alterations of PAC120: losses at 6q, 7p, 12q, 15q and 17q sites. However, each HID variant displayed a number of new alterations, almost all being specific to each variant and very few shared by all. None of the HID variants had androgen receptor mutations. Our study indicates that hormone castration is responsible for genomic instability generating new cytogenetic abnormalities likely to alter the properties of cancer cells associated with tumor progression, such as increased cell survival and ability to metastasize.
High-throughput single-molecule telomere characterization.

PubMed

McCaffrey, Jennifer; Young, Eleanor; Lassahn, Katy; Sibert, Justin; Pastor, Steven; Riethman, Harold; Xiao, Ming

2017-11-01

We have developed a novel method that enables global subtelomere and haplotype-resolved analysis of telomere lengths at the single-molecule level. An in vitro CRISPR/Cas9 RNA-directed nickase system directs the specific labeling of human (TTAGGG)n DNA tracts in genomes that have also been barcoded using a separate nickase enzyme that recognizes a 7-bp motif genome-wide. High-throughput imaging and analysis of large DNA single molecules from genomes labeled in this fashion using a nanochannel array system permits mapping through subtelomere repeat element (SRE) regions to unique chromosomal DNA while simultaneously measuring the (TTAGGG)n tract length at the end of each large telomere-terminal DNA segment. The methodology also permits subtelomere and haplotype-resolved analyses of SRE organization and variation, providing a window into the population dynamics and potential functions of these complex and structurally variant telomere-adjacent DNA regions. At its current stage of development, the assay can be used to identify and characterize telomere length distributions of 30-35 discrete telomeres simultaneously and accurately. The assay's utility is demonstrated using early versus late passage and senescent human diploid fibroblasts, documenting the anticipated telomere attrition on a global telomere-by-telomere basis as well as identifying subtelomere-specific biases for critically short telomeres. Similarly, we present the first global single-telomere-resolved analyses of two cancer cell lines. © 2017 McCaffrey et al.; Published by Cold Spring Harbor Laboratory Press.
Characterization of hemizygous deletions in Citrus using array-Comparative Genomic Hybridization and microsynteny comparisons with the poplar genome

PubMed Central

Ríos, Gabino; Naranjo, Miguel A; Iglesias, Domingo J; Ruiz-Rivero, Omar; Geraud, Marion; Usach, Antonio; Talón, Manuel

2008-01-01

Background: Many fruit-tree species, including relevant Citrus spp. varieties, exhibit a reproductive biology that impairs breeding and strongly constrains genetic improvements. In citrus, juvenility increases the generation time while sexual sterility, inbreeding depression and self-incompatibility prevent the production of homozygous cultivars. Genomic technology may provide citrus researchers with a new set of tools to address these various restrictions. In this work, we report a valuable genomics-based protocol for the structural analysis of deletion mutations on a heterozygous background. Results: Two independent fast neutron mutants of self-incompatible clementine (Citrus clementina Hort. Ex Tan. cv. Clemenules) were the subject of the study. Both mutants, named 39B3 and 39E7, were expected to carry DNA deletions in hemizygous dosage. Array-based Comparative Genomic Hybridization (array-CGH) using a Citrus cDNA microarray allowed the identification of underrepresented genes in these two mutants. Subsequent comparison of citrus deleted genes with annotated plant genomes, especially poplar, made it possible to predict the presence of a large deletion in 39B3 of about 700 kb and at least two deletions of approximately 100 and 500 kb in 39E7. The deletion in 39B3 was further characterized by PCR on available Citrus BACs, which helped us to build a partial physical map of the deletion. Among the deleted genes, a ClpC-like gene coding for a putative subunit of a multifunctional chloroplastic protease involved in the regulation of chlorophyll b synthesis was directly related to the mutated phenotype, since the mutant showed a reduced chlorophyll a/b ratio in green tissues. Conclusion: In this work, we report the use of array-CGH for the successful identification of genes included in a hemizygous deletion induced by fast neutron irradiation on Citrus clementina. The study of gene content and order within the 39B3 deletion also led to the unexpected conclusion that microsynteny and local gene colinearity in this species were higher with Populus trichocarpa than with the phylogenetically closer Arabidopsis thaliana. This work corroborates the potential of Citrus genomic resources to assist mutagenesis-based approaches for functional genetics, structural studies and comparative genomics, and hence to facilitate citrus variety improvement. PMID:18691431
A Six Months Exercise Intervention Influences the Genome-wide DNA Methylation Pattern in Human Adipose Tissue

PubMed Central

Rönn, Tina; Volkov, Petr; Davegårdh, Cajsa; Dayeh, Tasnim; Hall, Elin; Olsson, Anders H.; Nilsson, Emma; Tornberg, Åsa; Dekker Nitert, Marloes; Eriksson, Karl-Fredrik; Jones, Helena A.; Groop, Leif; Ling, Charlotte

2013-01-01

Epigenetic mechanisms are implicated in gene regulation and the development of different diseases. The epigenome differs between cell types and has until now only been characterized for a few human tissues. Environmental factors potentially alter the epigenome. Here we describe the genome-wide pattern of DNA methylation in human adipose tissue from 23 healthy men, with a previous low level of physical activity, before and after a six months exercise intervention. We also investigate the differences in adipose tissue DNA methylation between 31 individuals with or without a family history of type 2 diabetes. DNA methylation was analyzed using Infinium HumanMethylation450 BeadChip, an array containing 485,577 probes covering 99% RefSeq genes. Global DNA methylation changed and 17,975 individual CpG sites in 7,663 unique genes showed altered levels of DNA methylation after the exercise intervention (q<0.05). Differential mRNA expression was present in 1/3 of gene regions with altered DNA methylation, including RALBP1, HDAC4 and NCOR2 (q<0.05). Using a luciferase assay, we could show that increased DNA methylation in vitro of the RALBP1 promoter suppressed the transcriptional activity (p = 0.03). Moreover, 18 obesity and 21 type 2 diabetes candidate genes had CpG sites with differences in adipose tissue DNA methylation in response to exercise (q<0.05), including TCF7L2 (6 CpG sites) and KCNQ1 (10 CpG sites). A simultaneous change in mRNA expression was seen for 6 of those genes. To understand if genes that exhibit differential DNA methylation and mRNA expression in human adipose tissue in vivo affect adipocyte metabolism, we silenced Hdac4 and Ncor2 respectively in 3T3-L1 adipocytes, which resulted in increased lipogenesis both in the basal and insulin stimulated state. In conclusion, exercise induces genome-wide changes in DNA methylation in human adipose tissue, potentially affecting adipocyte metabolism. PMID:23825961
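The q<0.05 thresholds quoted in the abstract above are false discovery rate cut-offs applied across many simultaneous tests. As a minimal sketch, assuming the Benjamini-Hochberg procedure (the abstract does not state which FDR method was used), q-values could be derived from raw p-values roughly as follows. The function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumption: BH is used here only as a common FDR procedure; the source
    abstract does not name the method behind its q-values.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values for four CpG sites (illustrative only).
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
significant = [q < 0.05 for q in example_q]
print(example_q, significant)
```

With real array data the same idea applies to hundreds of thousands of probes, where an established statistics library would normally be preferred over a hand-written loop.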
Holocentromeres in Rhynchospora are associated with genome-wide centromere-specific repeat arrays interspersed among euchromatin.

PubMed

Marques, André; Ribeiro, Tiago; Neumann, Pavel; Macas, Jiří; Novák, Petr; Schubert, Veit; Pellino, Marco; Fuchs, Jörg; Ma, Wei; Kuhlmann, Markus; Brandt, Ronny; Vanzela, André L L; Beseda, Tomáš; Šimková, Hana; Pedrosa-Harand, Andrea; Houben, Andreas

2015-11-03

Holocentric chromosomes lack a primary constriction, in contrast to monocentrics. They form kinetochores distributed along almost the entire poleward surface of the chromatids, to which spindle fibers attach. No centromere-specific DNA sequence has been found for any holocentric organism studied so far. It was proposed that centromeric repeats, typical for many monocentric species, could not occur in holocentrics, most likely because of differences in the centromere organization. Here we show that the holokinetic centromeres of the Cyperaceae Rhynchospora pubera are highly enriched by a centromeric histone H3 variant-interacting centromere-specific satellite family designated "Tyba" and by centromeric retrotransposons (i.e., CRRh) occurring as genome-wide interspersed arrays. Centromeric arrays vary in length from 3 to 16 kb and are intermingled with gene-coding sequences and transposable elements. We show that holocentromeres of metaphase chromosomes are composed of multiple centromeric units rather than possessing a diffuse organization, thus favoring the polycentric model. A cell-cycle-dependent shuffling of multiple centromeric units results in the formation of functional (poly)centromeres during mitosis. The genome-wide distribution of centromeric repeat arrays interspersing the euchromatin provides a previously unidentified type of centromeric chromatin organization among eukaryotes. Thus, different types of holocentromeres exist in different species, namely with and without centromeric repetitive sequences.
Whole-genome random sequencing and assembly of Haemophilus influenzae Rd

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fleischmann, R.D.; Adams, M.D.; White, O.

1995-07-28

An approach for genome analysis based on sequencing and assembly of unselected pieces of DNA from the whole chromosome has been applied to obtain the complete nucleotide sequence (1,830,137 base pairs) of the genome from the bacterium Haemophilus influenzae Rd. This approach eliminates the need for initial mapping efforts and is therefore applicable to the vast array of microbial species for which genome maps are unavailable. The H. influenzae Rd genome sequence (Genome Sequence DataBase accession number L42023) represents the only complete genome sequence from a free-living organism. 46 refs., 4 figs., 4 tabs.
Refinement of light-responsive transcript lists using rice oligonucleotide arrays: evaluation of gene-redundancy.

PubMed

Jung, Ki-Hong; Dardick, Christopher; Bartley, Laura E; Cao, Peijian; Phetsom, Jirapa; Canlas, Patrick; Seo, Young-Su; Shultz, Michael; Ouyang, Shu; Yuan, Qiaoping; Frank, Bryan C; Ly, Eugene; Zheng, Li; Jia, Yi; Hsia, An-Ping; An, Kyungsook; Chou, Hui-Hsien; Rocke, David; Lee, Geun Cheol; Schnable, Patrick S; An, Gynheung; Buell, C Robin; Ronald, Pamela C

2008-10-06

Studies of gene function are often hampered by gene-redundancy, especially in organisms with large genomes such as rice (Oryza sativa). We present an approach for using transcriptomics data to focus functional studies and address redundancy. To this end, we have constructed and validated an inexpensive and publicly available rice oligonucleotide near-whole genome array, called the rice NSF45K array. We generated expression profiles for light- vs. dark-grown rice leaf tissue and validated the biological significance of the data by analyzing sources of variation and confirming expression trends with reverse transcription polymerase chain reaction. We examined trends in the data by evaluating enrichment of gene ontology terms at multiple false discovery rate thresholds. To compare data generated with the NSF45K array with published results, we developed publicly available, web-based tools (www.ricearray.org). The Oligo and EST Anatomy Viewer enables visualization of EST-based expression profiling data for all genes on the array. The Rice Multi-platform Microarray Search Tool facilitates comparison of gene expression profiles across multiple rice microarray platforms. Finally, we incorporated gene expression and biochemical pathway data to reduce the number of candidate gene products putatively participating in the eight steps of the photorespiration pathway from 52 to 10, based on expression levels of putatively functionally redundant genes. We confirmed the efficacy of this method to cope with redundancy by correctly predicting participation in photorespiration of a gene with five paralogs. Applying these methods will accelerate rice functional genomics.
Targeted capture and resequencing of 1040 genes reveal environmentally driven functional variation in grey wolves.

PubMed

Schweizer, Rena M; Robinson, Jacqueline; Harrigan, Ryan; Silva, Pedro; Galverni, Marco; Musiani, Marco; Green, Richard E; Novembre, John; Wayne, Robert K

2016-01-01

In an era of ever-increasing amounts of whole-genome sequence data for individuals and populations, the utility of traditional single nucleotide polymorphisms (SNPs) array-based genome scans is uncertain. We previously performed a SNP array-based genome scan to identify candidate genes under selection in six distinct grey wolf (Canis lupus) ecotypes. Using this information, we designed a targeted capture array for 1040 genes, including all exons and flanking regions, as well as 5000 1-kb nongenic neutral regions, and resequenced these regions in 107 wolves. Selection tests revealed striking patterns of variation within candidate genes relative to noncandidate regions and identified potentially functional variants related to local adaptation. We found 27% and 47% of candidate genes from the previous SNP array study had functional changes that were outliers in sweed and bayenv analyses, respectively. This result verifies the use of genomewide SNP surveys to tag genes that contain functional variants between populations. We highlight nonsynonymous variants in APOB, LIPG and USH2A that occur in functional domains of these proteins, and that demonstrate high correlation with precipitation seasonality and vegetation. We find Arctic and High Arctic wolf ecotypes have higher numbers of genes under selection, which highlight their conservation value and heightened threat due to climate change. This study demonstrates that combining genomewide genotyping arrays with large-scale resequencing and environmental data provides a powerful approach to discern candidate functional variants in natural populations. © 2015 John Wiley & Sons Ltd.
Development of an Influenza virus protein array using Sortagging technology

PubMed Central

Sinisi, Antonia; Popp, Maximilian Wei-Lin; Antos, John M.; Pansegrau, Werner; Savino, Silvana; Nissum, Mikkel; Rappuoli, Rino; Ploegh, Hidde L.; Buti, Ludovico

2013-01-01

Protein array technology is an emerging tool that enables high throughput screening of protein-protein or protein-lipid interactions and identification of immunodominant antigens during the course of a bacterial or viral infection. In this work we developed an Influenza virus protein array using the sortase-mediated transpeptidation reaction known as “Sortagging”. LPETG-tagged Influenza virus proteins from bacterial and eukaryotic cellular extracts were immobilized at their carboxyl-termini onto a pre-activated amine-glass slide coated with a Gly3 linker. Immobilized proteins were revealed by specific antibodies and the newly generated Sortag-protein chip can be used as a device for antigen and/or antibody screening. The specificity of the Sortase A (SrtA) reaction avoids purification steps in array building and allows immobilization of proteins in an oriented fashion. Previously, this versatile technology has been successfully employed for protein labeling and protein conjugation. Here, the tool is implemented to covalently link proteins of a viral genome onto a solid support. The system could readily be scaled up to proteins of larger genomes in order to develop protein arrays for high throughput screening. PMID:22594688
Gene Chips: A New Tool for Biology

NASA Astrophysics Data System (ADS)

Botstein, David

2005-03-01

The knowledge of many complete genomic sequences has led to a "grand unification of biology," consisting of direct evidence that most of the basic cellular functions of all organisms are carried out by genes and proteins whose primary sequences are directly related by descent (i.e. orthologs). Further, genome sequences have made it possible to study all the genes of a single organism simultaneously. We have been using DNA microarrays (sometimes referred to as "gene chips") to study patterns of gene expression and genome rearrangement in yeast and human cells under a variety of conditions and in human tumors and normal tissues. These experiments produce huge volumes of data; new computational and statistical methods are required to analyze them properly. Examples from this work will be presented to illustrate how genome-scale experiments and analysis can result in new biological insights not obtainable by traditional analyses of genes and proteins one by one. For lymphomas, breast tumors, lung tumors, liver tumors, gastric tumors, brain tumors and soft tissue tumors we have been able, by the application of clustering algorithms, to subclassify tumors of similar anatomical origin on the basis of their gene expression patterns. These subclassifications appear to be reproducible and clinically as well as biologically meaningful. By studying synchronized cells growing in culture, we have identified many hundreds of yeast and human genes that are expressed periodically, at characteristically different points in the cell division cycle. In humans, it turns out that most of these genes are the same genes that comprise the "proliferation cluster," i.e. the genes whose expression is specifically associated with the proliferativeness of tumors and tumor cell lines. Finally, we have been applying a variant of our DNA microarray technology (which we call "array comparative hybridization") to follow the DNA copy number of genes, both in tumors and in yeast cells undergoing adaptive evolution during hundreds of generations of growth in continuous culture. These studies suggest a basic similarity in mechanism between adaptive evolution in yeast and tumor progression in humans.
The Glaciozyma antarctica genome reveals an array of systems that provide sustained responses towards temperature variations in a persistently cold habitat

PubMed Central

Hashim, Noor Haza Fazlin; Bharudin, Izwan; Abu Bakar, Mohd Faizal; Huang, Kie Kyon; Alias, Halimah; Lee, Bernard K. B.; Mat Isa, Mohd Noor; Mat-Sharani, Shuhaila; Sulaiman, Suhaila; Tay, Lih Jinq; Zolkefli, Radziah; Muhammad Noor, Yusuf; Law, Douglas Sie Nguong; Abdul Rahman, Siti Hamidah; Md-Illias, Rosli; Abu Bakar, Farah Diba; Najimudin, Nazalan; Abdul Murad, Abdul Munir; Mahadi, Nor Muhammad

2018-01-01

Extremely low temperatures present various challenges to life that include ice formation and effects on metabolic capacity. Psychrophilic microorganisms typically have an array of mechanisms to enable survival in cold temperatures. In this study, we sequenced and analysed the genome of a psychrophilic yeast isolated in the Antarctic region, Glaciozyma antarctica. The genome annotation identified 7857 protein coding sequences. From the genome sequence analysis we were able to identify genes that encoded for proteins known to be associated with cold survival, in addition to annotating genes that are unique to G. antarctica. For genes that are known to be involved in cold adaptation such as anti-freeze proteins (AFPs), our gene expression analysis revealed that they were differentially transcribed over time and in response to different temperatures. This indicated the presence of an array of adaptation systems that can respond to a changing but persistent cold environment. We were also able to validate the activity of all the AFPs annotated where the recombinant AFPs demonstrated anti-freeze capacity. This work is an important foundation for further collective exploration into psychrophilic microbiology where among other potential, the genes unique to this species may represent a pool of novel mechanisms for cold survival. PMID:29385175
Colloidal silica films for high-capacity DNA arrays

NASA Astrophysics Data System (ADS)

Glazer, Marc Irving

The human genome project has greatly expanded the amount of genetic information available to researchers, but before this vast new source of data can be fully utilized, techniques for rapid, large-scale analysis of DNA and RNA must continue to develop. DNA arrays have emerged as a powerful new technology for analyzing genomic samples in a highly parallel format. The detection sensitivity of these arrays is dependent on the quantity and density of immobilized probe molecules. We have investigated substrates with a porous, "three-dimensional" surface layer as a means of increasing the surface area available for the synthesis of oligonucleotide probes, thereby increasing the number of available probes and the amount of detectable bound target. Porous colloidal silica films were created by two techniques. In the first approach, films were deposited by spin-coating silica colloid suspensions onto flat glass substrates, with the pores being formed by the natural voids between the solid particles (typically 23 nm pores, 35% porosity). In the second approach, latex particles were co-deposited with the silica and then pyrolyzed, creating films with larger pores (36 nm), higher porosity (65%), and higher surface area. For 0.3 μm films, enhancements of eight- to ten-fold and 12- to 14-fold were achieved with the pure silica films and the films "templated" with polymer latex, respectively. In gene expression assays for up to 7,000 genes using complex biological samples, the high-capacity films provided enhanced signals and performed equivalently or better than planar glass on all other functional measures, confirming that colloidal silica films are a promising platform for high-capacity DNA arrays. We have also investigated the kinetics of hybridization on planar glass and high-capacity substrates. Adsorption on planar arrays is similar to ideal Langmuir-type adsorption, although with an "overshoot" at high solution concentration. Hybridization on high-capacity films is controlled by traditional adsorption (ka) and desorption (kd) coefficients, as well as morphology factors and transient binding interactions between the target and probes. The strength of the transient probe/target binding interactions is on the order of 5-7 DNA base pairs, which suggests the formation of nucleation or other metastable complexes, rather than fully-zippered duplexes.
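The ideal Langmuir-type adsorption mentioned in the abstract above can be illustrated with a short sketch. It assumes the textbook single-site model, dθ/dt = ka·C·(1 − θ) − kd·θ, whose closed-form solution for an initially empty surface is θ(t) = θ_eq·(1 − exp(−(ka·C + kd)·t)) with θ_eq = ka·C / (ka·C + kd). The rate constants and target concentration below are hypothetical placeholders, not values from the dissertation, and the sketch does not attempt to model the reported "overshoot", morphology factors, or transient binding interactions.

```python
import math


def langmuir_coverage(t, ka, kd, conc):
    """Fraction of probe sites occupied at time t under ideal Langmuir kinetics.

    t    : time in seconds
    ka   : adsorption (association) rate constant, M^-1 s^-1
    kd   : desorption (dissociation) rate constant, s^-1
    conc : target concentration in solution, M (assumed constant)
    """
    rate = ka * conc + kd          # observed relaxation rate, s^-1
    theta_eq = ka * conc / rate    # equilibrium coverage
    return theta_eq * (1.0 - math.exp(-rate * t))


# Hypothetical parameters chosen only for illustration.
ka = 1.0e5     # M^-1 s^-1
kd = 1.0e-4    # s^-1
conc = 1.0e-9  # M

for t in (0, 600, 3600, 36000):
    print(t, round(langmuir_coverage(t, ka, kd, conc), 4))
```

In this idealized picture the coverage rises monotonically toward θ_eq, so any overshoot of the kind described for high solution concentrations would show up as a departure from the curve.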
Chromosome copy number variation in telomerized human bone marrow stromal cells; insights for monitoring safe ex-vivo expansion of adult stem cells.

PubMed

Burns, Jorge S; Harkness, Linda; Aldahmash, Abdullah; Gautier, Laurent; Kassem, Moustapha

2017-12-01

Adult human bone marrow stromal cells (hBMSC) cultured for cell therapy require evaluation of potency and stability for safe use. Chromosomal aberrations upsetting genomic integrity in such cells have been contrastingly described as "Limited" or "Significant". Previously reported stepwise acquisition of a spontaneous neoplastic phenotype during three-year continuous culture of telomerized cells (hBMSC-TERT20) did not alter a diploid karyotype measured by spectral karyotype analysis (SKY). Such screening may not adequately monitor abnormal and potentially tumorigenic hBMSC in clinical scenarios. We here used array comparative genomic hybridization (aCGH) to more stringently compare non-tumorigenic parental hBMSC-TERT strains with their tumorigenic subcloned populations. Confirmation of a known chromosome 9p21 microdeletion at locus CDKN2A/B showed it also impinged upon the adjacent MTAP gene. Compared to reference diploid human fibroblast genomic DNA, the non-tumorigenic hBMSC-TERT4 cells had a copy number variation (CNV) in at least 14 independent loci. The pre-tumorigenic hBMSC-TERT20 cell strain had further CNV including 1q44 gain enhancing SMYD3 expression and 11q13.1 loss downregulating MUS81 expression. Bioinformatic analysis of gene products reflecting 11p15.5 CNV gain in tumorigenic hBMSC-TERT20 cells highlighted networks implicated in tumorigenic progression involving cell cycle control and mismatch repair. We provide novel biomarkers for prospective risk assessment of expanded stem cell cultures. Copyright © 2017. Published by Elsevier B.V.

A comprehensive catalog of human KRAB-associated zinc finger genes: Insights into the evolutionary history of a large family of transcriptional repressors

PubMed Central

Huntley, Stuart; Baggott, Daniel M.; Hamilton, Aaron T.; Tran-Gyamfi, Mary; Yang, Shan; Kim, Joomyeong; Gordon, Laurie; Branscomb, Elbert; Stubbs, Lisa

2006-01-01

Krüppel-type zinc finger (ZNF) motifs are prevalent components of transcription factor proteins in all eukaryotes. KRAB-ZNF proteins, in which a potent repressor domain is attached to a tandem array of DNA-binding zinc-finger motifs, are specific to tetrapod vertebrates and represent the largest class of ZNF proteins in mammals. To define the full repertoire of human KRAB-ZNF proteins, we searched the genome sequence for key motifs and then constructed and manually curated gene models incorporating those sequences. The resulting gene catalog contains 423 KRAB-ZNF protein-coding loci, yielding alternative transcripts that altogether predict at least 742 structurally distinct proteins. Active rounds of segmental duplication, involving single genes or larger regions and including both tandem and distributed duplication events, have driven the expansion of this mammalian gene family. Comparisons between the human genes and ZNF loci mined from the draft mouse, dog, and chimpanzee genomes not only identified 103 KRAB-ZNF genes that are conserved in mammals but also highlighted a substantial level of lineage-specific change; at least 136 KRAB-ZNF coding genes are primate specific, including many recent duplicates. KRAB-ZNF genes are widely expressed and clustered genes are typically not coregulated, indicating that paralogs have evolved to fill roles in many different biological processes. To facilitate further study, we have developed a Web-based public resource with access to gene models, sequences, and other data, including visualization tools to provide genomic context and interaction with other public data sets. PMID:16606702
Brains, genes, and primates.

PubMed

Izpisua Belmonte, Juan Carlos; Callaway, Edward M; Caddick, Sarah J; Churchland, Patricia; Feng, Guoping; Homanics, Gregg E; Lee, Kuo-Fen; Leopold, David A; Miller, Cory T; Mitchell, Jude F; Mitalipov, Shoukhrat; Moutri, Alysson R; Movshon, J Anthony; Okano, Hideyuki; Reynolds, John H; Ringach, Dario; Sejnowski, Terrence J; Silva, Afonso C; Strick, Peter L; Wu, Jun; Zhang, Feng

2015-05-06

One of the great strengths of the mouse model is the wide array of genetic tools that have been developed. Striking examples include methods for directed modification of the genome, and for regulated expression or inactivation of genes. Within neuroscience, it is now routine to express reporter genes, neuronal activity indicators, and opsins in specific neuronal types in the mouse. However, there are considerable anatomical, physiological, cognitive, and behavioral differences between the mouse and the human that, in some areas of inquiry, limit the degree to which insights derived from the mouse can be applied to understanding human neurobiology. Several recent advances have now brought into reach the goal of applying these tools to understanding the primate brain. Here we describe these advances, consider their potential to advance our understanding of the human brain and brain disorders, discuss bioethical considerations, and describe what will be needed to move forward. Copyright © 2015 Elsevier Inc. All rights reserved.
Brains, Genes and Primates

PubMed Central

Belmonte, Juan Carlos Izpisua; Callaway, Edward M.; Churchland, Patricia; Caddick, Sarah J.; Feng, Guoping; Homanics, Gregg E.; Lee, Kuo-Fen; Leopold, David A.; Miller, Cory T.; Mitchell, Jude F.; Mitalipov, Shoukhrat; Moutri, Alysson R.; Movshon, J. Anthony; Okano, Hideyuki; Reynolds, John H.; Ringach, Dario; Sejnowski, Terrence J.; Silva, Afonso C.; Strick, Peter L.; Wu, Jun; Zhang, Feng

2015-01-01

One of the great strengths of the mouse model is the wide array of genetic tools that have been developed. Striking examples include methods for directed modification of the genome, and for regulated expression or inactivation of genes. Within neuroscience, it is now routine to express reporter genes, neuronal activity indicators and opsins in specific neuronal types in the mouse. However, there are considerable anatomical, physiological, cognitive and behavioral differences between the mouse and the human that, in some areas of inquiry, limit the degree to which insights derived from the mouse can be applied to understanding human neurobiology. Several recent advances have now brought into reach the goal of applying these tools to understanding the primate brain. Here we describe these advances, consider their potential to advance our understanding of the human brain and brain disorders, discuss bioethical considerations, and describe what will be needed to move forward. PMID:25950631
Discovery of 100K SNP array and its utilization in sugarcane

USDA-ARS's Scientific Manuscript database

Next-generation sequencing (NGS) enables us to identify thousands of single nucleotide polymorphism (SNP) markers for genotyping and fingerprinting. However, the process requires very precise bioinformatics analysis and filtering. High throughput SNP array with predefined genomic location co...
High-density single nucleotide polymorphism (SNP) array mapping in Brassica oleracea: identification of QTL associated with carotenoid variation in broccoli florets.

PubMed

Brown, Allan F; Yousef, Gad G; Chebrolu, Kranthi K; Byrd, Robert W; Everhart, Koyt W; Thomas, Aswathy; Reid, Robert W; Parkin, Isobel A P; Sharpe, Andrew G; Oliver, Rebekah; Guzman, Ivette; Jackson, Eric W

2014-09-01

A high-resolution genetic linkage map of B. oleracea was developed from a B. napus SNP array. The work will facilitate genetic and evolutionary studies in Brassicaceae. A broccoli population, VI-158 × BNC, consisting of 150 F2:3 families was used to create a saturated Brassica oleracea (diploid: CC) linkage map using a recently developed rapeseed (Brassica napus) (tetraploid: AACC) Illumina Infinium single nucleotide polymorphism (SNP) array. The map consisted of 547 non-redundant SNP markers spanning 948.1 cM across nine chromosomes with an average interval size of 1.7 cM. As the SNPs are anchored to the genomic reference sequence of the rapid cycling B. oleracea TO1000, we were able to estimate that the map provides 96% coverage of the diploid genome. Carotenoid analysis of 2 years of data identified 3 QTLs on two chromosomes that are associated with up to half of the phenotypic variation associated with the accumulation of total or individual compounds. By searching the genome sequences of the two related diploid species (B. oleracea and B. rapa), we further identified putative carotenoid candidate genes in the region of these QTLs. This is the first description of the use of a B. napus SNP array to rapidly construct high-density genetic linkage maps of one of the constituent diploid species. The unambiguous nature of these markers with regard to genomic sequences provides evidence as to the nature of genes underlying the QTL, and demonstrates the value and impact this resource will have on Brassica research.
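The map statistics quoted in the abstract above (547 markers, 948.1 cM, nine chromosomes, 1.7 cM average interval) can be checked with simple arithmetic. The sketch below shows two common ways of expressing marker density. The abstract does not state which convention the authors used, so both are given as illustrations, and the variable names are hypothetical.

```python
# Figures taken from the abstract.
map_length_cm = 948.1   # total genetic map length in centimorgans
n_markers = 547         # non-redundant SNP markers
n_chromosomes = 9       # linkage groups / chromosomes

# Convention 1: map length divided by the number of markers.
cm_per_marker = map_length_cm / n_markers

# Convention 2: map length divided by the number of marker intervals,
# assuming each chromosome with k markers contributes k - 1 intervals.
n_intervals = n_markers - n_chromosomes
cm_per_interval = map_length_cm / n_intervals

print(f"{cm_per_marker:.2f} cM per marker")
print(f"{cm_per_interval:.2f} cM per interval")
```

Dividing by the number of markers gives a value that rounds to the reported 1.7 cM, while dividing by the number of intervals gives a slightly larger figure.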
Genomic Tools in Pea Breeding Programs: Status and Perspectives

PubMed Central

Tayeh, Nadim; Aubert, Grégoire; Pilet-Nayel, Marie-Laure; Lejeune-Hénaut, Isabelle; Warkentin, Thomas D.; Burstin, Judith

2015-01-01

Pea (Pisum sativum L.) is an annual cool-season legume and one of the oldest domesticated crops. Dry pea seeds contain 22–25% protein, complex starch and fiber constituents, and a rich array of vitamins, minerals, and phytochemicals which make them a valuable source for human consumption and livestock feed. Dry pea ranks third to common bean and chickpea as the most widely grown pulse in the world with more than 11 million tons produced in 2013. Pea breeding has achieved great success since the time of Mendel's experiments in the mid-1800s. However, several traits still require significant improvement for better yield stability in a larger growing area. Key breeding objectives in pea include improving biotic and abiotic stress resistance and enhancing yield components and seed quality. Taking advantage of the diversity present in the pea genepool, many mapping populations have been constructed in the last decades and efforts have been deployed to identify loci involved in the control of target traits and further introgress them into elite breeding materials. Pea now benefits from next-generation sequencing and high-throughput genotyping technologies that are paving the way for genome-wide association studies and genomic selection approaches. This review covers the significant development and deployment of genomic tools for pea breeding in recent years. Future prospects are discussed especially in light of current progress toward deciphering the pea genome. PMID:26640470
Brachypodium distachyon genetic resources

USDA-ARS's Scientific Manuscript database

Brachypodium distachyon is a well-established model species for the grass family Poaceae. It possesses an array of features that make it suited for this purpose, including a small sequenced genome, simple transformation methods, and additional functional genomics tools. However, the most critical to...
Low-Pass Genome-Wide Sequencing and Variant Inference Using Identity-by-Descent in an Isolated Human Population

PubMed Central

Gusev, A.; Shah, M. J.; Kenny, E. E.; Ramachandran, A.; Lowe, J. K.; Salit, J.; Lee, C. C.; Levandowsky, E. C.; Weaver, T. N.; Doan, Q. C.; Peckham, H. E.; McLaughlin, S. F.; Lyons, M. R.; Sheth, V. N.; Stoffel, M.; De La Vega, F. M.; Friedman, J. M.; Breslow, J. L.

2012-01-01

Whole-genome sequencing in an isolated population with few founders directly ascertains variants from the population bottleneck that may be rare elsewhere. In such populations, shared haplotypes allow imputation of variants in unsequenced samples without resorting to complex statistical methods as in studies of outbred cohorts. We focus on an isolated population cohort from the Pacific Island of Kosrae, Micronesia, where we previously collected SNP array and rich phenotype data for the majority of the population. We report identification of long regions with haplotypes co-inherited between pairs of individuals and methodology to leverage such shared genetic content for imputation. Our estimates show that sequencing as few as 40 personal genomes allows for inference in up to 60% of the 3000-person cohort at the average locus. We ascertained a pilot data set of whole-genome sequences from seven Kosraean individuals, with average 5× coverage. This assay identified 5,735,306 unique sites of which 1,212,831 were previously unknown. Additionally, these variants are unusually enriched for alleles that are rare in other populations when compared to geographic neighbors (published Korean genome SJK). We used the presence of shared haplotypes between the seven Kosraean individuals to estimate expected imputation accuracy of known and novel homozygous variants at 99.6% and 97.3%, respectively. This study presents whole-genome analysis of a homogenous isolate population with emphasis on optimal rare variant inference. PMID:22135348
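The idea of imputing from shared haplotypes can be illustrated with a toy calculation: a locus in an unsequenced person is treated as inferable if that person shares an identity-by-descent (IBD) segment covering the locus with someone who was sequenced. This is a simplified sketch and not the authors' method; the segment format, sample names, and function name are all assumptions made for illustration.

```python
def imputable_fraction(ibd_segments, sequenced, target, loci):
    """Fraction of loci in `target` covered by an IBD segment shared with a
    sequenced individual.

    ibd_segments: iterable of (individual_a, individual_b, start, end)
    sequenced:    set of individuals with whole-genome sequence
    target:       the unsequenced individual of interest
    loci:         positions to evaluate (same coordinate system as segments)
    """
    covered = 0
    for pos in loci:
        for a, b, start, end in ibd_segments:
            # Identify the other member of the pair, if the target is in it.
            partner = b if a == target else a if b == target else None
            if partner in sequenced and start <= pos <= end:
                covered += 1
                break  # one covering segment is enough for this locus
    return covered / len(loci)


# Hypothetical data: u1 and u2 are unsequenced, s1 and s2 are sequenced.
segments = [
    ("u1", "s1", 100, 200),
    ("u1", "u2", 150, 400),  # shared with an unsequenced person: no help
    ("s2", "u1", 300, 350),
]
fraction = imputable_fraction(segments, {"s1", "s2"}, "u1", [50, 150, 250, 320])
print(fraction)
```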
Genome-Wide Association Meta-Analysis Reveals Novel Juvenile Idiopathic Arthritis Susceptibility Loci.

PubMed

McIntosh, Laura A; Marion, Miranda C; Sudman, Marc; Comeau, Mary E; Becker, Mara L; Bohnsack, John F; Fingerlin, Tasha E; Griffin, Thomas A; Haas, J Peter; Lovell, Daniel J; Maier, Lisa A; Nigrovic, Peter A; Prahalad, Sampath; Punaro, Marilynn; Rosé, Carlos D; Wallace, Carol A; Wise, Carol A; Moncrieffe, Halima; Howard, Timothy D; Langefeld, Carl D; Thompson, Susan D

2017-11-01

Juvenile idiopathic arthritis (JIA) is the most common childhood rheumatic disease and has a strong genomic component. To date, JIA genetic association studies have had limited sample sizes, used heterogeneous patient populations, or included only candidate regions. The aim of this study was to identify new associations between JIA patients with oligoarticular disease and those with IgM rheumatoid factor (RF)-negative polyarticular disease, which are clinically similar and the most prevalent JIA disease subtypes. Three cohorts comprising 2,751 patients with oligoarticular or RF-negative polyarticular JIA were genotyped using the Affymetrix Genome-Wide SNP Array 6.0 or the Illumina HumanCoreExome-12+ Array. Overall, 15,886 local and out-of-study controls, typed on these platforms or the Illumina HumanOmni2.5, were used for association analyses. High-quality single-nucleotide polymorphisms (SNPs) were used for imputation to 1000 Genomes prior to SNP association analysis. Meta-analysis showed evidence of association (P < 1 × 10⁻⁶) at 9 regions: PRR9_LOR (P = 5.12 × 10⁻⁸), ILDR1_CD86 (P = 6.73 × 10⁻⁸), WDFY4 (P = 1.79 × 10⁻⁷), PTH1R (P = 1.87 × 10⁻⁷), RNF215 (P = 3.09 × 10⁻⁷), AHI1_LINC00271 (P = 3.48 × 10⁻⁷), JAK1 (P = 4.18 × 10⁻⁷), LINC00951 (P = 5.80 × 10⁻⁷), and HBP1 (P = 7.29 × 10⁻⁷). Of these, PRR9_LOR, ILDR1_CD86, RNF215, LINC00951, and HBP1 were shown, for the first time, to be autoimmune disease susceptibility loci. Furthermore, associated SNPs included cis expression quantitative trait loci for WDFY4, CCDC12, MTP18, SF3A1, AHI1, COG5, HBP1, and GPR22. This study provides evidence of both unique JIA risk loci and risk loci overlapping between JIA and other autoimmune diseases. These newly associated SNPs are shown to influence gene expression, and their bounding regions tie into molecular pathways of immunologic relevance. Thus, they likely represent regions that contribute to the pathology of oligoarticular JIA and RF-negative polyarticular JIA. © 2017, American College of Rheumatology.
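The abstract does not state which meta-analysis model was used to combine the three cohorts. As a point of reference only, the sketch below shows one widely used option, an inverse-variance fixed-effect meta-analysis of per-cohort effect estimates. The input values are invented, and the function name is hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    Returns the pooled effect, its standard error, and a two-sided P value
    from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Invented per-cohort log odds ratios and standard errors for a single SNP.
beta, se, p = fixed_effect_meta([0.2, 0.2, 0.2], [0.1, 0.1, 0.1])
print(f"beta={beta:.3f} se={se:.4f} P={p:.2e}")
```

Pooling three equally precise cohorts shrinks the standard error by a factor of √3, which is why a meta-analysis can reach P values that no single cohort attains.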
Nature Neuroscience Review

PubMed Central

Maze, Ian; Shen, Li; Zhang, Bin; Garcia, Benjamin A.; Shao, Ningyi; Mitchell, Amanda; Sun, HaoSheng; Akbarian, Schahram; Allis, C. David; Nestler, Eric J.

2014-01-01

Over the past decade, rapid advances in epigenomics research have extensively characterized critical roles for chromatin regulatory events during normal periods of eukaryotic cell development and plasticity, as well as part of aberrant processes implicated in human disease. Application of such approaches to studies of the central nervous system (CNS), however, is more recent. Here, we provide a comprehensive overview of currently available tools to analyze neuroepigenomics data, as well as a discussion of pending challenges specific to the field of neuroscience. Integration of numerous unbiased genome-wide and proteomic approaches will be necessary to fully understand the neuroepigenome and the extraordinarily complex nature of the human brain. This will be critical to the development of future diagnostic and therapeutic strategies aimed at alleviating the vast array of heterogeneous and genetically distinct disorders of the CNS. PMID:25349914
Application of Array Comparative Genomic Hybridization in Newborns with Multiple Congenital Anomalies.

PubMed

Szczałuba, Krzysztof; Nowakowska, Beata; Sobecka, Katarzyna; Smyk, Marta; Castaneda, Jennifer; Klapecki, Jakub; Kutkowska-Kaźmierczak, Anna; Śmigiel, Robert; Bocian, Ewa; Radkowski, Marek; Demkow, Urszula

2016-01-01

Major congenital anomalies are detectable in 2-3 % of the newborn population. Some of their genetic causes are attributable to copy number variations identified by array comparative genomic hybridization (aCGH). The value of aCGH screening as a first-tier test in children with multiple congenital anomalies has been studied and consensus adopted. However, array resolution has not been agreed upon, specifically in the newborn or infant population. Moreover, most array studies have been focused on mixed populations of intellectual disability/developmental delay with or without multiple congenital anomalies, making it difficult to assess the value of microarrays in newborns. The aim of the study was to determine the optimal quality and clinical sensitivity of high-resolution array comparative genomic hybridization in neonates with multiple congenital anomalies. We investigated a group of 54 newborns with multiple congenital anomalies defined as two or more birth defects from more than one organ system. Cytogenetic studies were performed using OGT CytoSure 8 × 60 K microarray. We found ten rearrangements in ten newborns. Of these, one recurrent syndromic microduplication was observed, whereas all other changes were unique. Six rearrangements were definitely pathogenic, including one submicroscopic and five that could be seen on routine karyotype analysis. Four other copy number variants were likely pathogenic. The candidate genes that may explain the phenotype were discussed. In conclusion, high-resolution array comparative hybridization can be applied successfully in newborns with multiple congenital anomalies as the method detects a significant number of pathogenic changes, resulting in early diagnoses. We hypothesize that small changes previously considered benign or even inherited rearrangements should be classified as potentially pathogenic at least until a subsequent clinical assessment would exclude a developmental delay or dysmorphism.
CRISPR Diversity and Microevolution in Clostridium difficile.

PubMed

Andersen, Joakim M; Shoup, Madelyn; Robinson, Cathy; Britton, Robert; Olsen, Katharina E P; Barrangou, Rodolphe

2016-09-19

Virulent strains of Clostridium difficile have become a global health problem associated with morbidity and mortality. Traditional typing methods do not provide ideal resolution to track outbreak strains, ascertain genetic diversity between isolates, or monitor the phylogeny of this species on a global basis. Here, we investigate the occurrence and diversity of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (cas) in C. difficile to assess the potential of CRISPR-based phylogeny and high-resolution genotyping. A single Type-IB CRISPR-Cas system was identified in 217 analyzed genomes with cas gene clusters present at conserved chromosomal locations, suggesting vertical evolution of the system, assessing a total of 1,865 CRISPR arrays. The CRISPR arrays, markedly enriched (8.5 arrays/genome) compared with other species, occur both at conserved and variable locations across strains, and thus provide a basis for typing based on locus occurrence and spacer polymorphism. Clustering of strains by array composition correlated with sequence type (ST) analysis. Spacer content and polymorphism within conserved CRISPR arrays revealed phylogenetic relationship across clades and within ST. Spacer polymorphisms of conserved arrays were instrumental for differentiating closely related strains, e.g., ST1/RT027/B1 strains and pathogenicity locus encoding ST3/RT001 strains. CRISPR spacers showed sequence similarity to phage sequences, which is consistent with the native role of CRISPR-Cas as adaptive immune systems in bacteria. Overall, CRISPR-Cas sequences constitute a valuable basis for genotyping of C. difficile isolates, provide insights into the micro-evolutionary events that occur between closely related strains, and reflect the evolutionary trajectory of these genomes. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
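Clustering strains by array composition needs some measure of how similar two strains' spacer contents are. The sketch below uses the Jaccard index on spacer sets as one simple choice. It is an assumption for illustration, not the measure reported by the authors, and the strain and spacer names are made up.

```python
def jaccard(spacers_a, spacers_b):
    """Jaccard similarity between two collections of spacer sequences/IDs."""
    a, b = set(spacers_a), set(spacers_b)
    if not a and not b:
        return 1.0  # two empty arrays are treated as identical
    return len(a & b) / len(a | b)


def pairwise_similarity(strain_spacers):
    """Return {(strain_x, strain_y): similarity} for every unordered pair."""
    names = sorted(strain_spacers)
    return {
        (x, y): jaccard(strain_spacers[x], strain_spacers[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Hypothetical spacer content for three strains.
strains = {
    "A": ["sp1", "sp2", "sp3"],
    "B": ["sp1", "sp2", "sp4"],
    "C": ["sp9"],
}
similarity = pairwise_similarity(strains)
print(similarity)
```

A matrix of such similarities could then be passed to any standard hierarchical clustering routine; set-based measures ignore spacer order, which a typing scheme might also want to use.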
Evaluation of the X-Linked High-Grade Myopia Locus (MYP1) with Cone Dysfunction and Color Vision Deficiencies

PubMed Central

Metlapally, Ravikanth; Michaelides, Michel; Bulusu, Anuradha; Li, Yi-Ju; Schwartz, Marianne; Rosenberg, Thomas; Hunt, David M.; Moore, Anthony T.; Züchner, Stephan; Rickman, Catherine Bowes; Young, Terri L.

2014-01-01

Purpose X-linked high myopia with mild cone dysfunction and color vision defects has been mapped to chromosome Xq28 (MYP1 locus). CXorf2/TEX28 is a nested, intercalated gene within the red-green opsin cone pigment gene tandem array on Xq28. The authors investigated whether TEX28 gene alterations were associated with the Xq28-linked myopia phenotype. Genomic DNA from five pedigrees (with high myopia and either protanopia or deuteranopia) that mapped to Xq28 were screened for TEX28 copy number variations (CNVs) and sequence variants. Methods To examine for CNVs, ultra-high resolution array-comparative genomic hybridization (array-CGH) assays were performed comparing the subject genomic DNA with control samples (two pairs from two pedigrees). Opsin or TEX28 gene-targeted quantitative real-time gene expression assays (comparative CT method) were performed to validate the array-CGH findings. All exons of TEX28, including intron/exon boundaries, were amplified and sequenced using standard techniques. Results Array-CGH findings revealed predicted duplications in affected patient samples. Although only three copies of TEX28 were previously reported within the opsin array, quantitative real-time analysis of the TEX28 targeted assay of affected male or carrier female individuals in these pedigrees revealed either fewer (one) or more (four or five) copies than did related and control unaffected individuals. Sequence analysis of TEX28 did not reveal any variants associated with the disease status. Conclusions CNVs have been proposed to play a role in disease inheritance and susceptibility as they affect gene dosage. TEX28 gene CNVs appear to be associated with the MYP1 X-linked myopia phenotypes. PMID:19098318
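The comparative CT method mentioned in the Methods estimates the relative quantity of a target as 2^(-ΔΔCT), where ΔCT is the target CT minus a reference-assay CT and ΔΔCT is the sample ΔCT minus the control ΔCT. The sketch below shows that calculation under the usual assumption of roughly 100% amplification efficiency. The CT values are invented, and the function name is hypothetical.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_control, ct_ref_control):
    """Relative quantity by the comparative CT (2^-ddCT) method.

    Assumes both assays amplify with ~100% efficiency, so each cycle of
    difference corresponds to a two-fold difference in template.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** -delta_delta_ct


# Invented values: the sample's target crosses threshold one cycle earlier
# than the control's, relative to the same reference assay.
fold = relative_quantity(24.0, 20.0, 25.0, 20.0)
print(fold)
```

To express the result as a copy number, the fold value would be multiplied by the copy number assumed for the control sample.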
Response of Human Skin to Aesthetic Scarification

PubMed Central

Gabriel, Vincent A.; McClellan, Elizabeth A.; Scheuermann, Richard H.

2014-01-01

This study was undertaken to investigate changes in RNA expression in previously healthy adult human skin following thermal injury induced by contact with hot metal that was undertaken as part of aesthetic scarification, a body modification practice. Subjects were recruited to have pre-injury skin and serial wound biopsies performed. 4 mm punch biopsies were taken prior to branding and 1 hour, 1 week, and 1, 2 and 3 months post injury. RNA was extracted and quality assured prior to the use of a whole-genome based bead array platform to describe expression changes in the samples using the pre-injury skin as a comparator. Analysis of the array data was performed using k-means clustering and a hypergeometric probability distribution without replacement, and corrections for multiple comparisons were applied. Confirmatory q-PCR was performed. Using a k of 10, several clusters of genes were shown to co-cluster together based on Gene Ontology classification with probabilities unlikely to occur by chance alone. Of particular interest were clusters relating to cell cycle, proteinaceous extracellular matrix and keratinization. Given the consistent expression changes at one week following injury in the cell cycle cluster, there is an opportunity to intervene early following burn injury to influence scar development. PMID:24582755
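The hypergeometric test used to judge whether a Gene Ontology term is over-represented in a cluster can be written in a few lines. The sketch below computes the upper-tail probability of seeing at least k annotated genes in a cluster drawn without replacement. The gene counts are toy values, the function name is hypothetical, and the multiple-comparison correction mentioned in the abstract is not shown.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, cluster_size):
    """P(X >= k) when `cluster_size` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the term of interest."""
    max_hits = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(k, max_hits + 1)
    )
    return favourable / total


# Toy example: 10 genes in total, 4 annotated, a cluster of 3 with 2 hits.
p_enrichment = hypergeom_upper_tail(2, 10, 4, 3)
print(round(p_enrichment, 4))
```

In a real analysis this P value would be computed for every term in every one of the k clusters, which is why correction for multiple comparisons is needed.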
Interpretation of clinical relevance of X-chromosome copy number variations identified in a large cohort of individuals with cognitive disorders and/or congenital anomalies.

PubMed

Willemsen, Marjolein H; de Leeuw, Nicole; de Brouwer, Arjan P M; Pfundt, Rolph; Hehir-Kwa, Jayne Y; Yntema, Helger G; Nillesen, Willy M; de Vries, Bert B A; van Bokhoven, Hans; Kleefstra, Tjitske

2012-11-01

Genome-wide array studies are now routinely being used in the evaluation of patients with cognitive disorders (CD) and/or congenital anomalies (CA). Therefore, inevitably each clinician is confronted with the challenging task of the interpretation of copy number variations detected by genome-wide array platforms in a diagnostic setting. Clinical interpretation of autosomal copy number variations is already challenging, but assessment of the clinical relevance of copy number variations of the X-chromosome is even more complex. This study provides an overview of the X-Chromosome copy number variations that we have identified by genome-wide array analysis in a large cohort of 4407 male and female patients. We have made an interpretation of the clinical relevance of each of these copy number variations based on well-defined criteria and previous reports in literature and databases. The prevalence of X-chromosome copy number variations in this cohort was 57/4407 (∼1.3%), of which 15 (0.3%) were interpreted as (likely) pathogenic. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
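The prevalence figures in this abstract follow directly from the reported counts. A short check, with a hypothetical helper name:

```python
def percent(count, total):
    """Proportion expressed as a percentage."""
    return 100.0 * count / total


# Counts from the abstract: 57 X-chromosome CNVs, 15 (likely) pathogenic,
# in a cohort of 4407 patients.
all_cnvs = percent(57, 4407)
pathogenic = percent(15, 4407)
print(f"{all_cnvs:.1f}% with an X-chromosome CNV, {pathogenic:.1f}% (likely) pathogenic")
```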
Functional Analysis With a Barcoder Yeast Gene Overexpression System

PubMed Central

Douglas, Alison C.; Smith, Andrew M.; Sharifpoor, Sara; Yan, Zhun; Durbic, Tanja; Heisler, Lawrence E.; Lee, Anna Y.; Ryan, Owen; Göttert, Hendrikje; Surendra, Anu; van Dyk, Dewald; Giaever, Guri; Boone, Charles; Nislow, Corey; Andrews, Brenda J.

2012-01-01

Systematic analysis of gene overexpression phenotypes provides an insight into gene function, enzyme targets, and biological pathways. Here, we describe a novel functional genomics platform that enables a highly parallel and systematic assessment of overexpression phenotypes in pooled cultures. First, we constructed a genome-level collection of ~5100 yeast barcoder strains, each of which carries a unique barcode, enabling pooled fitness assays with a barcode microarray or sequencing readout. Second, we constructed a yeast open reading frame (ORF) galactose-induced overexpression array by generating a genome-wide set of yeast transformants, each of which carries an individual plasmid-born and sequence-verified ORF derived from the Saccharomyces cerevisiae full-length EXpression-ready (FLEX) collection. We combined these collections genetically using synthetic genetic array methodology, generating ~5100 strains, each of which is barcoded and overexpresses a specific ORF, a set we termed “barFLEX.” Additional synthetic genetic array allows the barFLEX collection to be moved into different genetic backgrounds. As a proof-of-principle, we describe the properties of the barFLEX overexpression collection and its application in synthetic dosage lethality studies under different environmental conditions. PMID:23050238
Rapid, sensitive and label-free detection of Shiga-toxin producing Escherichia coli O157 using carbon nanotube biosensors.

PubMed

Subramanian, Sowmya; Aschenbach, Konrad H; Evangelista, Jennifer P; Najjar, Mohamed Badaoui; Song, Wenxia; Gomez, Romel D

2012-02-15

An electronic platform to detect very small amounts of genomic DNA from bacteria without the need for PCR amplification and molecular labeling is described. The system uses carbon nanotube field-effect transistor (FET) arrays whose electrical properties are affected by minute electrical charges localized on their active regions. Two pathogenic strains of E. coli are used to evaluate the detection properties of the transistor arrays. Described herein are the results for detection of synthetic oligomers, unpurified and highly purified genomic DNA at various concentrations and their comparison against non-specific binding. In particular, the capture of genomic DNA of E. coli O157:H7 by a specific oligonucleotide probe coated onto the transistor array results in a significant shift in the threshold (gate-source) voltage (V(th)). By contrast the signal under the same procedure using a different strain, E. coli O45 that is non-complementary to the probe remained nearly constant. This work highlights the detection sensitivity and efficacy of this biosensor without stringent requirement for DNA sample preparation. Copyright © 2011 Elsevier B.V. All rights reserved.
Differential expression of THOC1 and ALY mRNP biogenesis/export factors in human cancers

PubMed Central

2011-01-01

Background One key step in gene expression is the biogenesis of mRNA ribonucleoparticle complexes (mRNPs). Formation of the mRNP requires the participation of a number of conserved factors such as the THO complex. THO interacts physically and functionally with the Sub2/UAP56 RNA-dependent ATPase, and the Yra1/REF1/ALY RNA-binding protein linking transcription, mRNA export and genome integrity. Given the link between genome instability and cancer, we have performed a comparative analysis of the expression patterns of THOC1, a THO complex subunit, and ALY in tumor samples. Methods The mRNA levels were measured by quantitative real-time PCR and hybridization of a tumor tissue cDNA array; and the protein levels and distribution by immunostaining of a custom tissue array containing a set of paraffin-embedded samples of different tumor and normal tissues followed by statistical analysis. Results We show that the expression of two mRNP factors, THOC1 and ALY, is altered in several tumor tissues. THOC1 mRNA and protein levels are up-regulated in ovarian and lung tumors and down-regulated in those of testis and skin, whereas ALY is altered in a wide variety of tumors. In contrast to THOC1, ALY protein is highly detected in normal proliferative cells, but poorly in high-grade cancers. Conclusions These results suggest a differential connection between tumorigenesis and the expression levels of human THO and ALY. This study opens the possibility of defining mRNP biogenesis factors as putative players in cell proliferation that could contribute to tumor development. PMID:21329510
Genomic expression patterns in medication overuse headaches

PubMed Central

Hershey, Andrew D; Burdine, Danny; Kabbouche, Marielle A; Powers, Scott W

2016-01-01

Background Chronic daily headache (CDH) and chronic migraine (CM) are among the most frequent problems encountered in neurology, are often difficult to treat, and are frequently complicated by medication-overuse headache (MOH). Proper recognition of MOH may alter treatment outcome and prevent long term disability. Objective This study identifies the unique genomic expression pattern of MOH that responds to cessation of the overused medication. Methods Baseline occurrence of MOH and typical pattern of response to medication cessation were measured from a large database. Whole blood samples from patients with CM with or without MOH were obtained and their genomic profile was assessed. Affymetrix human U133 plus2 arrays were used to examine the genomic expression patterns prior to treatment and 6–12 weeks later. Headache characterisation and response to treatment based on headache frequency and disability were compared. Results Of 1311 patients reporting daily or continuous headaches, 513 (39.1%) reported overusing analgesic medication. At follow-up, 44.5% had a 50% or greater reduction in headache frequency, while 41.6% had no change. Blood genomic expression patterns were obtained on 33 patients, 19 of whom (57.6%) were overusing analgesic medication, with a unique genomic expression pattern in MOH that responded to cessation of analgesics. Gene ontology of these samples indicated a significant number were involved with brain and immunological tissues, including multiple signalling pathways and apoptosis. Conclusions Blood genomic patterns can accurately identify MOH patients that respond to medication cessation. These results suggest that MOH involves a unique molecular biology pathway that can be identified with a specific biomarker. PMID:20974594
Enhanced Methods for Local Ancestry Assignment in Sequenced Admixed Individuals

PubMed Central

Brown, Robert; Pasaniuc, Bogdan

2014-01-01

Inferring the ancestry at each locus in the genome of recently admixed individuals (e.g., Latino Americans) plays a major role in medical and population genetic inferences, ranging from finding disease-risk loci, to inferring recombination rates, to mapping missing contigs in the human genome. Although many methods for local ancestry inference have been proposed, most are designed for use with genotyping arrays and fail to make use of the full spectrum of data available from sequencing. In addition, current haplotype-based approaches are very computationally demanding, requiring large computational time for moderately large sample sizes. Here we present new methods for local ancestry inference that leverage continent-specific variants (CSVs) to attain increased performance over existing approaches in sequenced admixed genomes. A key feature of our approach is that it incorporates the admixed genomes themselves jointly with public datasets, such as 1000 Genomes, to improve the accuracy of CSV calling. We use simulations to show that our approach attains accuracy similar to widely used computationally intensive haplotype-based approaches with large decreases in runtime. Most importantly, we show that our method recovers comparable local ancestries, as the 1000 Genomes consensus local ancestry calls in the real admixed individuals from the 1000 Genomes Project. We extend our approach to account for low-coverage sequencing and show that accurate local ancestry inference can be attained at low sequencing coverage. Finally, we generalize CSVs to sub-continental population-specific variants (sCSVs) and show that in some cases it is possible to determine the sub-continental ancestry for short chromosomal segments on the basis of sCSVs. PMID:24743331
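The intuition behind continent-specific variants (CSVs) can be shown with a deliberately crude sketch: within each genomic window, count the CSVs a haplotype carries and assign the window to the continent with the most hits. This is far simpler than the method described in the abstract, which also handles CSV calling, low coverage, and sub-continental variants. The positions, labels, and function name below are hypothetical.

```python
from collections import Counter


def window_ancestry(carried_positions, csv_labels, window_size):
    """Assign an ancestry label to each window by majority vote over CSVs.

    carried_positions: positions where the haplotype carries the variant allele
    csv_labels:        {position: continent label} for known CSVs
    window_size:       window length in the same units as the positions
    """
    counts = {}
    for pos in carried_positions:
        label = csv_labels.get(pos)
        if label is None:
            continue  # not a continent-specific variant, so uninformative
        counts.setdefault(pos // window_size, Counter())[label] += 1
    return {
        window: tally.most_common(1)[0][0]
        for window, tally in sorted(counts.items())
    }


# Hypothetical CSV catalogue and one haplotype's variant positions.
csvs = {10: "AFR", 20: "AFR", 30: "EUR", 110: "EUR", 120: "EUR"}
calls = window_ancestry([10, 20, 30, 55, 110, 120], csvs, window_size=100)
print(calls)
```

Windows with no CSVs get no call in this sketch; a practical method would need to smooth across neighbouring windows and account for genotyping error.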

CGI: Java Software for Mapping and Visualizing Data from Array-based Comparative Genomic Hybridization and Expression Profiling

PubMed Central

Gu, Joyce Xiuweu-Xu; Wei, Michael Yang; Rao, Pulivarthi H.; Lau, Ching C.; Behl, Sanjiv; Man, Tsz-Kwong

2007-01-01

With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License. PMID:19936083
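The core matching step, finding the genes that fall within a BAC clone's genomic span, is an interval-overlap query. CGI itself is described as a Java program backed by a MySQL database of UCSC Genome Browser annotations. The sketch below only illustrates the overlap logic in memory, in Python, with made-up coordinates and names.

```python
def genes_overlapping(bac, genes):
    """Return the names of genes whose span overlaps the BAC clone.

    bac:   (chromosome, start, end)
    genes: {gene_name: (chromosome, start, end)}
    Intervals are treated as closed: sharing a single base counts as overlap.
    """
    chrom, start, end = bac
    return sorted(
        name
        for name, (g_chrom, g_start, g_end) in genes.items()
        if g_chrom == chrom and g_start <= end and g_end >= start
    )


# Hypothetical annotation table and BAC clone coordinates.
annotation = {
    "GENE_A": ("chr1", 1_000, 5_000),
    "GENE_B": ("chr1", 90_000, 120_000),
    "GENE_C": ("chr2", 1_000, 5_000),
}
hits = genes_overlapping(("chr1", 4_000, 100_000), annotation)
print(hits)
```

With the matches in hand, the copy number value of the clone can be placed alongside the expression levels of the matched genes, which is the side-by-side view the abstract describes.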
Semi-supervised prediction of SH2-peptide interactions from imbalanced high-throughput data.

PubMed

Kundu, Kousik; Costa, Fabrizio; Huber, Michael; Reth, Michael; Backofen, Rolf

2013-01-01

Src homology 2 (SH2) domains are the largest family of the peptide-recognition modules (PRMs) that bind to phosphotyrosine containing peptides. Knowledge about binding partners of SH2-domains is key for a deeper understanding of different cellular processes. Given the high binding specificity of SH2, in-silico ligand peptide prediction is of great interest. Currently however, only a few approaches have been published for the prediction of SH2-peptide interactions. Their main shortcomings range from limited coverage, to restrictive modeling assumptions (they are mainly based on position specific scoring matrices and do not take into consideration complex amino acid inter-dependencies) and high computational complexity. We propose a simple yet effective machine learning approach for a large set of known human SH2 domains. We used comprehensive data from micro-array and peptide-array experiments on 51 human SH2 domains. In order to deal with the high data imbalance problem and the high signal-to-noise ratio, we cast the problem in a semi-supervised setting. We report competitive predictive performance w.r.t. state-of-the-art. Specifically we obtain 0.83 AUC ROC and 0.93 AUC PR in comparison to 0.71 AUC ROC and 0.87 AUC PR previously achieved by the position specific scoring matrices (PSSMs) based SMALI approach. Our work provides three main contributions. First, we showed that better models can be obtained when the information on the non-interacting peptides (negative examples) is also used. Second, we improve performance when considering high order correlations between the ligand positions employing regularization techniques to effectively avoid overfitting issues. Third, we developed an approach to tackle the data imbalance problem using a semi-supervised strategy. Finally, we performed a genome-wide prediction of human SH2-peptide binding, uncovering several findings of biological relevance. We make our models and genome-wide predictions, for all the 51 SH2-domains, freely available to the scientific community under the following URLs: http://www.bioinf.uni-freiburg.de/Software/SH2PepInt/SH2PepInt.tar.gz and http://www.bioinf.uni-freiburg.de/Software/SH2PepInt/Genome-wide-predictions.tar.gz, respectively.
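The AUC ROC values quoted above have a simple interpretation: the probability that a randomly chosen interacting peptide is scored higher than a randomly chosen non-interacting one. The sketch below computes it directly from that definition. The scores and labels are invented, the function name is hypothetical, and the quadratic loop is meant for clarity rather than for large data sets.

```python
def auc_roc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    labels use 1 for positives (binders) and 0 for negatives; ties between a
    positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented predictions for four peptides.
auc = auc_roc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(auc)
```

On heavily imbalanced data such as this, the area under the precision-recall curve is often reported alongside AUC ROC, as the abstract does.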
Automated typing of red blood cell and platelet antigens: a whole-genome sequencing study.

PubMed

Lane, William J; Westhoff, Connie M; Gleadall, Nicholas S; Aguad, Maria; Smeland-Wagman, Robin; Vege, Sunitha; Simmons, Daimon P; Mah, Helen H; Lebo, Matthew S; Walter, Klaudia; Soranzo, Nicole; Di Angelantonio, Emanuele; Danesh, John; Roberts, David J; Watkins, Nick A; Ouwehand, Willem H; Butterworth, Adam S; Kaufman, Richard M; Rehm, Heidi L; Silberstein, Leslie E; Green, Robert C

2018-06-01

There are more than 300 known red blood cell (RBC) antigens and 33 platelet antigens that differ between individuals. Sensitisation to antigens is a serious complication that can occur in prenatal medicine and after blood transfusion, particularly for patients who require multiple transfusions. Although pre-transfusion compatibility testing largely relies on serological methods, reagents are not available for many antigens. Methods based on single-nucleotide polymorphism (SNP) arrays have been used, but typing for ABO and Rh, the most important blood groups, cannot be done with SNP typing alone. We aimed to develop a novel method based on whole-genome sequencing to identify RBC and platelet antigens. This whole-genome sequencing study is a subanalysis of data from patients in the whole-genome sequencing arm of the MedSeq Project randomised controlled trial (NCT01736566) with no measured patient outcomes. We created a database of molecular changes in RBC and platelet antigens and developed an automated antigen-typing algorithm based on whole-genome sequencing (bloodTyper). This algorithm was iteratively improved to address cis-trans haplotype ambiguities and homologous gene alignments. Whole-genome sequencing data from 110 MedSeq participants (30 × depth) were used to initially validate bloodTyper through comparison with conventional serology and SNP methods for typing of 38 RBC antigens in 12 blood-group systems and 22 human platelet antigens. bloodTyper was further validated with whole-genome sequencing data from 200 INTERVAL trial participants (15 × depth) with serological comparisons. We iteratively improved bloodTyper by comparing its typing results with conventional serological and SNP typing in three rounds of testing. The initial whole-genome sequencing typing algorithm was 99·5% concordant across the first 20 MedSeq genomes. Addressing discordances led to development of an improved algorithm that was 99·8% concordant for the remaining 90 MedSeq genomes.
Additional modifications led to the final algorithm, which was 99·2% concordant across 200 INTERVAL genomes (or 99·9% after adjustment for the lower depth of coverage). By enabling more precise antigen-matching of patients with blood donors, antigen typing based on whole-genome sequencing provides a novel approach to improve transfusion outcomes with the potential to transform the practice of transfusion medicine. Funding: National Human Genome Research Institute, Doris Duke Charitable Foundation, National Health Service Blood and Transplant, National Institute for Health Research, and Wellcome Trust. Copyright © 2018 Elsevier Ltd. All rights reserved.
CRF: detection of CRISPR arrays using random forest.

PubMed

Wang, Kai; Liang, Chun

2017-01-01

CRISPRs (clustered regularly interspaced short palindromic repeats) are particular repeat sequences found in a wide range of bacterial and archaeal genomes. Several tools are available for detecting CRISPR arrays in the genomes of both domains. Here we developed a new web-based CRISPR detection tool named CRF (CRISPR Finder by Random Forest). Unlike other CRISPR detection tools, CRF uses a random forest classifier to filter out invalid CRISPR arrays from all putative candidates, thereby enhancing detection accuracy. In particular, triplet elements that combine both sequence content and structure information were extracted from CRISPR repeats for classifier training. The classifier achieved high accuracy and sensitivity. Moreover, CRF offers a highly interactive web interface for robust data visualization that is not available among other CRISPR detection tools. After detection, the query sequence, CRISPR array architecture, and the sequences and secondary structures of CRISPR repeats and spacers can be visualized for examination and validation. CRF is freely available at http://bioinfolab.miamioh.edu/crf/home.php.
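To make the repeat-spacer architecture described above concrete, here is a naive sketch of how putative CRISPR-like candidates could be collected before any filtering. It is not CRF's algorithm: the abstract does not describe how CRF generates candidates, and the random forest filtering step is not reproduced. The function name, the exact-match requirement and the toy lengths are all assumptions; real repeats and spacers are typically tens of base pairs long.

```python
def find_repeat_arrays(seq, repeat_len=8, min_spacer=4, max_spacer=12,
                       min_repeats=3):
    """Return (repeat, start_positions) for exact repeats that recur
    at least `min_repeats` times, separated by spacers whose length
    lies in [min_spacer, max_spacer]. Hypothetical helper."""
    # Index every k-mer by the positions where it starts.
    positions = {}
    for i in range(len(seq) - repeat_len + 1):
        positions.setdefault(seq[i:i + repeat_len], []).append(i)

    arrays = []
    for kmer, starts in positions.items():
        chain = [starts[0]]
        for start in starts[1:]:
            gap = start - (chain[-1] + repeat_len)  # spacer length
            if min_spacer <= gap <= max_spacer:
                chain.append(start)
            elif gap > max_spacer:
                # Spacing broken: keep the chain if long enough, restart.
                if len(chain) >= min_repeats:
                    arrays.append((kmer, chain))
                chain = [start]
            # gap < min_spacer: occurrence too close, ignore it
        if len(chain) >= min_repeats:
            arrays.append((kmer, chain))
    return arrays
```

A candidate list produced this way would contain many false positives in a real genome (any tandem repeat qualifies), which is the kind of problem a trained classifier such as the one described in the abstract is meant to address.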
Entamoeba histolytica: construction and applications of subgenomic databases.

PubMed

Hofer, Margit; Duchêne, Michael

2005-07-01

Knowledge about the influence of environmental stress such as the action of chemotherapeutic agents on gene expression in Entamoeba histolytica is limited. We plan to use oligonucleotide microarray hybridization to approach these questions. As the basis for our array, sequence data from the genome project carried out by the Institute for Genomic Research (TIGR) and the Sanger Institute were used to annotate parts of the parasite genome. Three subgenomic databases containing enzymes, cytoskeleton genes, and stress genes were compiled with the help of the ExPASy proteomics website and the BLAST servers at the two genome project sites. The known sequences from reference species, mostly human and Escherichia coli, were searched against TIGR and Sanger E. histolytica sequence contigs and the homologs were copied into a Microsoft Access database. In a similar way, two additional databases of cytoskeletal genes and stress genes were generated. Metabolic pathways could be assembled from our enzyme database, but sometimes they were incomplete as is the case for the sterol biosynthesis pathway. The raw databases contained a significant number of duplicate entries which were merged to obtain curated non-redundant databases. This procedure revealed that some E. histolytica genes may have several putative functions. Representative examples such as the case of the delta-aminolevulinate synthase/serine palmitoyltransferase are discussed.
Genome-wide association study of response to cognitive-behavioural therapy in children with anxiety disorders.

PubMed

Coleman, Jonathan R I; Lester, Kathryn J; Keers, Robert; Roberts, Susanna; Curtis, Charles; Arendt, Kristian; Bögels, Susan; Cooper, Peter; Creswell, Cathy; Dalgleish, Tim; Hartman, Catharina A; Heiervang, Einar R; Hötzel, Katrin; Hudson, Jennifer L; In-Albon, Tina; Lavallee, Kristen; Lyneham, Heidi J; Marin, Carla E; Meiser-Stedman, Richard; Morris, Talia; Nauta, Maaike H; Rapee, Ronald M; Schneider, Silvia; Schneider, Sophie C; Silverman, Wendy K; Thastum, Mikael; Thirlwall, Kerstin; Waite, Polly; Wergeland, Gro Janne; Breen, Gerome; Eley, Thalia C

2016-09-01

Anxiety disorders are common, and cognitive-behavioural therapy (CBT) is a first-line treatment. Candidate gene studies have suggested a genetic basis to treatment response, but findings have been inconsistent. To perform the first genome-wide association study (GWAS) of psychological treatment response in children with anxiety disorders (n = 980). Presence and severity of anxiety were assessed using semi-structured interview at baseline, on completion of treatment (post-treatment), and 3 to 12 months after treatment completion (follow-up). DNA was genotyped using the Illumina Human Core Exome-12v1.0 array. Linear mixed models were used to test associations between genetic variants and response (change in symptom severity) immediately post-treatment and at 6-month follow-up. No variants passed a genome-wide significance threshold (P = 5 × 10^-8) in either analysis. Four variants met criteria for suggestive significance (P < 5 × 10^-6) in association with response post-treatment, and three variants in the 6-month follow-up analysis. This is the first genome-wide therapygenetic study. It suggests no common variants of very high effect underlie response to CBT. Future investigations should maximise power to detect single-variant and polygenic effects by using larger, more homogeneous cohorts. © The Royal College of Psychiatrists 2016.
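The two thresholds quoted in the abstract above can be applied as in the short sketch below. The threshold values are taken from the abstract; the function name and the example P values are hypothetical. The genome-wide value of 5 × 10^-8 is conventionally motivated as a Bonferroni correction of 0.05 for roughly one million independent common-variant tests, which the sketch checks numerically.

```python
import math

GENOME_WIDE = 5e-8   # genome-wide significance threshold from the abstract
SUGGESTIVE = 5e-6    # suggestive significance threshold from the abstract

def classify_p_value(p):
    """Label an association P value against the two thresholds."""
    if p < GENOME_WIDE:
        return "genome-wide significant"
    if p < SUGGESTIVE:
        return "suggestive"
    return "not significant"

# Conventional Bonferroni rationale: 0.05 spread over ~1e6 tests.
assert math.isclose(0.05 / 1_000_000, GENOME_WIDE)

# Hypothetical P values, for illustration only.
labels = [classify_p_value(p) for p in (3e-9, 2e-6, 0.01)]
```

Under this scheme the study's result corresponds to no variant receiving the first label and a handful receiving the second.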
Detection of clinically relevant copy number alterations in oral cancer progression using multiplexed droplet digital PCR.

PubMed

Hughesman, Curtis B; Lu, X J David; Liu, Kelly Y P; Zhu, Yuqi; Towle, Rebecca M; Haynes, Charles; Poh, Catherine F

2017-09-19

Copy number alterations (CNAs), a common genomic event during carcinogenesis, are known to affect a large fraction of the genome. Common recurrent gains or losses of specific chromosomal regions occur at such frequencies that they may be considered distinctive features of tumoral cells. Here we introduce a novel multiplexed droplet digital PCR (ddPCR) assay capable of detecting recurrent CNAs that drive tumorigenesis of oral squamous cell carcinoma. When the assay was applied to DNA extracted from oral cell lines and clinical samples of various disease stages, we found good agreement between CNAs detected by our ddPCR assay and those previously reported using comparative genomic hybridization or single nucleotide polymorphism arrays. Furthermore, we demonstrate that the ability to target specific locations of the genome permits detection of clinically relevant oncogenic events such as small, submicroscopic homozygous deletions. Additional capabilities of the multiplexed ddPCR assay include the ability to infer ploidy level, quantify the change in copy number of target loci with high-level gains, and simultaneously assess the status and viral load for high-risk human papillomavirus types 16 and 18. This novel multiplexed ddPCR assay therefore may have clinical value in differentiating benign oral lesions from those that are at risk of progressing to oral cancer.
Brief Overview of a Decade of Genome-Wide Association Studies on Primary Hypertension.

PubMed

Azam, Afifah Binti; Azizan, Elena Aisha Binti

2018-01-01

Primary hypertension is widely believed to be a complex polygenic disorder whose manifestation is influenced by the interactions of genomic and environmental factors, making identification of susceptibility genes a major challenge. With major advances in high-throughput genotyping technology, the genome-wide association study (GWAS) has become a powerful tool for researchers studying genetically complex diseases. GWASs work by revealing links between DNA sequence variation and a disease or trait of biomedical importance. The human genome is a very long DNA sequence which consists of billions of nucleotides arranged in a unique way. A single base-pair change in the DNA sequence is known as a single nucleotide polymorphism (SNP). With the help of modern genotyping techniques such as chip-based genotyping arrays, thousands of SNPs can be genotyped easily. Large-scale GWASs, in which more than half a million common SNPs are genotyped and analyzed for disease association in hundreds of thousands of cases and controls, have been broadly successful in identifying SNPs associated with heart diseases, diabetes, autoimmune diseases, and psychiatric disorders. It is, however, still debatable whether GWAS is the best approach for hypertension. The following is a brief overview of the outcomes of a decade of GWASs on primary hypertension.
The genomic ancestry, landscape genetics and invasion history of introduced mice in New Zealand

PubMed Central

Russell, James C.; King, Carolyn M.

2018-01-01

The house mouse (Mus musculus) provides a fascinating system for studying both the genomic basis of reproductive isolation, and the patterns of human-mediated dispersal. New Zealand has a complex history of mouse invasions, and the living descendants of these invaders have genetic ancestry from all three subspecies, although most are primarily descended from M. m. domesticus. We used the GigaMUGA genotyping array (approximately 135 000 loci) to describe the genomic ancestry of 161 mice, sampled from 34 locations from across New Zealand (and one Australian city—Sydney). Of these, two populations, one in the south of the South Island, and one on Chatham Island, showed complete mitochondrial lineage capture, featuring two different lineages of M. m. castaneus mitochondrial DNA but with only M. m. domesticus nuclear ancestry detectable. Mice in the northern and southern parts of the North Island had small traces (approx. 2–3%) of M. m. castaneus nuclear ancestry, and mice in the upper South Island had approximately 7–8% M. m. musculus nuclear ancestry including some Y-chromosomal ancestry—though no detectable M. m. musculus mitochondrial ancestry. This is the most thorough genomic study of introduced populations of house mice yet conducted, and will have relevance to studies of the isolation mechanisms separating subspecies of mice. PMID:29410804
Genome-wide association study of response to cognitive–behavioural therapy in children with anxiety disorders

PubMed Central

Coleman, Jonathan R. I.; Lester, Kathryn J.; Keers, Robert; Roberts, Susanna; Curtis, Charles; Arendt, Kristian; Bögels, Susan; Cooper, Peter; Creswell, Cathy; Dalgleish, Tim; Hartman, Catharina A.; Heiervang, Einar R.; Hötzel, Katrin; Hudson, Jennifer L.; In-Albon, Tina; Lavallee, Kristen; Lyneham, Heidi J.; Marin, Carla E.; Meiser-Stedman, Richard; Morris, Talia; Nauta, Maaike H.; Rapee, Ronald M.; Schneider, Silvia; Schneider, Sophie C.; Silverman, Wendy K.; Thastum, Mikael; Thirlwall, Kerstin; Waite, Polly; Wergeland, Gro Janne; Breen, Gerome; Eley, Thalia C.

2016-01-01

Background Anxiety disorders are common, and cognitive–behavioural therapy (CBT) is a first-line treatment. Candidate gene studies have suggested a genetic basis to treatment response, but findings have been inconsistent. Aims To perform the first genome-wide association study (GWAS) of psychological treatment response in children with anxiety disorders (n = 980). Method Presence and severity of anxiety were assessed using semi-structured interview at baseline, on completion of treatment (post-treatment), and 3 to 12 months after treatment completion (follow-up). DNA was genotyped using the Illumina Human Core Exome-12v1.0 array. Linear mixed models were used to test associations between genetic variants and response (change in symptom severity) immediately post-treatment and at 6-month follow-up. Results No variants passed a genome-wide significance threshold (P = 5 × 10^-8) in either analysis. Four variants met criteria for suggestive significance (P < 5 × 10^-6) in association with response post-treatment, and three variants in the 6-month follow-up analysis. Conclusions This is the first genome-wide therapygenetic study. It suggests no common variants of very high effect underlie response to CBT. Future investigations should maximise power to detect single-variant and polygenic effects by using larger, more homogeneous cohorts. PMID:26989097
Chromosome Transfer Induced Aneuploidy Results in Complex Dysregulation of the Cellular Transcriptome in Immortalized and Cancer Cells

PubMed Central

Upender, Madhvi B.; Habermann, Jens K.; McShane, Lisa M.; Korn, Edward L.; Barrett, J. Carl; Difilippantonio, Michael J.; Ried, Thomas

2016-01-01

Chromosomal aneuploidies are observed in essentially all sporadic carcinomas. These aneuploidies result in tumor-specific patterns of genomic imbalances that are acquired early during tumorigenesis, continuously selected for and faithfully maintained in cancer cells. Although the paradigm of translocation induced oncogene activation in hematologic malignancies is firmly established, it is not known how genomic imbalances affect chromosome-specific gene expression patterns in particular and how chromosomal aneuploidy dysregulates the genetic equilibrium of cells in general. To model specific chromosomal aneuploidies in cancer cells and dissect the immediate consequences of genomic imbalances on the transcriptome, we generated artificial trisomies in a karyotypically stable diploid yet mismatch repair-deficient, colorectal cancer cell line and in telomerase immortalized, cytogenetically normal human breast epithelial cells using microcell-mediated chromosome transfer. The global consequences on gene expression levels were analyzed using cDNA arrays. Our results show that regardless of chromosome or cell type, chromosomal trisomies result in a significant increase in the average transcriptional activity of the trisomic chromosome. This increase affects the expression of numerous genes on other chromosomes as well. We therefore postulate that the genomic imbalances observed in cancer cells exert their effect through a complex pattern of transcriptional dysregulation. PMID:15466185
A Large Maize (Zea mays L.) SNP Genotyping Array: Development and Germplasm Genotyping, and Genetic Mapping to Compare with the B73 Reference Genome

PubMed Central

Ganal, Martin W.; Durstewitz, Gregor; Polley, Andreas; Bérard, Aurélie; Buckler, Edward S.; Charcosset, Alain; Clarke, Joseph D.; Graner, Eva-Maria; Hansen, Mark; Joets, Johann; Le Paslier, Marie-Christine; McMullen, Michael D.; Montalent, Pierre; Rose, Mark; Schön, Chris-Carolin; Sun, Qi; Walter, Hildrun; Martin, Olivier C.; Falque, Matthieu

2011-01-01

SNP genotyping arrays have been useful for many applications that require a large number of molecular markers such as high-density genetic mapping, genome-wide association studies (GWAS), and genomic selection. We report the establishment of a large maize SNP array and its use for diversity analysis and high density linkage mapping. The markers, taken from more than 800,000 SNPs, were selected to be preferentially located in genes and evenly distributed across the genome. The array was tested with a set of maize germplasm including North American and European inbred lines, parent/F1 combinations, and distantly related teosinte material. A total of 49,585 markers, including 33,417 within 17,520 different genes and 16,168 outside genes, were of good quality for genotyping, with an average failure rate of 4% and rates up to 8% in specific germplasm. To demonstrate this array's use in genetic mapping and for the independent validation of the B73 sequence assembly, two intermated maize recombinant inbred line populations – IBM (B73×Mo17) and LHRF (F2×F252) – were genotyped to establish two high density linkage maps with 20,913 and 14,524 markers respectively. 172 mapped markers were absent in the current B73 assembly and their placement can be used for future improvements of the B73 reference sequence. Colinearity of the genetic and physical maps was mostly conserved with some exceptions that suggest errors in the B73 assembly. Five major regions containing non-colinearities were identified on chromosomes 2, 3, 6, 7 and 9, and are supported by both independent genetic maps. Four additional non-colinear regions were found on the LHRF map only; they may be due to a lower density of IBM markers in those regions or to true structural rearrangements between lines. Given the array's high quality, it will be a valuable resource for maize genetics and many aspects of maize breeding. PMID:22174790
Exome sequencing and arrayCGH detection of gene sequence and copy number variation between ILS and ISS mouse strains.

PubMed

Dumas, Laura; Dickens, C Michael; Anderson, Nathan; Davis, Jonathan; Bennett, Beth; Radcliffe, Richard A; Sikela, James M

2014-06-01

It has been well documented that genetic factors can influence predisposition to develop alcoholism. While the underlying genomic changes may be of several types, two of the most common and disease associated are copy number variations (CNVs) and sequence alterations of protein coding regions. The goal of this study was to identify CNVs and single-nucleotide polymorphisms that occur in gene coding regions that may play a role in influencing the risk of an individual developing alcoholism. Toward this end, two mouse strains were used that have been selectively bred based on their differential sensitivity to alcohol: the Inbred long sleep (ILS) and Inbred short sleep (ISS) mouse strains. Differences in initial response to alcohol have been linked to risk for alcoholism, and the ILS/ISS strains are used to investigate the genetics of initial sensitivity to alcohol. Array comparative genomic hybridization (arrayCGH) and exome sequencing were conducted to identify CNVs and gene coding sequence differences, respectively, between ILS and ISS mice. Mouse arrayCGH was performed using catalog Agilent 1 × 244 k mouse arrays. Subsequently, exome sequencing was carried out using an Illumina HiSeq 2000 instrument. ArrayCGH detected 74 CNVs that were strain-specific (38 ILS/36 ISS), including several ISS-specific deletions that contained genes implicated in brain function and neurotransmitter release. Among several interesting coding variations detected by exome sequencing was the gain of a premature stop codon in the alpha-amylase 2B (AMY2B) gene specifically in the ILS strain. In total, exome sequencing detected 2,597 and 1,768 strain-specific exonic gene variants in the ILS and ISS mice, respectively. This study represents the most comprehensive and detailed genomic comparison of ILS and ISS mouse strains to date. 
The two complementary genome-wide approaches identified strain-specific CNVs and gene coding sequence variations that should provide strong candidates to contribute to the alcohol-related phenotypic differences associated with these strains.
University of Texas Southwestern Medical Center: High-Throughput siRNA Screening of a Non-Small Cell Lung Cancer (NSCLC) Cell Line Panel | Office of Cancer Genomics

Cancer.gov

The goal of this project is to use siRNA screens to identify NSCLC-selective siRNAs from two genome-wide libraries that will allow us to functionally define genetic dependencies of subtypes of NSCLC. Using bioinformatics tools, the CTD2 center at the University of Texas Southwestern Medical Center is discovering associations between these functional data (siRNAs) and NSCLC mutational status, methylation arrays, gene expression arrays, and copy number variation data that will help us identify new targets and enrollment biomarkers.
University of Texas Southwestern Medical Center (UTSW): High-Throughput siRNA Screening of a Non-Small Cell Lung Cancer (NSCLC) Cell Line Panel | Office of Cancer Genomics

Cancer.gov

The goal of this project is to use siRNA screens to identify NSCLC-selective siRNAs from two genome-wide libraries that will allow us to functionally define genetic dependencies of subtypes of NSCLC. Using bioinformatics tools, the CTD2 center at the University of Texas Southwestern Medical Center is discovering associations between these functional data (siRNAs) and NSCLC mutational status, methylation arrays, gene expression arrays, and copy number variation data that will help us identify new targets and enrollment biomarkers.
BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes.

PubMed

Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

2016-07-01

The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
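The N50 statistic quoted in the abstract above (371 contigs with an N50 of 1.3 Mb) follows the standard definition: the contig length at which contigs of that length or longer together cover at least half of the total assembly length. The sketch below computes it; the function name and the toy lengths are illustrative, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated;
    N50 is the length of the contig at which the running total
    first reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one contig length")

# Toy example: total 32, half 16, reached after the two 8s.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])  # → 8
```

A higher N50 for a similar total length indicates that the map or assembly is made of fewer, longer pieces, which is why it is commonly reported alongside the contig count.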
Whole-genome single-nucleotide polymorphism (SNP) marker discovery and association analysis with the eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content in Larimichthys crocea

PubMed Central

Xiao, Shijun; Wang, Panpan; Dong, Linsong; Zhang, Yaguang; Han, Zhaofang; Wang, Qiurong

2016-01-01

Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-By-Sequencing (GBS) provides an efficient reduced-representation method that lowers the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. A total of 69,845 high-quality SNP markers, evenly distributed along the genome, were detected in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. Association studies with muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, contributing up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in teleost genomes. The developed SNP markers and the EPA- and DHA-associated SNP loci provide invaluable resources for studies of population structure, conservation genetics and genomic selection in large yellow croaker and other fish. PMID:28028455
Probing Genomic Aspects of the Multi-Host Pathogen Clostridium perfringens Reveals Significant Pangenome Diversity, and a Diverse Array of Virulence Factors

PubMed Central

Kiu, Raymond; Caim, Shabhonam; Alexander, Sarah; Pachori, Purnima; Hall, Lindsay J.

2017-01-01

Clostridium perfringens is an important cause of animal and human infections, however information about the genetic makeup of this pathogenic bacterium is currently limited. In this study, we sought to understand and characterise the genomic variation, pangenomic diversity, and key virulence traits of 56 C. perfringens strains which included 51 public, and 5 newly sequenced and annotated genomes using Whole Genome Sequencing. Our investigation revealed that C. perfringens has an “open” pangenome comprising 11667 genes and 12.6% of core genes, identified as the most divergent single-species Gram-positive bacterial pangenome currently reported. Our computational analyses also defined C. perfringens phylogeny (16S rRNA gene) in relation to some 25 Clostridium species, with C. baratii and C. sardiniense determined to be the closest relatives. Profiling virulence-associated factors confirmed presence of well-characterised C. perfringens-associated exotoxins genes including α-toxin (plc), enterotoxin (cpe), and Perfringolysin O (pfo or pfoA), although interestingly there did not appear to be a close correlation with encoded toxin type and disease phenotype. Furthermore, genomic analysis indicated significant horizontal gene transfer events as defined by presence of prophage genomes, and notably absence of CRISPR defence systems in >70% (40/56) of the strains. In relation to antimicrobial resistance mechanisms, tetracycline resistance genes (tet) and anti-defensins genes (mprF) were consistently detected in silico (tet: 75%; mprF: 100%). However, pre-antibiotic era strain genomes did not encode for tet, thus implying antimicrobial selective pressures in C. perfringens evolutionary history over the past 80 years. This study provides new genomic understanding of this genetically divergent multi-host bacterium, and further expands our knowledge on this medically and veterinary important pathogen. PMID:29312194
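The pangenome figures in the abstract above (11667 genes, 12.6% core) can be understood through a presence/absence summary like the sketch below. The function name, the strict "present in every strain" definition of core (controlled by the hypothetical `core_fraction` parameter) and the toy gene sets are assumptions for illustration; the abstract does not state which core threshold or software the authors used.

```python
from collections import Counter

def pangenome_summary(gene_sets, core_fraction=1.0):
    """Summarise a pangenome from {strain: set_of_gene_families}.

    Returns (pangenome_size, core_size, core_percent). A gene family
    is called core when it occurs in at least `core_fraction` of the
    strains (1.0 = every strain; an illustrative default).
    """
    n_strains = len(gene_sets)
    # Count in how many strains each gene family is present.
    presence = Counter(g for genes in gene_sets.values() for g in set(genes))
    core = [g for g, n in presence.items() if n >= core_fraction * n_strains]
    pan_size = len(presence)
    return pan_size, len(core), 100 * len(core) / pan_size

# Hypothetical toy input: three strains, five gene families.
toy = {
    "strain_A": {"a", "b", "c"},
    "strain_B": {"a", "b", "d"},
    "strain_C": {"a", "e"},
}
toy_summary = pangenome_summary(toy)
```

Read the same way, 12.6% of 11667 gene families corresponds to roughly 1,470 core families, with the remainder being accessory genes found in only some of the 56 strains.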
Probing Genomic Aspects of the Multi-Host Pathogen Clostridium perfringens Reveals Significant Pangenome Diversity, and a Diverse Array of Virulence Factors.

PubMed

Kiu, Raymond; Caim, Shabhonam; Alexander, Sarah; Pachori, Purnima; Hall, Lindsay J

2017-01-01

Clostridium perfringens is an important cause of animal and human infections, however information about the genetic makeup of this pathogenic bacterium is currently limited. In this study, we sought to understand and characterise the genomic variation, pangenomic diversity, and key virulence traits of 56 C. perfringens strains which included 51 public, and 5 newly sequenced and annotated genomes using Whole Genome Sequencing. Our investigation revealed that C. perfringens has an "open" pangenome comprising 11667 genes and 12.6% of core genes, identified as the most divergent single-species Gram-positive bacterial pangenome currently reported. Our computational analyses also defined C. perfringens phylogeny (16S rRNA gene) in relation to some 25 Clostridium species, with C. baratii and C. sardiniense determined to be the closest relatives. Profiling virulence-associated factors confirmed presence of well-characterised C. perfringens-associated exotoxins genes including α-toxin (plc), enterotoxin (cpe), and Perfringolysin O (pfo or pfoA), although interestingly there did not appear to be a close correlation with encoded toxin type and disease phenotype. Furthermore, genomic analysis indicated significant horizontal gene transfer events as defined by presence of prophage genomes, and notably absence of CRISPR defence systems in >70% (40/56) of the strains. In relation to antimicrobial resistance mechanisms, tetracycline resistance genes (tet) and anti-defensins genes (mprF) were consistently detected in silico (tet: 75%; mprF: 100%). However, pre-antibiotic era strain genomes did not encode for tet, thus implying antimicrobial selective pressures in C. perfringens evolutionary history over the past 80 years. This study provides new genomic understanding of this genetically divergent multi-host bacterium, and further expands our knowledge on this medically and veterinary important pathogen.

Generation of an arrayed CRISPR-Cas9 library targeting epigenetic regulators: from high-content screens to in vivo assays

PubMed Central

2017-01-01

The CRISPR-Cas9 system has revolutionized genome engineering, allowing precise modification of DNA in various organisms. The most popular method for conducting CRISPR-based functional screens involves the use of pooled lentiviral libraries in selection screens coupled with next-generation sequencing. Screens employing genome-scale pooled small guide RNA (sgRNA) libraries are demanding, particularly when complex assays are used. Furthermore, pooled libraries are not suitable for microscopy-based high-content screens or for systematic interrogation of protein function. To overcome these limitations and exploit CRISPR-based technologies to comprehensively investigate epigenetic mechanisms, we have generated a focused sgRNA library targeting 450 epigenetic regulators with multiple sgRNAs in human cells. The lentiviral library is available both in an arrayed and pooled format and allows temporally-controlled induction of gene knock-out. Characterization of the library showed high editing activity of most sgRNAs and efficient knock-out at the protein level in polyclonal populations. The sgRNA library can be used for both selection and high-content screens, as well as for targeted investigation of selected proteins without requiring isolation of knock-out clones. Using a variety of functional assays we show that the library is suitable for both in vitro and in vivo applications, representing a unique resource to study epigenetic mechanisms in physiological and pathological conditions. PMID:29327641
Genome-wide common and rare variant analysis provides novel insights into clozapine-associated neutropenia.

PubMed

Legge, S E; Hamshere, M L; Ripke, S; Pardinas, A F; Goldstein, J I; Rees, E; Richards, A L; Leonenko, G; Jorskog, L F; Chambert, K D; Collier, D A; Genovese, G; Giegling, I; Holmans, P; Jonasdottir, A; Kirov, G; McCarroll, S A; MacCabe, J H; Mantripragada, K; Moran, J L; Neale, B M; Stefansson, H; Rujescu, D; Daly, M J; Sullivan, P F; Owen, M J; O'Donovan, M C; Walters, J T R

2017-10-01

The antipsychotic clozapine is uniquely effective in the management of schizophrenia; however, its use is limited by its potential to induce agranulocytosis. The causes of this, and of its precursor neutropenia, are largely unknown, although genetic factors have an important role. We sought risk alleles for clozapine-associated neutropenia in a sample of 66 cases and 5583 clozapine-treated controls, through a genome-wide association study (GWAS), imputed human leukocyte antigen (HLA) alleles, exome array and copy-number variation (CNV) analyses. We then combined associated variants in a meta-analysis with data from the Clozapine-Induced Agranulocytosis Consortium (up to 163 cases and 7970 controls). In the largest combined sample to date, we identified a novel association with rs149104283 (odds ratio (OR)=4.32, P=1.79 × 10⁻⁸), intronic to transcripts of SLCO1B3 and SLCO1B7, members of a family of hepatic transporter genes previously implicated in adverse drug reactions including simvastatin-induced myopathy and docetaxel-induced neutropenia. Exome array analysis identified gene-wide associations of uncommon non-synonymous variants within UBAP2 and STARD9. We additionally provide independent replication of a previously identified variant in HLA-DQB1 (OR=15.6, P=0.015, positive predictive value=35.1%). These results implicate biological pathways through which clozapine may act to cause this serious adverse effect.
Epidemiology of transmissible diseases: Array hybridization and next generation sequencing as universal nucleic acid-mediated typing tools.

PubMed

Michael Dunne, W; Pouseele, Hannes; Monecke, Stefan; Ehricht, Ralf; van Belkum, Alex

2017-09-21

The magnitude of interest in the epidemiology of transmissible human diseases is reflected in the vast number of tools and methods developed recently with the expressed purpose to characterize and track evolutionary changes that occur in agents of these diseases over time. Within the past decade a new suite of such tools has become available with the emergence of the so-called "omics" technologies. Among these, two are exponents of the ongoing genomic revolution. Firstly, high-density nucleic acid probe arrays have been proposed and developed using various chemical and physical approaches. Via hybridization-mediated detection of entire genes or genetic polymorphisms in such genes and intergenic regions these so called "DNA chips" have been successfully applied for distinguishing very closely related microbial species and strains. Second and even more phenomenal, next generation sequencing (NGS) has facilitated the assessment of the complete nucleotide sequence of entire microbial genomes. This technology currently provides the most detailed level of bacterial genotyping and hence allows for the resolution of microbial spread and short-term evolution in minute detail. We will here review the very recent history of these two technologies, sketch their usefulness in the elucidation of the spread and epidemiology of mostly hospital-acquired infections and discuss future developments. Copyright © 2017 Elsevier B.V. All rights reserved.
Genome-wide common and rare variant analysis provides novel insights into clozapine-associated neutropenia

PubMed Central

Legge, S E; Hamshere, M L; Ripke, S; Pardinas, A F; Goldstein, J I; Rees, E; Richards, A L; Leonenko, G; Jorskog, L F; Goldstein, Jacqueline I; Jarskog, L Fredrik; Hilliard, Chris; Alfirevic, Ana; Duncan, Laramie; Fourches, Denis; Huang, Hailiang; Lek, Monkol; Neale, Benjamin M; Ripke, Stephan; Shianna, Kevin; Szatkiewicz, Jin P; Tropsha, Alexander; van den Oord, Edwin JCG; Cascorbi, Ingolf; Dettling, Michael; Gazit, Ephraim; Goff, Donald C; Holden, Arthur L; Kelly, Deanna L; Malhotra, Anil K; Nielsen, Jimmi; Pirmohamed, Munir; Rujescu, Dan; Werge, Thomas; Levy, Deborah L; Josiassen, Richard C; Kennedy, James L; Lieberman, Jeffrey A; Daly, Mark J; Sullivan, Patrick F; Chambert, K D; Collier, D A; Genovese, G; Giegling, I; Holmans, P; Jonasdottir, A; Kirov, G; McCarroll, S A; MacCabe, J H; Mantripragada, K; Moran, J L; Neale, B M; Stefansson, H; Rujescu, D; Daly, M J; Sullivan, P F; Owen, M J; O'Donovan, M C; Walters, J T R

2017-01-01

The antipsychotic clozapine is uniquely effective in the management of schizophrenia; however, its use is limited by its potential to induce agranulocytosis. The causes of this, and of its precursor neutropenia, are largely unknown, although genetic factors have an important role. We sought risk alleles for clozapine-associated neutropenia in a sample of 66 cases and 5583 clozapine-treated controls, through a genome-wide association study (GWAS), imputed human leukocyte antigen (HLA) alleles, exome array and copy-number variation (CNV) analyses. We then combined associated variants in a meta-analysis with data from the Clozapine-Induced Agranulocytosis Consortium (up to 163 cases and 7970 controls). In the largest combined sample to date, we identified a novel association with rs149104283 (odds ratio (OR)=4.32, P=1.79 × 10⁻⁸), intronic to transcripts of SLCO1B3 and SLCO1B7, members of a family of hepatic transporter genes previously implicated in adverse drug reactions including simvastatin-induced myopathy and docetaxel-induced neutropenia. Exome array analysis identified gene-wide associations of uncommon non-synonymous variants within UBAP2 and STARD9. We additionally provide independent replication of a previously identified variant in HLA-DQB1 (OR=15.6, P=0.015, positive predictive value=35.1%). These results implicate biological pathways through which clozapine may act to cause this serious adverse effect. PMID:27400856
Mobile phone radiation causes changes in gene and protein expression in human endothelial cell lines and the response seems to be genome- and proteome-dependent.

PubMed

Nylund, Reetta; Leszczynski, Dariusz

2006-09-01

We have examined in vitro cell response to mobile phone radiation (900 MHz GSM signal) using two variants of human endothelial cell line: EA.hy926 and EA.hy926v1. Gene expression changes were examined in three experiments using cDNA Expression Arrays and protein expression changes were examined in ten experiments using 2-DE and PDQuest software. Obtained results show that gene and protein expression were altered, in both examined cell lines, in response to one hour mobile phone radiation exposure at an average specific absorption rate of 2.8 W/kg. However, the same genes and proteins were differently affected by the exposure in each of the cell lines. This suggests that the cell response to mobile phone radiation might be genome- and proteome-dependent. Therefore, it is likely that different types of cells and from different species might respond differently to mobile phone radiation or might have different sensitivity to this weak stimulus. Our findings might also explain, at least in part, the origin of discrepancies in replication studies between different laboratories.
Bubble-chip analysis of human origin distributions demonstrates on a genomic scale significant clustering into zones and significant association with transcription

PubMed Central

Mesner, Larry D.; Valsakumar, Veena; Karnani, Neerja; Dutta, Anindya; Hamlin, Joyce L.; Bekiranov, Stefan

2011-01-01

We have used a novel bubble-trapping procedure to construct nearly pure and comprehensive human origin libraries from early S- and log-phase HeLa cells, and from log-phase GM06990, a karyotypically normal lymphoblastoid cell line. When hybridized to ENCODE tiling arrays, these libraries illuminated 15.3%, 16.4%, and 21.8% of the genome in the ENCODE regions, respectively. Approximately half of the origin fragments cluster into zones, and their signals are generally higher than those of isolated fragments. Interestingly, initiation events are distributed about equally between genic and intergenic template sequences. While only 13.2% and 14.0% of genes within the ENCODE regions are actually transcribed in HeLa and GM06990 cells, 54.5% and 25.6% of zonal origin fragments overlap transcribed genes, most with activating chromatin marks in their promoters. Our data suggest that cell synchronization activates a significant number of inchoate origins. In addition, HeLa and GM06990 cells activate remarkably different origin populations. Finally, there is only moderate concordance between the log-phase HeLa bubble map and published maps of small nascent strands for this cell line. PMID:21173031
An epigenome-wide study of body mass index and DNA methylation in blood using participants from the Sister Study cohort.

PubMed

Wilson, L E; Harlid, S; Xu, Z; Sandler, D P; Taylor, J A

2017-01-01

The relationship between obesity and chronic disease risk is well-established; the underlying biological mechanisms driving this risk increase may include obesity-related epigenetic modifications. To explore this hypothesis, we conducted a genome-wide analysis of DNA methylation and body mass index (BMI) using data from a subset of women in the Sister Study. The Sister Study is a cohort of 50 884 US women who had a sister with breast cancer but were free of breast cancer themselves at enrollment. Study participants completed examinations which included measurements of height and weight, and provided blood samples. Blood DNA methylation data generated with the Illumina Infinium HumanMethylation27 BeadChip array covering 27,589 CpG sites was available for 871 women from a prior study of breast cancer and DNA methylation. To identify differentially methylated CpG sites associated with BMI, we analyzed this methylation data using robust linear regression with adjustment for age and case status. For those CpGs passing the false discovery rate significance level, we examined the association in a replication set comprised of a non-overlapping group of 187 women from the Sister Study who had DNA methylation data generated using the Infinium HumanMethylation450 BeadChip array. Analysis of this expanded 450 K array identified additional BMI-associated sites which were investigated with targeted pyrosequencing. Four CpG sites reached genome-wide significance (false discovery rate (FDR) q<0.05) in the discovery set and associations for all four were significant at strict Bonferroni correction in the replication set. An additional 23 sites passed FDR in the replication set and five were replicated by pyrosequencing in the discovery set. Several of the genes identified including ANGPT4, RORC, SOCS3, FSD2, XYLT1, ABCG1, STK39, ASB2 and CRHR2 have been linked to obesity and obesity-related chronic diseases. 
Our findings support the hypothesis that obesity-related epigenetic differences are detectable in blood and may be related to risk of chronic disease.
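The abstract above relies on two multiple-testing thresholds: a false discovery rate (FDR) q-value in the discovery set and a Bonferroni correction in the replication set. The following is a minimal sketch of how the two differ, assuming the Benjamini-Hochberg procedure for the FDR step (the abstract does not name the FDR method) and using hypothetical p-values rather than any data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Hypothetical p-values for five CpG sites (illustration only).
p = [0.001, 0.012, 0.02, 0.041, 0.60]
q = benjamini_hochberg(p)

fdr_hits = [i for i, qv in enumerate(q) if qv < 0.05]
bonf_hits = [i for i, pv in enumerate(p) if pv < bonferroni_threshold(0.05, len(p))]
print("FDR q<0.05:", fdr_hits)
print("Bonferroni:", bonf_hits)
```

With these illustrative values the FDR criterion keeps more sites than Bonferroni, which is the usual reason for screening with FDR and then confirming with the stricter threshold in a smaller replication set.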
Genome-Wide Prediction and Validation of Peptides That Bind Human Prosurvival Bcl-2 Proteins

PubMed Central

DeBartolo, Joe; Taipale, Mikko; Keating, Amy E.

2014-01-01

Programmed cell death is regulated by interactions between pro-apoptotic and prosurvival members of the Bcl-2 family. Pro-apoptotic family members contain a weakly conserved BH3 motif that can adopt an alpha-helical structure and bind to a groove on prosurvival partners Bcl-xL, Bcl-w, Bcl-2, Mcl-1 and Bfl-1. Peptides corresponding to roughly 13 reported BH3 motifs have been verified to bind in this manner. Due to their short lengths and low sequence conservation, BH3 motifs are not detected using standard sequence-based bioinformatics approaches. Thus, it is possible that many additional proteins harbor BH3-like sequences that can mediate interactions with the Bcl-2 family. In this work, we used structure-based and data-based Bcl-2 interaction models to find new BH3-like peptides in the human proteome. We used peptide SPOT arrays to test candidate peptides for interaction with one or more of the prosurvival proteins Bcl-xL, Bcl-w, Bcl-2, Mcl-1 and Bfl-1. For the 36 most promising array candidates, we quantified binding to all five human receptors using direct and competition binding assays in solution. All 36 peptides showed evidence of interaction with at least one prosurvival protein, and 22 peptides bound at least one prosurvival protein with a dissociation constant between 1 and 500 nM; many peptides had specificity profiles not previously observed. We also screened the full-length parent proteins of a subset of array-tested peptides for binding to Bcl-xL and Mcl-1. Finally, we used the peptide binding data, in conjunction with previously reported interactions, to assess the affinity and specificity prediction performance of different models. PMID:24967846
GENOME-WIDE GENETIC INTERACTION ANALYSIS OF GLAUCOMA USING EXPERT KNOWLEDGE DERIVED FROM HUMAN PHENOTYPE NETWORKS

PubMed Central

HU, TING; DARABOS, CHRISTIAN; CRICCO, MARIA E.; KONG, EMILY; MOORE, JASON H.

2014-01-01

The large volume of GWAS data poses great computational challenges for analyzing genetic interactions associated with common human diseases. We propose a computational framework for characterizing epistatic interactions among large sets of genetic attributes in GWAS data. We build the human phenotype network (HPN) and focus around a disease of interest. In this study, we use the GLAUGEN glaucoma GWAS dataset and apply the HPN as a biological knowledge-based filter to prioritize genetic variants. Then, we use the statistical epistasis network (SEN) to identify a significant connected network of pairwise epistatic interactions among the prioritized SNPs. These clearly highlight the complex genetic basis of glaucoma. Furthermore, we identify key SNPs by quantifying structural network characteristics. Through functional annotation of these key SNPs using Biofilter, a software accessing multiple publicly available human genetic data sources, we find supporting biomedical evidences linking glaucoma to an array of genetic diseases, proving our concept. We conclude by suggesting hypotheses for a better understanding of the disease. PMID:25592582
Development and Evaluation of a 9K SNP Array for Peach by Internationally Coordinated SNP Detection and Validation in Breeding Germplasm

PubMed Central

Scalabrin, Simone; Gilmore, Barbara; Lawley, Cynthia T.; Gasic, Ksenija; Micheletti, Diego; Rosyara, Umesh R.; Cattonaro, Federica; Vendramin, Elisa; Main, Dorrie; Aramini, Valeria; Blas, Andrea L.; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Troggio, Michela; Sosinski, Bryon; Aranzana, Maria José; Arús, Pere; Iezzoni, Amy; Morgante, Michele; Peace, Cameron

2012-01-01

Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs. The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species. PMID:22536421
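The polymorphism rate quoted above can be checked directly from the counts in the abstract. The sketch below is simple arithmetic; the variable names are illustrative, and the implied span is only a back-of-envelope product of the reported SNP count and average spacing, not a figure given in the abstract.

```python
# Counts and spacing as reported in the abstract.
snps_on_array = 8144
polymorphic_snps = 6869
average_spacing_kb = 26.7

# Share of array SNPs found polymorphic across the 709 accessions.
polymorphic_pct = 100 * polymorphic_snps / snps_on_array

# Rough span implied by SNP count times average spacing (illustration only).
implied_span_mb = snps_on_array * average_spacing_kb / 1000

print(f"{polymorphic_pct:.1f}% polymorphic")
print(f"about {implied_span_mb:.0f} Mb implied span")
```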
Array-Based Comparative Genomic Hybridization Analysis Reveals Chromosomal Copy Number Aberrations Associated with Clinical Outcome in Canine Diffuse Large B-Cell Lymphoma

PubMed Central

Bresolin, Silvia; Marconato, Laura; Comazzi, Stefano; Te Kronnie, Geertruy; Aresu, Luca

2014-01-01

Canine Diffuse Large B-cell Lymphoma (cDLBCL) is an aggressive cancer with variable clinical response. Despite recent attempts by gene expression profiling to identify the dog as a potential animal model for human DLBCL, this tumor remains biologically heterogeneous, with no biomarkers available to predict prognosis. The aim of this work was to identify copy number aberrations (CNAs) by high-resolution array comparative genomic hybridization (aCGH) in 12 dogs with newly diagnosed DLBCL. In a subset of these dogs, the genetic profiles at the end of therapy and at relapse were also assessed. In primary DLBCLs, 90 different genomic imbalances were counted, consisting of 46 gains and 44 losses. Two gains in chr13 were significantly correlated with clinical stage. In addition, specific regions of gains and losses were significantly associated with duration of remission. In primary DLBCLs, individual variability was found; however, 14 recurrent CNAs (>30%) were identified. Losses involving IGK, IGL and IGH were always found, and gains along the length of chr13 and chr31 were often observed (>41%). In these segments, MYC, LDHB, HSF1, KIT and PDGFRα are annotated. At the end of therapy, dogs in remission showed four new CNAs, whereas three new CNAs were observed in dogs at relapse compared with the previous profiles. One ex novo CNA, involving TCR, was present in dogs in remission after therapy, possibly induced by the autologous vaccine. Overall, aCGH identified small CNAs associated with outcome, which, along with future expression studies, may reveal target genes relevant to cDLBCL. PMID:25372838
Molecular characterization of immortalized normal and dysplastic oral cell lines.

PubMed

Dickman, Christopher T D; Towle, Rebecca; Saini, Rajan; Garnis, Cathie

2015-05-01

Cell lines have been developed for modeling cancer and cancer progression. The molecular background of these cell lines is often unknown to those using them to model disease behaviors. As molecular alterations are the ultimate drivers of cell phenotypes, having an understanding of the molecular make-up of these systems is critical for understanding the disease biology modeled. Six immortalized normal, one immortalized dysplasia, one self-immortalized dysplasia, and two primary normal cell lines derived from oral tissues were analyzed for DNA copy number changes and changes in both mRNA and miRNA expression using SMRT-v.2 genome-wide tiling comparative genomic hybridization arrays, Agilent Whole Genome 4x44k expression arrays, and Exiqon V2.M-RT-PCR microRNA Human panels. DNA copy number alterations were detected in both normal and dysplastic immortalized cell lines, as well as in the single non-immortalized dysplastic cell line. These lines were found to have changes in expression of genes related to cell cycle control as well as alterations in miRNAs that are deregulated in clinical oral squamous cell carcinoma tissues. Immortal lines, whether normal or dysplastic, had increased disruption in expression relative to primary lines. All data are available as a public resource. Molecular profiling experiments have identified DNA, mRNA, and miRNA alterations for a panel of normal and dysplastic oral tissue cell lines. These data are a valuable resource to those modeling diseases of the oral mucosa, and give insight into the selection of model cell lines and the interpretation of data from those lines. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
The Utility of Chromosomal Microarray Analysis in Developmental and Behavioral Pediatrics

ERIC Educational Resources Information Center

Beaudet, Arthur L.

2013-01-01

Chromosomal microarray analysis (CMA) has emerged as a powerful new tool to identify genomic abnormalities associated with a wide range of developmental disabilities including congenital malformations, cognitive impairment, and behavioral abnormalities. CMA includes array comparative genomic hybridization (CGH) and single nucleotide polymorphism…
Selecting sequence variants to improve genomic predictions for dairy cattle

USDA-ARS's Scientific Manuscript database

Millions of genetic variants have been identified by population-scale sequencing projects, but subsets are needed for routine genomic predictions or to include on genotyping arrays. Methods of selecting sequence variants were compared using both simulated sequence genotypes and actual data from run ...
High throughput SNP discovery and genotyping in hexaploid wheat.

PubMed

Rimbert, Hélène; Darrier, Benoît; Navarro, Julien; Kitt, Jonathan; Choulet, Frédéric; Leveugle, Magalie; Duarte, Jorge; Rivière, Nathalie; Eversole, Kellye; Le Gouis, Jacques; Davassi, Alessandro; Balfourier, François; Le Paslier, Marie-Christine; Berard, Aurélie; Brunel, Dominique; Feuillet, Catherine; Poncet, Charles; Sourdille, Pierre; Paux, Etienne

2018-01-01

Because of their abundance and their amenability to high-throughput genotyping techniques, Single Nucleotide Polymorphisms (SNPs) are powerful tools for efficient genetics and genomics studies, including characterization of genetic resources, genome-wide association studies and genomic selection. In wheat, most of the previous SNP discovery initiatives targeted the coding fraction, leaving almost 98% of the wheat genome largely unexploited. Here we report on the use of whole-genome resequencing data from eight wheat lines to mine for SNPs in the genic, the repetitive and non-repetitive intergenic fractions of the wheat genome. Eventually, we identified 3.3 million SNPs, 49% being located on the B-genome, 41% on the A-genome and 10% on the D-genome. We also describe the development of the TaBW280K high-throughput genotyping array containing 280,226 SNPs. Performance of this chip was examined by genotyping a set of 96 wheat accessions representing the worldwide diversity. Sixty-nine percent of the SNPs can be efficiently scored, half of them showing a diploid-like clustering. The TaBW280K was proven to be a very efficient tool for diversity analyses, as well as for breeding as it can discriminate between closely related elite varieties. Finally, the TaBW280K array was used to genotype a population derived from a cross between Chinese Spring and Renan, leading to the construction of a dense genetic map comprising 83,721 markers. The results described here will provide the wheat community with powerful tools for both basic and applied research.
High-resolution single-nucleotide polymorphism array-profiling in myeloproliferative neoplasms identifies novel genomic aberrations

PubMed Central

Stegelmann, Frank; Bullinger, Lars; Griesshammer, Martin; Holzmann, Karlheinz; Habdank, Marianne; Kuhn, Susanne; Maile, Carmen; Schauer, Stefanie; Döhner, Hartmut; Döhner, Konstanze

2010-01-01

Single-nucleotide polymorphism arrays allow for genome-wide profiling of copy-number alterations and copy-neutral runs of homozygosity at high resolution. To identify novel genetic lesions in myeloproliferative neoplasms, a large series of 151 clinically well characterized patients was analyzed in our study. Copy-number alterations were rare in essential thrombocythemia and polycythemia vera. In contrast, approximately one third of myelofibrosis patients exhibited small genomic losses (less than 5 Mb). In 2 secondary myelofibrosis cases the tumor suppressor gene NF1 in 17q11.2 was affected. Sequencing analyses revealed a mutation in the remaining NF1 allele of one patient. In terms of copy-neutral aberrations, no chromosomes other than 9p were recurrently affected. In conclusion, novel genomic aberrations were identified in our study, in particular in patients with myelofibrosis. Further analyses on single-gene level are necessary to uncover the mechanisms that are involved in the pathogenesis of myeloproliferative neoplasms. PMID:20015882
Complete genome sequence of the xylan-degrading subseafloor bacterium Microcella alkaliphila JAM-AC0309.

PubMed

Kurata, Atsushi; Hirose, Yuu; Misawa, Naomi; Wakazuki, Sachiko; Kishimoto, Noriaki; Kobayashi, Tohru

2016-03-10

Here we report the complete genome sequence of Microcella alkaliphila JAM-AC0309, which was newly isolated from the deep subseafloor core sediment from offshore of the Shimokita Peninsula of Japan. An array of genes related to utilization of xylan in this bacterium was identified by whole genome analysis. Copyright © 2016 Elsevier B.V. All rights reserved.
High-resolution mapping and sequence analysis of 597 cDNA clones transcribed from the 1 Mb region in human chromosome 4p16.3 containing Huntington disease gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hadano, S.; Ishida, Y.; Tomiyasu, H.

1994-09-01

To complete a transcription map of the 1 Mb region in human chromosome 4p16.3 containing the Huntington disease (HD) gene, the isolation of cDNA clones is being performed throughout. Our method relies on a direct screening of the cDNA libraries probed with single copy microclones from 3 YAC clones spanning 1 Mbp of the HD gene region. YAC-DNAs were isolated by a preparative pulsed-field gel electrophoresis, amplified by both a single unique primer (SUP)-PCR and a linker ligation PCR, and 6 microclone-DNA libraries were generated. Then, 8,640 microclones from these libraries were independently amplified by PCR, and arrayed onto the membranes. 800-900 microclones that were not cross-hybridized with total human and yeast genomic DNA, YAC vector DNA, and ribosomal cDNA on a dot hybridization (putatively carrying single copy sequences) were pooled to make 9 probe pools. A total of approximately 1.8 × 10⁷ plaques from the human brain cDNA libraries was screened with 9 pool-probes, and then 672 positive cDNA clones were obtained. So far, 597 cDNA clones were defined and arrayed onto a map of the 1 Mbp of the HD gene region by hybridization with HD region-specific cosmid contigs and YAC clones. Further characterization including a DNA sequencing and Northern blot analysis is currently underway.
An integrated bioinformatics infrastructure essential for advancing pharmacogenomics and personalized medicine in the context of the FDA's Critical Path Initiative.

PubMed

Tong, Weida; Harris, Stephen C; Fang, Hong; Shi, Leming; Perkins, Roger; Goodsaid, Federico; Frueh, Felix W

2007-01-01

Pharmacogenomics (PGx) is identified in the FDA Critical Path document as a major opportunity for advancing medical product development and personalized medicine. An integrated bioinformatics infrastructure for use in FDA data review is crucial to realize the benefits of PGx for public health. We have developed an integrated bioinformatics tool, called ArrayTrack, for managing, analyzing and interpreting genomic and other biomarker data (e.g. proteomic and metabolomic data). ArrayTrack is a highly flexible and robust software platform that can evolve with technological advances and changing user needs. ArrayTrack is used in the routine review of genomic data submitted to the FDA; here, three hypothetical examples of its use in the Voluntary eXploratory Data Submission (VXDS) program are illustrated. © Published by Elsevier Ltd.
Genomic Organization of the Drosophila Telomere Retrotransposable Elements

DOE Office of Scientific and Technical Information (OSTI.GOV)

George, J.A.; DeBaryshe, P.G.; Traverse, K.L.

2006-10-16

The emerging sequence of the heterochromatic portion of the Drosophila melanogaster genome, with the most recent update of euchromatic sequence, gives the first genome-wide view of the chromosomal distribution of the telomeric retrotransposons, HeT-A, TART, and Tahre. As expected, these elements are entirely excluded from euchromatin, although sequence fragments of HeT-A and TART 3′ untranslated regions are found in nontelomeric heterochromatin on the Y chromosome. The proximal ends of HeT-A/TART arrays appear to be a transition zone because only here do other transposable elements mix in the array. The sharp distinction between the distribution of telomeric elements and that of other transposable elements suggests that chromatin structure is important in telomere element localization. Measurements reported here show (1) D. melanogaster telomeres are very long, in the size range reported for inbred mouse strains (averaging 46 kb per chromosome end in Drosophila stock 2057). As in organisms with telomerase, their length varies depending on genotype. There is also slight under-replication in polytene nuclei. (2) Surprisingly, the relationship between the number of HeT-A and TART elements is not stochastic but is strongly correlated across stocks, supporting the idea that the two elements are interdependent. Although currently assembled portions of the HeT-A/TART arrays are from the most-proximal part of long arrays, approximately 61% of the total HeT-A sequence in these regions consists of intact, potentially active elements with little evidence of sequence decay, making it likely that the content of the telomere arrays turns over more extensively than has been thought.

Genome-wide DNA methylation profiling in infants born to gestational diabetes mellitus.

PubMed

Weng, Xiaoling; Liu, Fatao; Zhang, Hong; Kan, Mengyuan; Wang, Ting; Dong, Mingyue; Liu, Yun

2018-03-26

Offspring exposed to gestational diabetes mellitus (GDM) are at a high risk for metabolic diseases. The mechanisms behind the association between offspring exposed to GDM in utero and an increased risk of health consequences later in life remain unclear. The aim of this study was to clarify the changes in methylation levels in the foetuses of women with GDM and to explore the possible mechanisms linking maternal GDM with a high risk of metabolic diseases in offspring later in life. A genome-wide comparative methylome analysis on the umbilical cord blood of infants born to 30 women with GDM and 33 women with normal pregnancy was performed using Infinium HumanMethylation 450 BeadChip assays. A quantitative methylation analysis of 18 CpG dinucleotides was verified in the validation umbilical cord blood samples from 102 newborns exposed to GDM and 103 newborns who experienced normal pregnancy by MassARRAY EpiTYPER. A total of 4485 differentially methylated sites (DMSs), including 2150 hypermethylated sites and 2335 hypomethylated sites, with a mean β-value difference of >0.05, were identified by the 450k array. Good agreement was observed between the MassARRAY validation data and the 450k array data (R² > 0.99; P < 0.0001). Thirty-seven CpGs (representing 20 genes) with a β-value difference of >0.15 between the GDM and healthy groups were identified and showed potential as clinical biomarkers for GDM. "hsa04940: Type I diabetes mellitus" was the most significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, with a P-value = 3.20E-07 and 1.36E-02 in the hypermethylated and hypomethylated gene pathway enrichment analyses, respectively. In the Gene Ontology (GO) pathway analyses, immune MHC-related pathways and neuron development-related pathways were significantly enriched.
Our results suggest that GDM has epigenetic effects on genes that are preferentially involved in the Type I diabetes mellitus pathway, immune MHC (major histocompatibility complex)-related pathways and neuron development-related pathways, with consequences on fetal growth and development, and provide supportive evidence that DNA methylation is involved in fetal metabolic programming. Copyright © 2018. Published by Elsevier B.V.
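As an illustration only, the thresholding described in the abstract above (a mean β-value difference of >0.05 between groups, split into hypermethylated and hypomethylated sites) can be sketched as follows. The function name, the site identifiers, and the β-values are hypothetical, and the sketch leaves out the statistical testing and normalization that a real 450k analysis would need; it is not the authors' pipeline.

```python
from statistics import mean


def classify_site(case_betas, control_betas, threshold=0.05):
    """Label one CpG site by the difference in group mean beta-values.

    Returns a (label, delta) pair, where delta = mean(case) - mean(control).
    The 0.05 default mirrors the cut-off quoted in the abstract.
    """
    delta = mean(case_betas) - mean(control_betas)
    if delta > threshold:
        return "hyper", delta
    if delta < -threshold:
        return "hypo", delta
    return "unchanged", delta


# Hypothetical beta-values (fraction methylated, 0..1) for three made-up sites:
# each entry is (GDM-exposed samples, control samples).
sites = {
    "cg_example_1": ([0.62, 0.66, 0.70], [0.50, 0.52, 0.54]),
    "cg_example_2": ([0.20, 0.22, 0.24], [0.40, 0.42, 0.44]),
    "cg_example_3": ([0.80, 0.81, 0.82], [0.79, 0.80, 0.81]),
}
labels = {name: classify_site(case, ctrl)[0] for name, (case, ctrl) in sites.items()}
print(labels)

# Consistency check on the counts reported in the abstract:
# hypermethylated + hypomethylated sites should equal the total.
assert 2150 + 2335 == 4485
```

Raising `threshold` to 0.15 would correspond to the stricter cut-off the abstract uses for the 37 candidate biomarker CpGs.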
Replicative age induces mitotic recombination in the ribosomal RNA gene cluster of Saccharomyces cerevisiae.

PubMed

Lindstrom, Derek L; Leverich, Christina K; Henderson, Kiersten A; Gottschling, Daniel E

2011-03-01

Somatic mutations contribute to the development of age-associated disease. In earlier work, we found that, at high frequency, aging Saccharomyces cerevisiae diploid cells produce daughters without mitochondrial DNA, leading to loss of respiration competence and increased loss of heterozygosity (LOH) in the nuclear genome. Here we used the recently developed Mother Enrichment Program to ask whether aging cells that maintain the ability to produce respiration-competent daughters also experience increased genomic instability. We discovered that this population exhibits a distinct genomic instability phenotype that primarily affects the repeated ribosomal RNA gene array (rDNA array). As diploid cells passed their median replicative life span, recombination rates between rDNA arrays on homologous chromosomes progressively increased, resulting in mutational events that generated LOH at >300 contiguous open reading frames on the right arm of chromosome XII. We show that, while these recombination events were dependent on the replication fork block protein Fob1, the aging process that underlies this phenotype is Fob1-independent. Furthermore, we provide evidence that this aging process is not driven by mechanisms that modulate rDNA recombination in young cells, including loss of cohesion within the rDNA array or loss of Sir2 function. Instead, we suggest that the age-associated increase in rDNA recombination is a response to increasing DNA replication stress generated in aging cells.
Canine urothelial carcinoma: genomically aberrant and comparatively relevant

PubMed Central

Shapiro, S. G.; Raghunath, S.; Williams, C.; Motsinger-Reif, A. A.; Cullen, J. M.; Liu, T.; Albertson, D.; Ruvolo, M.; Lucas, A. Bergstrom; Jin, J.; Knapp, D. W.; Schiffman, J. D.

2015-01-01

Urothelial carcinoma (UC), also referred to as transitional cell carcinoma (TCC), is the most common bladder malignancy in both human and canine populations. In human UC, numerous studies have demonstrated the prevalence of chromosomal imbalances. Although the histopathology of the disease is similar in both species, studies evaluating the genomic profile of canine UC are lacking, limiting the discovery of key comparative molecular markers associated with driving UC pathogenesis. In the present study, we evaluated 31 primary canine UC biopsies by oligonucleotide array comparative genomic hybridization (oaCGH). Results highlighted the presence of three highly recurrent numerical aberrations: gain of dog chromosome (CFA) 13 and 36 and loss of CFA 19. Regional gains of CFA 13 and 36 were present in 97% and 84% of cases, respectively, and losses on CFA 19 were present in 77% of cases. Fluorescence in situ hybridization (FISH), using targeted bacterial artificial chromosome (BAC) clones and custom Agilent SureFISH probes, was performed to detect and quantify these regions in paraffin-embedded biopsy sections and urine-derived urothelial cells. The data indicate that these three aberrations are potentially diagnostic of UC. Comparison of our canine oaCGH data with that of 285 human cases identified a series of shared copy number aberrations. Using an informatics approach to interrogate the frequency of copy number aberrations across both species, we identified those that had the highest joint probability of association with UC. The most significant joint region contained the gene PABPC1, which should be considered further for its role in UC progression. In addition, cross-species filtering of genome-wide copy number data highlighted several genes as high-profile candidates for further analysis, including CDKN2A, S100A8/9, and LRP1B. 
We propose that these common aberrations are indicative of an evolutionarily conserved mechanism of pathogenesis and harbor genes key to urothelial neoplasia, warranting investigation for diagnostic, prognostic, and therapeutic applications. PMID:25783786
Genome characterization and population genetic structure of the zoonotic pathogen, Streptococcus canis

PubMed Central

2012-01-01

Background Streptococcus canis is an important opportunistic pathogen of dogs and cats that can also infect a wide range of additional mammals including cows where it can cause mastitis. It is also an emerging human pathogen. Results Here we provide characterization of the first genome sequence for this species, strain FSL S3-227 (milk isolate from a cow with an intra-mammary infection). A diverse array of putative virulence factors was encoded by the S. canis FSL S3-227 genome. Approximately 75% of these gene sequences were homologous to known Streptococcal virulence factors involved in invasion, evasion, and colonization. Present in the genome are multiple potentially mobile genetic elements (MGEs) [plasmid, phage, integrative conjugative element (ICE)] and comparison to other species provided convincing evidence for lateral gene transfer (LGT) between S. canis and two additional bovine mastitis causing pathogens (Streptococcus agalactiae, and Streptococcus dysgalactiae subsp. dysgalactiae), with this transfer possibly contributing to host adaptation. Population structure among isolates obtained from Europe and the USA [bovine = 56, canine = 26, and feline = 1] was explored. Ribotyping of all isolates and multi locus sequence typing (MLST) of a subset of the isolates (n = 45) detected significant differentiation between bovine and canine isolates (Fisher exact test: P = 0.0000 [ribotypes], P = 0.0030 [sequence types]), suggesting possible host adaptation of some genotypes. Concurrently, the ancestral clonal complex (54% of isolates) occurred in many tissue types, all hosts, and all geographic locations suggesting the possibility of a wide and diverse niche. Conclusion This study provides evidence highlighting the importance of LGT in the evolution of the bacterium S. canis, specifically, its possible role in host adaptation and acquisition of virulence factors. Furthermore, recent LGT detected between S. canis and human bacteria (Streptococcus urinalis) is cause for concern, as it highlights the possibility for continued acquisition of human virulence factors for this emerging zoonotic pathogen. PMID:23244770
Canine urothelial carcinoma: genomically aberrant and comparatively relevant.

PubMed

Shapiro, S G; Raghunath, S; Williams, C; Motsinger-Reif, A A; Cullen, J M; Liu, T; Albertson, D; Ruvolo, M; Bergstrom Lucas, A; Jin, J; Knapp, D W; Schiffman, J D; Breen, M

2015-06-01

Urothelial carcinoma (UC), also referred to as transitional cell carcinoma (TCC), is the most common bladder malignancy in both human and canine populations. In human UC, numerous studies have demonstrated the prevalence of chromosomal imbalances. Although the histopathology of the disease is similar in both species, studies evaluating the genomic profile of canine UC are lacking, limiting the discovery of key comparative molecular markers associated with driving UC pathogenesis. In the present study, we evaluated 31 primary canine UC biopsies by oligonucleotide array comparative genomic hybridization (oaCGH). Results highlighted the presence of three highly recurrent numerical aberrations: gain of dog chromosome (CFA) 13 and 36 and loss of CFA 19. Regional gains of CFA 13 and 36 were present in 97 % and 84 % of cases, respectively, and losses on CFA 19 were present in 77 % of cases. Fluorescence in situ hybridization (FISH), using targeted bacterial artificial chromosome (BAC) clones and custom Agilent SureFISH probes, was performed to detect and quantify these regions in paraffin-embedded biopsy sections and urine-derived urothelial cells. The data indicate that these three aberrations are potentially diagnostic of UC. Comparison of our canine oaCGH data with that of 285 human cases identified a series of shared copy number aberrations. Using an informatics approach to interrogate the frequency of copy number aberrations across both species, we identified those that had the highest joint probability of association with UC. The most significant joint region contained the gene PABPC1, which should be considered further for its role in UC progression. In addition, cross-species filtering of genome-wide copy number data highlighted several genes as high-profile candidates for further analysis, including CDKN2A, S100A8/9, and LRP1B. 
We propose that these common aberrations are indicative of an evolutionarily conserved mechanism of pathogenesis and harbor genes key to urothelial neoplasia, warranting investigation for diagnostic, prognostic, and therapeutic applications.
FOXM1 Upregulation Is an Early Event in Human Squamous Cell Carcinoma and it Is Enhanced by Nicotine during Malignant Transformation

PubMed Central

Gemenetzidis, Emilios; Bose, Amrita; Riaz, Adeel M.; Chaplin, Tracy; Young, Bryan D.; Ali, Muhammad; Sugden, David; Thurlow, Johanna K.; Cheong, Sok-Ching; Teo, Soo-Hwang; Wan, Hong; Waseem, Ahmad; Parkinson, Eric K.; Fortune, Farida; Teh, Muy-Teck

2009-01-01

Background Cancer associated with smoking and drinking remains a serious health problem worldwide. The survival of patients is very poor due to the lack of effective early biomarkers. FOXM1 overexpression is linked to the majority of human cancers but its mechanism remains unclear in head and neck squamous cell carcinoma (HNSCC). Methodology/Principal Findings FOXM1 mRNA and protein expression were investigated in four independent cohorts (total 75 patients) consisting of normal, premalignant and HNSCC tissues and cells using quantitative PCR (qPCR), expression microarray, immunohistochemistry and immunocytochemistry. Effect of putative oral carcinogens on FOXM1 transcriptional activity was dose-dependently assayed and confirmed using a FOXM1-specific luciferase reporter system, qPCR, immunoblotting and short-hairpin RNA interference. Genome-wide single nucleotide polymorphism (SNP) array was used to ‘trace’ the genomic instability signature pattern in 8 clonal lines of FOXM1-induced malignant human oral keratinocytes. Furthermore, acute FOXM1 upregulation in primary oral keratinocytes directly induced genomic instability. We have shown for the first time that overexpression of FOXM1 precedes HNSCC malignancy. Screening putative carcinogens in human oral keratinocytes surprisingly showed that nicotine, which is not perceived to be a human carcinogen, directly induced FOXM1 mRNA, protein stabilisation and transcriptional activity at concentrations relevant to tobacco chewers. Importantly, nicotine also augmented FOXM1-induced transformation of human oral keratinocytes. A centrosomal protein CEP55 and a DNA helicase/putative stem cell marker HELLS, both located within a consensus locus (10q23), were found to be novel targets of FOXM1 and their expression correlated tightly with HNSCC progression. Conclusions/Significance This study warns of the potential co-carcinogenic effect of nicotine in tobacco replacement therapies.
We hypothesise that aberrant upregulation of FOXM1 may be inducing genomic instability through a program of malignant transformation involving the activation of CEP55 and HELLS which may facilitate aberrant mitosis and epigenetic modifications. Our finding that FOXM1 is upregulated early during oral cancer progression renders FOXM1 an attractive diagnostic biomarker for early cancer detection and its candidate mechanistic targets, CEP55 and HELLS, as indicators of malignant conversion and progression. PMID:19287496
A functional genomics tool for the Pacific bluefin tuna: Development of a 44K oligonucleotide microarray from whole-genome sequencing data for global transcriptome analysis.

PubMed

Yasuike, Motoshige; Fujiwara, Atushi; Nakamura, Yoji; Iwasaki, Yuki; Nishiki, Issei; Sugaya, Takuma; Shimizu, Akio; Sano, Motohiko; Kobayashi, Takanori; Ototake, Mitsuru

2016-02-01

Bluefin tunas are one of the most important fishery resources worldwide. Because of high market values, bluefin tuna farming has been rapidly growing during recent years. At present, the most common form of the tuna farming is based on the stocking of wild-caught fish. Therefore, concerns have been raised about the negative impact of the tuna farming on wild stocks. Recently, the Pacific bluefin tuna (PBT), Thunnus orientalis, has succeeded in completing the reproduction cycle under aquaculture conditions, but production bottlenecks remain to be solved because of very little biological information on bluefin tunas. Functional genomics approaches promise to rapidly increase our knowledge on biological processes in the bluefin tuna. Here, we describe the development of the first 44K PBT oligonucleotide microarray (oligo-array), based on whole-genome shotgun (WGS) sequencing and large-scale expressed sequence tags (ESTs) data. In addition, we also introduce an initial 44K PBT oligo-array experiment using in vitro grown peripheral blood leukocytes (PBLs) stimulated with immunostimulants such as lipopolysaccharide (LPS: a cell wall component of Gram-negative bacteria) or polyinosinic:polycytidylic acid (poly I:C: a synthetic mimic of viral infection). This pilot 44K PBT oligo-array analysis successfully addressed distinct immune processes between LPS- and poly I:C- stimulated PBLs. Thus, we expect that this oligo-array will provide an excellent opportunity to analyze global gene expression profiles for a better understanding of diseases and stress, as well as for reproduction, development and influence of nutrition on tuna aquaculture production. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Genome-wide analysis of endogenously expressed ZEB2 binding sites reveals inverse correlations between ZEB2 and GalNAc-transferase GALNT3 in human tumors.

PubMed

Balcik-Ercin, Pelin; Cetin, Metin; Yalim-Camci, Irem; Odabas, Gorkem; Tokay, Nurettin; Sayan, A Emre; Yagci, Tamer

2018-03-07

ZEB2 is a transcriptional repressor that regulates epithelial-to-mesenchymal transition (EMT) through binding to bipartite E-box motifs in gene regulatory regions. Despite the abundant presence of E-boxes within the human genome and the multiplicity of pathophysiological processes regulated during ZEB2-induced EMT, only a small fraction of ZEB2 targets has been identified so far. Hence, we explored genome-wide ZEB2 binding by chromatin immunoprecipitation-sequencing (ChIP-seq) under endogenous ZEB2 expression conditions. For ChIP-Seq we used an anti-ZEB2 monoclonal antibody, clone 6E5, in SNU398 hepatocellular carcinoma cells exhibiting a high endogenous ZEB2 expression. The ChIP-Seq targets were validated using ChIP-qPCR, whereas ZEB2-dependent expression of target genes was assessed by RT-qPCR and Western blotting in shRNA-mediated ZEB2 silenced SNU398 cells and doxycycline-induced ZEB2 overexpressing colorectal carcinoma DLD1 cells. Changes in target gene expression were also assessed using primary human tumor cDNA arrays in conjunction with RT-qPCR. Additional differential expression and correlation analyses were performed using expO and Human Protein Atlas datasets. Over 500 ChIP-Seq positive genes were annotated, and intervals related to these genes were found to include the ZEB2 binding motif CACCTG according to TOMTOM motif analysis in the MEME Suite database. Assessment of ZEB2-dependent expression of target genes in ZEB2-silenced SNU398 cells and ZEB2-induced DLD1 cells revealed that the GALNT3 gene serves as a ZEB2 target with the highest, but inversely correlated, expression level. Remarkably, GALNT3 also exhibited the highest enrichment in the ChIP-qPCR validation assays. Through the analyses of primary tumor cDNA arrays and expO datasets a significant differential expression and a significant inverse correlation between ZEB2 and GALNT3 expression were detected in most of the tumors. 
We also explored ZEB2 and GALNT3 protein expression using the Human Protein Atlas dataset and, again, observed an inverse correlation in all analyzed tumor types, except malignant melanoma. In contrast to a generally negative or weak ZEB2 expression, we found that most tumor tissues exhibited a strong or moderate GALNT3 expression. Our observation that ZEB2 negatively regulates a GalNAc-transferase (GALNT3) that is involved in O-glycosylation adds another layer of complexity to the role of ZEB2 in cancer progression and metastasis. Proteins glycosylated by GALNT3 may be exploited as novel diagnostics and/or therapeutic targets.
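The abstract above reports that ChIP-Seq intervals contained the ZEB2 binding motif CACCTG. A minimal sketch of what scanning a sequence for that E-box on both strands could look like is given below. It uses exact string matching only; it is not the peak-calling or TOMTOM/MEME workflow the authors describe, and the function names and the example sequence are made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(pairs[base] for base in reversed(seq))


def find_motif(seq, motif="CACCTG"):
    """Return (0-based start, strand) for each exact motif hit on either strand.

    A '-' hit means the reverse complement of the motif (CAGGTG for CACCTG)
    occurs at that position of the given sequence.
    """
    seq = seq.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc and rc != motif:
            hits.append((i, "-"))
    return hits


# Made-up sequence with one forward and one reverse-strand E-box.
example = "TTCACCTGAACAGGTGTT"
print(find_motif(example))  # [(2, '+'), (10, '-')]
```

Real motif analysis usually scores position weight matrices rather than exact strings, so this sketch would miss degenerate sites.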
Genomewide single nucleotide polymorphism discovery in Atlantic salmon (Salmo salar): validation in wild and farmed American and European populations.

PubMed

Yáñez, J M; Naswa, S; López, M E; Bassini, L; Correa, K; Gilbey, J; Bernatchez, L; Norris, A; Neira, R; Lhorente, J P; Schnable, P S; Newman, S; Mileham, A; Deeb, N; Di Genova, A; Maass, A

2016-07-01

A considerable number of single nucleotide polymorphisms (SNPs) are required to elucidate genotype-phenotype associations and determine the molecular basis of important traits. In this work, we carried out de novo SNP discovery accounting for both genome duplication and genetic variation from American and European salmon populations. A total of 9 736 473 nonredundant SNPs were identified across a set of 20 fish by whole-genome sequencing. After applying six bioinformatic filtering steps, 200 K SNPs were selected to develop an Affymetrix Axiom® myDesign Custom Array. This array was used to genotype 480 fish representing wild and farmed salmon from Europe, North America and Chile. A total of 159 099 (79.6%) SNPs were validated as high quality based on clustering properties. A total of 151 509 validated SNPs showed a unique position in the genome. When comparing these SNPs against 238 572 markers currently available in two other Atlantic salmon arrays, only 4.6% of the SNPs overlapped with the panel developed in this study. This novel high-density SNP panel will be very useful for the dissection of economically and ecologically relevant traits, enhancing breeding programmes through genomic selection as well as supporting genetic studies in both wild and farmed populations of Atlantic salmon using high-resolution genomewide information. © 2016 John Wiley & Sons Ltd.
Construction and Annotation of a High Density SNP Linkage Map of the Atlantic Salmon (Salmo salar) Genome.

PubMed

Tsai, Hsin Y; Robledo, Diego; Lowe, Natalie R; Bekaert, Michael; Taggart, John B; Bron, James E; Houston, Ross D

2016-07-07

High density linkage maps are useful tools for fine-scale mapping of quantitative trait loci, and characterization of the recombination landscape of a species' genome. Genomic resources for Atlantic salmon (Salmo salar) include a well-assembled reference genome, and high density single nucleotide polymorphism (SNP) arrays. Our aim was to create a high density linkage map, and to align it with the reference genome assembly. Over 96,000 SNPs were mapped and ordered on the 29 salmon linkage groups using a pedigreed population comprising 622 fish from 60 nuclear families, all genotyped with the 'ssalar01' high density SNP array. The number of SNPs per group showed a high positive correlation with physical chromosome length (r = 0.95). While the order of markers on the genetic and physical maps was generally consistent, areas of discrepancy were identified. Approximately 6.5% of the previously unmapped reference genome sequence was assigned to chromosomes using the linkage map. Male recombination rate was lower than females across the vast majority of the genome, but with a notable peak in subtelomeric regions. Finally, using RNA-Seq data to annotate the reference genome, the mapped SNPs were categorized according to their predicted function, including annotation of ∼2500 putative nonsynonymous variants. The highest density SNP linkage map for any salmonid species has been created, annotated, and integrated with the Atlantic salmon reference genome assembly. This map highlights the marked heterochiasmy of salmon, and provides a useful resource for salmonid genetics and genomics research. Copyright © 2016 Tsai et al.
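The abstract above summarizes the relationship between SNPs per linkage group and physical chromosome length with a Pearson correlation (r = 0.95). For readers unfamiliar with the statistic, the following sketch computes it from first principles. The chromosome lengths and SNP counts are hypothetical placeholders, not data from the study, and `pearson_r` is an illustrative helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical values: physical length (Mb) and mapped SNP count
# for five made-up linkage groups.
lengths_mb = [160, 120, 95, 80, 60]
snp_counts = [5200, 4100, 3000, 2900, 1800]
print(round(pearson_r(lengths_mb, snp_counts), 3))
```

A value near 1 indicates that longer chromosomes carry proportionally more mapped markers, which is the pattern the abstract reports.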
Incoming human papillomavirus 16 genome is lost in PML protein-deficient HaCaT keratinocytes.

PubMed

Bienkowska-Haba, Malgorzata; Luszczek, Wioleta; Keiffer, Timothy R; Guion, Lucile G M; DiGiuseppe, Stephen; Scott, Rona S; Sapp, Martin

2017-05-01

Human papillomaviruses (HPVs) target promyelocytic leukemia (PML) nuclear bodies (NBs) during infectious entry and PML protein is important for efficient transcription of the incoming viral genome. However, the transcriptional downregulation was shown to be promoter-independent in that heterologous promoters delivered by papillomavirus particles were also affected. To further investigate the role of PML protein in HPV entry, we used small hairpin RNA to knockdown PML protein in HaCaT keratinocytes. Confirming previous findings, PML knockdown in HaCaT cells reduced HPV16 transcript levels significantly following infectious entry without impairing binding and trafficking. However, when we quantified steady-state levels of pseudogenomes in interphase cells, we found strongly reduced genome levels compared with parental HaCaT cells. Because nuclear delivery was comparable in both cell lines, we conclude that the viral pseudogenome must be removed after successful nuclear delivery. Transcriptome analysis by gene array revealed that PML knockdown in clonal HaCaT cells was associated with a constitutive interferon response. Abrogation of JAK1/2 signaling prevented genome loss but did not restore viral transcription. In contrast, knockdown of PML protein in HeLa cells did not affect HPV genome delivery and transcription. HeLa cells are transformed by HPV18 oncogenes E6 and E7, which have been shown to interfere with the JAK/Stat signaling pathway. Our data imply that PML NBs protect incoming HPV genomes. Furthermore, they provide evidence that PML NBs are key regulators of the innate immune response in keratinocytes. Promyelocytic leukemia nuclear bodies (PML NBs) are important for antiviral defense. Many DNA viruses target these subnuclear structures and reorganize them. Reorganization of PML NBs by viral proteins is important for establishment of infection. In contrast, HPVs require the presence of PML protein for efficient transcription of the incoming viral genome.
Our finding that PML protein prevents the loss of HPV genome following infection implies that the host cell may be able to recognize chromatinized HPV genome or the associated capsid proteins. A constitutively active interferon response in absence of PML protein suggests that PML NBs are key regulators of the innate immune response in keratinocytes. © 2016 John Wiley & Sons Ltd.
Chromosomal imbalances are associated with outcome of Helicobacter pylori eradication in t(11;18)(q21;q21) negative gastric mucosa-associated lymphoid tissue lymphomas.

PubMed

Fukuhara, Noriko; Nakamura, Tsuneya; Nakagawa, Masao; Tagawa, Hiroyuki; Takeuchi, Ichiro; Yatabe, Yasushi; Morishima, Yasuo; Nakamura, Shigeo; Seto, Masao

2007-08-01

Approximately 70% of gastric mucosa-associated lymphoid tissue (MALT) lymphomas can be successfully treated with H. pylori eradication. The translocation t(11;18)(q21;q21) characteristic of MALT lymphoma is recognized as a marker for H. pylori independency, but this marker is found in only a half of the MALT lymphomas resistant to H. pylori eradication. Detailed analyses of the genomic features of eradication resistant as well as responsive groups are important for understanding their molecular basis. We performed array-based comparative genomic hybridization (array-CGH) for 29 gastric MALT lymphomas treated with H. pylori eradication. These comprised ten cases of t(11;18) positive MALT, nine cases of t(11;18) negative MALT with H. pylori dependency, and ten cases of t(11;18) negative MALT with H. pylori independency. Array-CGH analysis demonstrated that no significant genetic alterations were found in t(11;18) positive MALT lymphomas, but numerous genomic alterations were detected in t(11;18) negative MALT lymphomas. Many of these alterations were similar to those found in diffuse large B-cell lymphoma with trisomy 3 being the most recurrent alteration. Within the t(11;18) negative MALT lymphoma without large cell components group, genomic imbalances occurred more frequently in the H. pylori independent than in the H. pylori dependent group (P = 0.02). Genomic imbalances are associated with H. pylori independency in t(11;18) negative gastric MALT lymphomas. They may thus play an important role in the development of H. pylori independency.
Development and preliminary evaluation of a 90 K Axiom® SNP array for the allo-octoploid cultivated strawberry Fragaria × ananassa.

PubMed

Bassil, Nahla V; Davis, Thomas M; Zhang, Hailong; Ficklin, Stephen; Mittmann, Mike; Webster, Teresa; Mahoney, Lise; Wood, David; Alperin, Elisabeth S; Rosyara, Umesh R; Koehorst-Vanc Putten, Herma; Monfort, Amparo; Sargent, Daniel J; Amaya, Iraida; Denoyes, Beatrice; Bianco, Luca; van Dijk, Thijs; Pirani, Ali; Iezzoni, Amy; Main, Dorrie; Peace, Cameron; Yang, Yilong; Whitaker, Vance; Verma, Sujeet; Bellon, Laurent; Brew, Fiona; Herrera, Raul; van de Weg, Eric

2015-03-07

A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria × ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca 'Hawaii 4' reference genome to identify single nucleotide polymorphisms (SNPs) and indels for incorporation into a 90 K Affymetrix® Axiom® array. We report the development and preliminary evaluation of this array. About 36 million sequence variants were identified in a 19 member, octoploid germplasm panel. Strategies and filtering pipelines were developed to identify and incorporate markers of several types: di-allelic SNPs (66.6%), multi-allelic SNPs (1.8%), indels (10.1%), and ploidy-reducing "haploSNPs" (11.7%). The remaining SNPs included those discovered in the diploid progenitor F. iinumae (3.9%), and speculative "codon-based" SNPs (5.9%). In genotyping 306 octoploid accessions, SNPs were assigned to six classes with Affymetrix's "SNPolisher" R package. The highest quality classes, PolyHigh Resolution (PHR), No Minor Homozygote (NMH), and Off-Target Variant (OTV) comprised 25%, 38%, and 1% of array markers, respectively. These markers were suitable for genetic studies as demonstrated in the full-sib family 'Holiday' × 'Korona' with the generation of a genetic linkage map consisting of 6,594 PHR SNPs evenly distributed across 28 chromosomes with an average density of approximately one marker per 0.5 cM, thus exceeding our goal of one marker per cM. The Affymetrix IStraw90 Axiom array is the first high-throughput genotyping platform for cultivated strawberry and is commercially available to the worldwide scientific community. The array's high success rate is likely driven by the presence of naturally occurring variation in ploidy level within the nominally octoploid genome, and by effectiveness of the employed array design and ploidy-reducing strategies. 
This array enables genetic analyses including generation of high-density linkage maps, identification of quantitative trait loci for economically important traits, and genome-wide association studies, thus providing a basis for marker-assisted breeding in this high value crop.
Small Deletion Variants Have Stable Breakpoints Commonly Associated with Alu Elements

PubMed Central

Coin, Lachlan J. M.; Steinfeld, Israel; Yakhini, Zohar; Sladek, Rob; Froguel, Philippe; Blakemore, Alexandra I. F.

2008-01-01

Copy number variants (CNVs) contribute significantly to human genomic variation, with over 5000 loci reported, covering more than 18% of the euchromatic human genome. Little is known, however, about the origin and stability of variants of different size and complexity. We investigated the breakpoints of 20 small, common deletions, representing a subset of those originally identified by array CGH, using Agilent microarrays, in 50 healthy French Caucasian subjects. By sequencing PCR products amplified using primers designed to span the deleted regions, we determined the exact size and genomic position of the deletions in all affected samples. For each deletion studied, all individuals carrying the deletion share identical upstream and downstream breakpoints at the sequence level, suggesting that the deletion event occurred just once and later became common in the population. This is supported by linkage disequilibrium (LD) analysis, which has revealed that most of the deletions studied are in moderate to strong LD with surrounding SNPs, and have conserved long-range haplotypes. Analysis of the sequences flanking the deletion breakpoints revealed an enrichment of microhomology at the breakpoint junctions. More significantly, we found an enrichment of Alu repeat elements, the overwhelming majority of which intersected deletion breakpoints at their poly-A tails. We found no enrichment of LINE elements or segmental duplications, in contrast to other reports. Sequence analysis revealed enrichment of a conserved motif in the sequences surrounding the deletion breakpoints, although whether this motif has any mechanistic role in the formation of some deletions has yet to be determined. Considered together with existing information on more complex inherited variant regions, and reports of de novo variants associated with autism, these data support the presence of different subgroups of CNV in the genome which may have originated through different mechanisms. PMID:18769679
Transcriptome Profiling of In-Vivo Produced Bovine Pre-implantation Embryos Using Two-color Microarray Platform.

PubMed

Salehi, Reza; Tsoi, Stephen C M; Colazo, Marcos G; Ambrose, Divakar J; Robert, Claude; Dyck, Michael K

2017-01-30

Early embryonic loss is a large contributor to infertility in cattle. Moreover, cattle are an interesting model for studying human preimplantation embryo development because the developmental processes are similar. Although genetic factors are known to affect early embryonic development, the discovery of such factors has been a serious challenge. Microarray technology allows quantitative measurement and profiling of transcript levels on a genome-wide basis. One of the main decisions to be made when planning a microarray experiment is whether to use a one- or two-color approach. A two-color design increases technical replication, minimizes variability, improves sensitivity and accuracy, and permits loop designs and the definition of common reference samples. Although the microarray is a powerful biological tool, there are potential pitfalls that can attenuate its power. Hence, in this technical paper we demonstrate an optimized protocol for RNA extraction, amplification, labeling, hybridization of the labeled amplified RNA to the array, array scanning and data analysis using the two-color analysis strategy.
High-throughput multiplex HLA-typing by ligase detection reaction (LDR) and universal array (UA) approach.

PubMed

Consolandi, Clarissa

2009-01-01

One major goal of genetic research is to understand the role of genetic variation in living systems. In humans, by far the most common type of such variation involves differences in single DNA nucleotides, and is thus termed single nucleotide polymorphism (SNP). The need for improvement in throughput and reliability of traditional techniques makes it necessary to develop new technologies. Thus the past few years have witnessed an extraordinary surge of interest in DNA microarray technology. This new technology offers the first great hope for providing a systematic way to explore the genome. It permits a very rapid analysis of thousands of genes for the purpose of gene discovery, sequencing, mapping, expression, and polymorphism detection. We generated a series of analytical tools to address the manufacturing, detection and data analysis components of a microarray experiment. In particular, we set up a universal array approach in combination with a PCR-LDR (polymerase chain reaction-ligation detection reaction) strategy for allele identification in the HLA gene.
Micro-ultrasound for preclinical imaging

PubMed Central

Foster, F. Stuart; Hossack, John; Adamson, S. Lee

2011-01-01

Over the past decade, non-invasive preclinical imaging has emerged as an important tool to facilitate biomedical discovery. Not only have the markets for these tools accelerated, but the numbers of peer-reviewed papers in which imaging end points and biomarkers have been used have grown dramatically. High frequency ‘micro-ultrasound’ has steadily evolved in the post-genomic era as a rapid, comparatively inexpensive imaging tool for studying normal development and models of human disease in small animals. One of the fundamental barriers to this development was the technological hurdle associated with high-frequency array transducers. Recently, new approaches have enabled the upper limits of linear and phased arrays to be pushed from about 20 to over 50 MHz enabling a broad range of new applications. The innovations leading to the new transducer technology and scanner architecture are reviewed. Applications of preclinical micro-ultrasound are explored for developmental biology, cancer, and cardiovascular disease. With respect to the future, the latest developments in high-frequency ultrasound imaging are described. PMID:22866232
A case of 3q29 microdeletion syndrome involving oral cleft inherited from a non-affected mosaic parent: molecular analysis and ethical implications

PubMed Central

Petrin, Aline L.; Daack-Hirsch, Sandra; L’Heureux, Jamie; Murray, Jeffrey C

2010-01-01

Objective: The objective of this study was to use array-CGH to detect causal microdeletions in samples of subjects with cleft lip and palate. Subjects: We analyzed DNA samples from a male patient and parents who were seen during surgical screening for an Operation Smile medical mission in the Philippines. Method: We used Affymetrix Genome Wide Human SNP Array 6.0 followed by sequencing and quantitative PCR using SYBR Green I dye. Results: We report the second case of 3q29 microdeletion syndrome including cleft lip with or without cleft palate and the first case of this microdeletion syndrome inherited from a phenotypically normal mosaic parent. Conclusions: Our findings confirm the utility of aCGH to detect causal microdeletions; indicate that parental somatic mosaicism should be considered in healthy parents for genetic counseling of the families; and discuss important ethical implications of sharing health impact results from research studies with the participant families. PMID:20500065
Large-scale image-based profiling of single-cell phenotypes in arrayed CRISPR-Cas9 gene perturbation screens.

PubMed

de Groot, Reinoud; Lüthi, Joel; Lindsay, Helen; Holtackers, René; Pelkmans, Lucas

2018-01-23

High-content imaging using automated microscopy and computer vision allows multivariate profiling of single-cell phenotypes. Here, we present methods for the application of the CRISPR-Cas9 system in large-scale, image-based, gene perturbation experiments. We show that CRISPR-Cas9-mediated gene perturbation can be achieved in human tissue culture cells in a timeframe that is compatible with image-based phenotyping. We developed a pipeline to construct a large-scale arrayed library of 2,281 sequence-verified CRISPR-Cas9 targeting plasmids and profiled this library for genes affecting cellular morphology and the subcellular localization of components of the nuclear pore complex (NPC). We conceived a machine-learning method that harnesses genetic heterogeneity to score gene perturbations and identify phenotypically perturbed cells for in-depth characterization of gene perturbation effects. This approach enables genome-scale image-based multivariate gene perturbation profiling using CRISPR-Cas9. © 2018 The Authors. Published under the terms of the CC BY 4.0 license.
Genomics Analogy Model for Educators (GAME): VELCRO® Analogy Model to Enable the Learning of DNA Arrays for Visually Impaired and Blind Students

ERIC Educational Resources Information Center

Bello, Julia; Butler, Charles; Radavich, Rosanne; York, Alan; Oseto, Christian; Orvis, Kathryn; Pittendrigh, Barry R.

2007-01-01

Although members of the general public have often heard of the terms "genetic engineering" and, more recently, genomics, they typically have little to no knowledge about these topics, and in some cases are confused about basic concepts in these areas. There is currently a need for teaching models to explain concepts behind genomics.…

A comprehensive molecular cytogenetic analysis of chromosome rearrangements in gibbons

PubMed Central

Capozzi, Oronzo; Carbone, Lucia; Stanyon, Roscoe R.; Marra, Annamaria; Yang, Fengtang; Whelan, Christopher W.; de Jong, Pieter J.; Rocchi, Mariano; Archidiacono, Nicoletta

2012-01-01

Chromosome rearrangements in small apes are up to 20 times more frequent than in most mammals. Because of their complexity, the full extent of chromosome evolution in these hominoids is not yet fully documented. However, previous work with array painting, BAC-FISH, and selective sequencing in two of the four karyomorphs has shown that high-resolution methods can precisely define chromosome breakpoints and map the complex flow of evolutionary chromosome rearrangements. Here we use these tools to precisely define the rearrangements that have occurred in the remaining two karyomorphs, genera Symphalangus (2n = 50) and Hoolock (2n = 38). This research provides the most comprehensive insight into the evolutionary origins of the chromosome rearrangements involved in transforming the small ape genome. Bioinformatics analyses of the human–gibbon synteny breakpoints revealed association with transposable elements and segmental duplications, providing some insight into the mechanisms that might have promoted rearrangements in small apes. In the near future, the comparison of gibbon genome sequences will provide novel insights to test hypotheses concerning the mechanisms of chromosome evolution. The precise definition of synteny block boundaries and orientation, chromosomal fusions, and centromere repositioning events presented here will facilitate genome sequence assembly for these close relatives of humans. PMID:22892276
Prioritizing causal disease genes using unbiased genomic features.

PubMed

Deo, Rahul C; Musso, Gabriel; Tasan, Murat; Tang, Paul; Poon, Annie; Yuan, Christiana; Felix, Janine F; Vasan, Ramachandran S; Beroukhim, Rameen; De Marco, Teresa; Kwok, Pui-Yan; MacRae, Calum A; Roth, Frederick P

2014-12-03

Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits. To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy or DCM. Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM. Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature.
aCGH Analysis to Estimate Genetic Variations among Domesticated Chickens

PubMed Central

Lin, Mengjie

2016-01-01

Chickens have been familiar to humans since ancient times and have been used not only for culinary purposes but also for cultural purposes including ritual ceremonies and traditional entertainment. The various chicken breeds developed for these purposes often display distinct morphological and/or behavioural traits. For example, the Japanese Shamo is larger and more aggressive than other domesticated chickens, reflecting its role as a fighting cock breed, whereas Japanese Naganakidori breeds, which have long-crowing behaviour, were bred instead for their entertaining and aesthetic qualities. However, the genetic backgrounds of these distinct morphological and behavioural traits remain unclear. Therefore, the question arises as to which genomic regions in these chickens were acted upon by selective pressures through breeding. We compared the entire genomes of six chicken breeds domesticated for various cultural purposes by utilizing array comparative genomic hybridization. From these analyses, we identified 782 regions that underwent insertions, deletions, or mutations, representing man-made selection pressure in these chickens. Furthermore, we found that a number of genes diversified in domesticated chickens bred for cultural or entertainment purposes were different from those diversified in chickens bred for food, such as broilers and layers. PMID:27525263
Assessing genome-wide copy number variation in the Han Chinese population.

PubMed

Lu, Jianqi; Lou, Haiyi; Fu, Ruiqing; Lu, Dongsheng; Zhang, Feng; Wu, Zhendong; Zhang, Xi; Li, Changhua; Fang, Baijun; Pu, Fangfang; Wei, Jingning; Wei, Qian; Zhang, Chao; Wang, Xiaoji; Lu, Yan; Yan, Shi; Yang, Yajun; Jin, Li; Xu, Shuhua

2017-10-01

Copy number variation (CNV) is a valuable source of genetic diversity in the human genome and a well-recognised cause of various genetic diseases. However, CNVs have been considerably under-represented in population-based studies, particularly in the Han Chinese, which is the largest ethnic group in the world. We aimed to build a representative CNV map for the Han Chinese population. We conducted a genome-wide CNV study involving 451 male Han Chinese samples from 11 geographical regions encompassing 28 dialect groups, representing a less-biased panel compared with the currently available data. We detected CNVs by using a 4.2M NimbleGen comparative genomic hybridisation array and whole-genome deep sequencing of 51 samples to optimise the filtering conditions in CNV discovery. A comprehensive Han Chinese CNV map was built based on a set of high-quality variants (positive predictive value >0.8, with sizes ranging from 369 bp to 4.16 Mb and a median of 5907 bp). The map consists of 4012 CNV regions (CNVRs), and more than half are novel to the 30 East Asian CNV Project and the 1000 Genomes Project Phase 3. We further identified 81 CNVRs specific to regional groups, which was indicative of the subpopulation structure within the Han Chinese population. Our data are complementary to public data sources, and the CNV map may facilitate the identification of pathogenic CNVs and further biomedical research studies involving the Han Chinese population. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
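The positive predictive value threshold quoted in this abstract can be illustrated with a minimal sketch. It assumes that PPV is computed as the fraction of array CNV calls that are confirmed by sequencing; the abstract does not spell out the exact procedure, and the function names and call identifiers below are hypothetical.

```python
def positive_predictive_value(array_calls, sequencing_confirmed):
    """Fraction of array CNV calls that are also confirmed by sequencing.

    Both arguments are collections of hypothetical CNV identifiers.
    """
    calls = set(array_calls)
    if not calls:
        raise ValueError("no calls to evaluate")
    confirmed = calls & set(sequencing_confirmed)
    return len(confirmed) / len(calls)


def passes_ppv_filter(array_calls, sequencing_confirmed, min_ppv=0.8):
    """True when the call set exceeds the PPV threshold (strictly greater)."""
    return positive_predictive_value(array_calls, sequencing_confirmed) > min_ppv


# Hypothetical identifiers, not data from the study.
calls = ["cnv1", "cnv2", "cnv3", "cnv4", "cnv5"]
confirmed = ["cnv1", "cnv2", "cnv3", "cnv4", "cnv9"]
print(positive_predictive_value(calls, confirmed))
print(passes_ppv_filter(calls, confirmed))
```

In this toy example four of five calls are confirmed, so the set sits exactly at 0.8 and does not pass a strict ">0.8" filter.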
Chemical and radiation mutagenesis: Induction and detection by whole genome sequencing

USDA-ARS's Scientific Manuscript database

Brachypodium distachyon has emerged as an effective model system to address fundamental questions in grass biology. With its small sequenced genome, short generation time and rapidly expanding array of genetic tools B. distachyon is an ideal system to elucidate the molecular basis of important trai...
Genome, transcriptome, and functional analyses of Penicillium expansum provide new insights into secondary metabolism and pathogenicity

USDA-ARS's Scientific Manuscript database

The relationship between secondary metabolism and infection in pathogenic fungi has remained largely elusive. Penicillium comprises a group of plant pathogens with varying host specificities and with the ability to produce a wide array of secondary metabolites. The genomes of three Penicillium exp...
Increasing feed efficiency and reducing methane emissions using genomics: An international approach

USDA-ARS's Scientific Manuscript database

Genomic technology (including SNP arrays and next-generation sequencing) is a powerful driver for the genetic improvement of livestock. Phenotype recording can now, to an extent, be partitioned from selection, and even limited to several thousand animals. Rapid development of new technologies and pr...
Aquaculture genomics, genetics and breeding in the United States: Current status, challenges, and priorities for future research

USDA-ARS's Scientific Manuscript database

Advancing the production efficiency and profitability of aquaculture is dependent upon the ability to utilize a diverse array of genetic resources. The ultimate goals of aquaculture genomics, genetics and breeding research are to enhance aquaculture production efficiency, sustainability, product qua...
Linkage Disequilibrium And Genome-Wide Association Studies In O. sativa

USDA-ARS's Scientific Manuscript database

There is increasing evidence that genome-wide association studies provide a powerful approach to find the genetic basis of complex phenotypic variation in all kinds of species. For this purpose, we developed the first generation 44K Affymetrix SNP array in rice (see Tung et al. poster). We genotyped...
APPLICATION OF DNA MICROARRAYS TO REPRODUCTIVE TOXICOLOGY AND THE DEVELOPMENT OF A TESTIS ARRAY

EPA Science Inventory

With the advent of sequence information for entire mammalian genomes, it is now possible to analyze gene expression and gene polymorphisms on a genomic scale. The primary tool for analysis of gene expression is the DNA microarray. We have used commercially available cDNA micro...
Genomic Expression Patterns in Menstrually-Related Migraine in Adolescents

PubMed Central

Hershey, Andrew; Horn, Paul; Kabbouche, Marielle; O'Brien, Hope; Powers, Scott

2011-01-01

Background: Exacerbation of migraine with menses is common in adolescent girls and women with migraine, occurring in up to 60% of females with migraine. These migraines are oftentimes longer and more disabling and may be related to estrogen levels and hormonal fluctuations. Objective: This study identifies the unique genomic expression pattern of menstrually-related migraine (MRM) in comparison to migraine occurring outside the menstrual period and headache free controls. Methods: Whole blood samples were obtained from female subjects having an acute migraine during their menstrual period (MRM) or outside of their menstrual period (nonMRM) and controls (C) – females having a menstrual period without any history of headache. The mRNA was isolated from these samples and the genomic profile was assessed. Affymetrix Human Exon ST 1.0 arrays were used to examine the genomic expression pattern differences between these three groups. Results: Blood genomic expression patterns were obtained on 56 subjects (MRM = 18, nonMRM = 18 and C = 20). Unique genomic expression patterns were observed for both MRM and nonMRM. For MRM, 77 genes were identified that were unique to MRM, while 61 genes were commonly expressed for MRM and nonMRM and 127 genes appeared to have a unique expression pattern for nonMRM. In addition, there were 279 genes that were differentially expressed for MRM compared to nonMRM that were not differentially expressed for nonMRM. Gene ontology of these samples indicated many of these groups of genes were functionally related and included categories of immunomodulation/inflammation, mitochondrial function and DNA homeostasis. Conclusions: Blood genomic patterns can accurately differentiate MRM from nonMRM. These results indicate that MRM involves a unique molecular biology pathway that can be identified with a specific biomarker and suggest that individuals with MRM have a different underlying genetic etiology. PMID:22220971
A 16 kb naturally occurring genomic deletion including mce and PPE genes in Mycobacterium avium subspecies paratuberculosis isolates from goats with Johne's disease.

PubMed

Castellanos, Elena; Aranaz, Alicia; de Juan, Lucia; Dominguez, Lucas; Linedale, Richard; Bull, Tim J

2012-09-14

In this study we characterise the genomic and transcriptomic variability of a natural deletion strain of Mycobacterium avium subspecies paratuberculosis (MAP) prevalent in Spanish Guadarrama goats. Using a pan-genome microarray including MAP and M. avium subspecies hominissuis 104 genomes (MAPAC) we demonstrate the genotype to be MAP Type II with a single deletion of 19 contiguous ORFs (16 kb) including a complete mammalian cell entry (mce7_1) operon and adjacent proline-glutamic acid (PE)/proline-proline-glutamic acid (PPE) genes. A deletion specific PCR test was developed and a subsequent screening identified four goat herds infected with the variant strain. Each was located in central Spain and showed epidemiological links suggestive of transmission between herds. A majority of animals infected with the variant manifested a paucibacillary form of the disease. Comparisons between virulent complete genome complement strains isolated from multibacillary diseased goats and the MAP variant strain during entry into activated macrophages demonstrated an increased sensitivity in the variant to intracellular killing in human and ovine macrophages. As PPE and mce genes are associated with mycobacterial virulence and pathogenesis we investigated the interplay of these gene sets during cell entry using the MAPAC array. This showed significant differential transcriptome profiles compared to full genome complement MAP controls that included changes in other undeleted mce operons and PE/PPE genes, esx-like signalling operons and stress response/fatty acid metabolism pathways. This strain represents the first report of a MAP Type II genotype with significant natural genomic deletions which remains able to cause disease and is transmissible in goats. Copyright © 2012 Elsevier B.V. All rights reserved.
Genome-wide DNA methylation measurements in prostate tissues uncovers novel prostate cancer diagnostic biomarkers and transcription factor binding patterns.

PubMed

Kirby, Marie K; Ramaker, Ryne C; Roberts, Brian S; Lasseigne, Brittany N; Gunther, David S; Burwell, Todd C; Davis, Nicholas S; Gulzar, Zulfiqar G; Absher, Devin M; Cooper, Sara J; Brooks, James D; Myers, Richard M

2017-04-17

Current diagnostic tools for prostate cancer lack specificity and sensitivity for detecting very early lesions. DNA methylation is a stable genomic modification that is detectable in peripheral patient fluids such as urine and blood plasma that could serve as a non-invasive diagnostic biomarker for prostate cancer. We measured genome-wide DNA methylation patterns in 73 clinically annotated fresh-frozen prostate cancers and 63 benign-adjacent prostate tissues using the Illumina Infinium HumanMethylation450 BeadChip array. We overlaid the most significantly differentially methylated sites in the genome with transcription factor binding sites measured by the Encyclopedia of DNA Elements consortium. We used logistic regression and receiver operating characteristic curves to assess the performance of candidate diagnostic models. We identified methylation patterns that have a high predictive power for distinguishing malignant prostate tissue from benign-adjacent prostate tissue, and these methylation signatures were validated using data from The Cancer Genome Atlas Project. Furthermore, by overlaying ENCODE transcription factor binding data, we observed an enrichment of enhancer of zeste homolog 2 binding in gene regulatory regions with higher DNA methylation in malignant prostate tissues. DNA methylation patterns are greatly altered in prostate cancer tissue in comparison to benign-adjacent tissue. We have discovered patterns of DNA methylation marks that can distinguish prostate cancers with high specificity and sensitivity in multiple patient tissue cohorts, and we have identified transcription factors binding in these differentially methylated regions that may play important roles in prostate cancer development.
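The receiver operating characteristic evaluation mentioned in this abstract can be sketched in a few lines. This is an illustrative example only: it assumes each tissue sample has a model score (for instance a logistic regression probability) and a binary label, the scores are made up, and the function names are hypothetical rather than the authors' pipeline.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores: label 1 = malignant, label 0 = benign-adjacent.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))
print(sensitivity_specificity(scores, labels, threshold=0.5))
```

The pairwise form is quadratic in sample count, which is acceptable for cohorts of this size (73 cancers and 63 benign-adjacent tissues) and keeps the definition of AUC explicit.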
High-resolution analysis of copy number variants in adults with simple-to-moderate congenital heart disease.

PubMed

Zhao, Wei; Niu, Guannan; Shen, Botao; Zheng, Yang; Gong, Fangchao; Wang, Xianfu; Lee, Jiyun; Mulvihill, John J; Chen, Xiaohui; Li, Shibo

2013-12-01

As patients with congenital heart disease (CHD) increasingly survive to childbearing age, it becomes important to understand the genetic origins of CHD. In children, CHD is frequently caused by chromosomal imbalances. We searched for submicroscopic imbalances in adults with CHD focusing on simple-to-moderate phenotypes, without associated dysmorphic features, a group not previously examined. A total of 100 Han Chinese adults with a diverse range of isolated CHD and 65 ethnically matched controls were screened using whole-genome array comparative genomic hybridization. Forty-five large (>100 kb) rare copy number variants (CNVs) were identified in 36/100 patients. These variants were not listed in the Database of Genomic Variants nor found in controls. In three of these genomic imbalances (22q11.2, 18q23, 3q21.3), genes that play an important role in cardiac development were implicated, including CRKL, NFATC1, and PLXNA1; the latter has not been associated with human CHD before. This study detected a 0.7 Mb 22q11.2 deletion, which marginally overlapped the common 3 Mb 22q11.2 deletion, in one patient with a perimembranous ventricular septal defect without any extracardiac manifestation. Furthermore, we detected a novel inherited aberration dup (16q23.1). Although a causal relationship with CHD remains to be established, this CNV profile provides a spectrum of genomic imbalances in this condition, and improves the CNV-phenotype correlations. © 2013 Wiley Periodicals, Inc.
CIDR

Science.gov Websites

Consortium Developed Arrays Infinium Human Drug Core Array The Illumina Infinium DrugDev Consortium array drug target discovery, validation and treatment response. Detailed Information on Array Infinium Human
Array-CGH analysis in Rwandan patients presenting development delay/intellectual disability with multiple congenital anomalies.

PubMed

Uwineza, Annette; Caberg, Jean-Hubert; Hitayezu, Janvier; Hellin, Anne Cecile; Jamar, Mauricette; Dideberg, Vinciane; Rusingiza, Emmanuel K; Bours, Vincent; Mutesa, Leon

2014-07-12

Array-CGH is considered the first-tier investigation for identifying copy number variations. Currently, no data are available on the genetic etiology of patients with development delay/intellectual disability and congenital malformation in East Africa. Array comparative genomic hybridization was performed in 50 Rwandan patients with development delay/intellectual disability and multiple congenital abnormalities, using Agilent's 180 K microarray platform. Fourteen patients (28%) had a global development delay whereas 36 (72%) patients presented intellectual disability. All patients presented multiple congenital abnormalities. Clinically significant copy number variations were found in 13 patients (26%). Size of CNVs ranged from 0.9 Mb to 34 Mb. Six patients had CNVs associated with known syndromes, whereas 7 patients presented rare genomic imbalances. This study showed that CNVs are present in the African population and shows the importance of implementing genetic testing in East African countries.
Deep ancestry of mammalian X chromosome revealed by comparison with the basal tetrapod Xenopus tropicalis.

PubMed

Mácha, Jaroslav; Teichmanová, Radka; Sater, Amy K; Wells, Dan E; Tlapáková, Tereza; Zimmerman, Lyle B; Krylov, Vladimír

2012-07-16

The X and Y sex chromosomes are conspicuous features of placental mammal genomes. Mammalian sex chromosomes arose from an ordinary pair of autosomes after the proto-Y acquired a male-determining gene and degenerated due to suppression of X-Y recombination. Analysis of earlier steps in X chromosome evolution has been hampered by the long interval between the origins of teleost and amniote lineages as well as scarcity of X chromosome orthologs in incomplete avian genome assemblies. This study clarifies the genesis and remodelling of the Eutherian X chromosome by using a combination of sequence analysis, meiotic map information, and cytogenetic localization to compare amniote genome organization with that of the amphibian Xenopus tropicalis. Nearly all orthologs of human X genes localize to X. tropicalis chromosomes 2 and 8, consistent with an ancestral X-conserved region and a single X-added region precursor. This finding contradicts a previous hypothesis of three evolutionary strata in this region. Homologies between human, opossum, chicken and frog chromosomes suggest a single X-added region predecessor in therian mammals, corresponding to opossum chromosomes 4 and 7. A more ancient X-added ancestral region, currently extant as a major part of chicken chromosome 1, is likely to have been present in the progenitor of synapsids and sauropsids. Analysis of X chromosome gene content emphasizes conservation of single protein coding genes and the role of tandem arrays in formation of novel genes. Chromosomal regions orthologous to Therian X chromosomes have been located in the genome of the frog X. tropicalis. These X chromosome ancestral components experienced a series of fusion and breakage events to give rise to avian autosomes and mammalian sex chromosomes. The early branching tetrapod X. tropicalis' simple diploid genome and robust synteny to amniotes greatly enhances studies of vertebrate chromosome evolution.
Deep ancestry of mammalian X chromosome revealed by comparison with the basal tetrapod Xenopus tropicalis

PubMed Central

2012-01-01

Background The X and Y sex chromosomes are conspicuous features of placental mammal genomes. Mammalian sex chromosomes arose from an ordinary pair of autosomes after the proto-Y acquired a male-determining gene and degenerated due to suppression of X-Y recombination. Analysis of earlier steps in X chromosome evolution has been hampered by the long interval between the origins of teleost and amniote lineages as well as scarcity of X chromosome orthologs in incomplete avian genome assemblies. Results This study clarifies the genesis and remodelling of the Eutherian X chromosome by using a combination of sequence analysis, meiotic map information, and cytogenetic localization to compare amniote genome organization with that of the amphibian Xenopus tropicalis. Nearly all orthologs of human X genes localize to X. tropicalis chromosomes 2 and 8, consistent with an ancestral X-conserved region and a single X-added region precursor. This finding contradicts a previous hypothesis of three evolutionary strata in this region. Homologies between human, opossum, chicken and frog chromosomes suggest a single X-added region predecessor in therian mammals, corresponding to opossum chromosomes 4 and 7. A more ancient X-added ancestral region, currently extant as a major part of chicken chromosome 1, is likely to have been present in the progenitor of synapsids and sauropsids. Analysis of X chromosome gene content emphasizes conservation of single protein coding genes and the role of tandem arrays in formation of novel genes. Conclusions Chromosomal regions orthologous to Therian X chromosomes have been located in the genome of the frog X. tropicalis. These X chromosome ancestral components experienced a series of fusion and breakage events to give rise to avian autosomes and mammalian sex chromosomes. The early branching tetrapod X. tropicalis’ simple diploid genome and robust synteny to amniotes greatly enhances studies of vertebrate chromosome evolution. PMID:22800176
High throughput SNP discovery and genotyping in hexaploid wheat

PubMed Central

Navarro, Julien; Kitt, Jonathan; Choulet, Frédéric; Leveugle, Magalie; Duarte, Jorge; Rivière, Nathalie; Eversole, Kellye; Le Gouis, Jacques; Davassi, Alessandro; Balfourier, François; Le Paslier, Marie-Christine; Berard, Aurélie; Brunel, Dominique; Feuillet, Catherine; Poncet, Charles; Sourdille, Pierre

2018-01-01

Because of their abundance and their amenability to high-throughput genotyping techniques, Single Nucleotide Polymorphisms (SNPs) are powerful tools for efficient genetics and genomics studies, including characterization of genetic resources, genome-wide association studies and genomic selection. In wheat, most of the previous SNP discovery initiatives targeted the coding fraction, leaving almost 98% of the wheat genome largely unexploited. Here we report on the use of whole-genome resequencing data from eight wheat lines to mine for SNPs in the genic, the repetitive and non-repetitive intergenic fractions of the wheat genome. Eventually, we identified 3.3 million SNPs, 49% being located on the B-genome, 41% on the A-genome and 10% on the D-genome. We also describe the development of the TaBW280K high-throughput genotyping array containing 280,226 SNPs. Performance of this chip was examined by genotyping a set of 96 wheat accessions representing the worldwide diversity. Sixty-nine percent of the SNPs can be efficiently scored, half of them showing a diploid-like clustering. The TaBW280K was proven to be a very efficient tool for diversity analyses, as well as for breeding as it can discriminate between closely related elite varieties. Finally, the TaBW280K array was used to genotype a population derived from a cross between Chinese Spring and Renan, leading to the construction of a dense genetic map comprising 83,721 markers. The results described here will provide the wheat community with powerful tools for both basic and applied research. PMID:29293495
A user-friendly workflow for analysis of Illumina gene expression bead array data available at the arrayanalysis.org portal.

PubMed

Eijssen, Lars M T; Goelela, Varshna S; Kelder, Thomas; Adriaens, Michiel E; Evelo, Chris T; Radonjic, Marijana

2015-06-30

Illumina whole-genome expression bead arrays are a widely used platform for transcriptomics. Most of the tools available for the analysis of the resulting data are not easily applicable by less experienced users. ArrayAnalysis.org provides researchers with an easy-to-use and comprehensive interface to the functionality of R and Bioconductor packages for microarray data analysis. As a modular open source project, it allows developers to contribute modules that provide support for additional types of data or extend workflows. To enable data analysis of Illumina bead arrays for a broad user community, we have developed a module for ArrayAnalysis.org that provides a free and user-friendly web interface for quality control and pre-processing for these arrays. This module can be used together with existing modules for statistical and pathway analysis to provide a full workflow for Illumina gene expression data analysis. The module accepts data exported from Illumina's GenomeStudio, and provides the user with quality control plots and normalized data. The outputs are directly linked to the existing statistics module of ArrayAnalysis.org, but can also be downloaded for further downstream analysis in third-party tools. The Illumina bead arrays analysis module is available at http://www.arrayanalysis.org . A user guide, a tutorial demonstrating the analysis of an example dataset, and R scripts are available. The module can be used as a starting point for statistical evaluation and pathway analysis provided on the website or to generate processed input data for a broad range of applications in life sciences research.

Genetic Dosage Compensation in a Family with Velo-cardio-facial/DiGeorge/22q11.2 Deletion Syndrome

PubMed Central

Alkalay, Avishai A.; Guo, Tingwei; Montagna, Cristina; Digilio, M. Cristina; Marino, Bruno; Dallapiccola, Bruno; Morrow, Bernice

2014-01-01

Cytogenetic studies of a male child carrying the 22q11.2 deletion common in patients with velo-cardio-facial/DiGeorge syndrome revealed an unexpected rearrangement of the 22q11.2 region in his normal appearing mother. The mother carries a 3 Mb deletion on one copy and a reciprocal, similar sized duplication on the other copy of chromosome 22q11.2 as revealed by fluorescence in situ hybridization and array comparative genome hybridization analysis. The most parsimonious mechanism for the rearrangement is a mitotic non-allelic homologous recombination event in a cell in the early embryo soon after fertilization. The normal phenotype of the mother can be explained by the theory of genetic dosage compensation. This is the second documented case of such an event for this or any genomic disorder. This finding helps to reinforce this phenomenon in a human model, and has significant implications for genetic counseling of future children. PMID:21337693
Genome-wide SNP association-based localization of a dwarfism gene in Friesian dwarf horses.

PubMed

Orr, N; Back, W; Gu, J; Leegwater, P; Govindarajan, P; Conroy, J; Ducro, B; Van Arendonk, J A M; MacHugh, D E; Ennis, S; Hill, E W; Brama, P A J

2010-12-01

The recent completion of the horse genome and commercial availability of an equine SNP genotyping array has facilitated the mapping of disease genes. We report putative localization of the gene responsible for dwarfism, a trait in Friesian horses that is thought to have a recessive mode of inheritance, to a 2-Mb region of chromosome 14 using just 10 affected animals and 10 controls. We successfully genotyped 34,429 SNPs that were tested for association with dwarfism using chi-square tests. The most significant SNP in our study, BIEC2-239376 (P(2df) = 4.54 × 10^-5, P(rec) = 7.74 × 10^-6), is located close to a gene implicated in human dwarfism. Fine-mapping and resequencing analyses did not aid in further localization of the causative variant, and replication of our findings in independent sample sets will be necessary to confirm these results. © 2010 The Authors, Journal compilation © 2010 Stichting International Foundation for Animal Genetics.
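As an illustration of the kind of per-SNP chi-square test mentioned in this record, the following minimal sketch computes a Pearson chi-square statistic on a 2 × 3 genotype table (cases versus controls) and its 2-df p-value. The function name `chi_square_2df` and the genotype counts are hypothetical, chosen only for illustration; they are not the study's data or the authors' code. The sketch assumes all three genotype classes are observed, so that the test has 2 degrees of freedom.

```python
import math


def chi_square_2df(cases, controls):
    """Pearson chi-square on a 2 x 3 genotype table.

    cases, controls: counts of the three genotypes (e.g. AA, AB, BB).
    Returns (statistic, p_value), assuming 2 degrees of freedom.
    """
    table = [cases, controls]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected

    # With 2 degrees of freedom, the chi-square survival function
    # has the closed form exp(-x / 2), so no statistics library is needed.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical genotype counts (AA, AB, BB) for 10 cases and 10 controls.
stat, p = chi_square_2df([0, 1, 9], [6, 4, 0])
print(f"chi-square = {stat:.2f}, p = {p:.2e}")
```

With only 10 animals per group the expected cell counts are small, so in practice an exact test might be preferred; the closed-form p-value is used here only to keep the sketch self-contained.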
Exome chip meta-analysis identifies novel loci and East Asian-specific coding variants that contribute to lipid levels and coronary artery disease.

PubMed

Lu, Xiangfeng; Peloso, Gina M; Liu, Dajiang J; Wu, Ying; Zhang, He; Zhou, Wei; Li, Jun; Tang, Clara Sze-Man; Dorajoo, Rajkumar; Li, Huaixing; Long, Jirong; Guo, Xiuqing; Xu, Ming; Spracklen, Cassandra N; Chen, Yang; Liu, Xuezhen; Zhang, Yan; Khor, Chiea Chuen; Liu, Jianjun; Sun, Liang; Wang, Laiyuan; Gao, Yu-Tang; Hu, Yao; Yu, Kuai; Wang, Yiqin; Cheung, Chloe Yu Yan; Wang, Feijie; Huang, Jianfeng; Fan, Qiao; Cai, Qiuyin; Chen, Shufeng; Shi, Jinxiu; Yang, Xueli; Zhao, Wanting; Sheu, Wayne H-H; Cherny, Stacey Shawn; He, Meian; Feranil, Alan B; Adair, Linda S; Gordon-Larsen, Penny; Du, Shufa; Varma, Rohit; Chen, Yii-Der Ida; Shu, Xiao-Ou; Lam, Karen Siu Ling; Wong, Tien Yin; Ganesh, Santhi K; Mo, Zengnan; Hveem, Kristian; Fritsche, Lars G; Nielsen, Jonas Bille; Tse, Hung-Fat; Huo, Yong; Cheng, Ching-Yu; Chen, Y Eugene; Zheng, Wei; Tai, E Shyong; Gao, Wei; Lin, Xu; Huang, Wei; Abecasis, Goncalo; Kathiresan, Sekar; Mohlke, Karen L; Wu, Tangchun; Sham, Pak Chung; Gu, Dongfeng; Willer, Cristen J

2017-12-01

Most genome-wide association studies have been of European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we used an exome array to examine protein-coding genetic variants in 47,532 East Asian individuals. We identified 255 variants at 41 loci that reached chip-wide significance, including 3 novel loci and 14 East Asian-specific coding variant associations. After a meta-analysis including >300,000 European samples, we identified an additional nine novel loci. Sixteen genes were identified by protein-altering variants in both East Asians and Europeans, and thus are likely to be functional genes. Our data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci.
Exome chip meta-analysis identifies novel loci and East Asian-specific coding variants contributing to lipid levels and coronary artery disease

PubMed Central

Lu, Xiangfeng; Peloso, Gina M; Liu, Dajiang J.; Wu, Ying; Zhang, He; Zhou, Wei; Li, Jun; Tang, Clara Sze-man; Dorajoo, Rajkumar; Li, Huaixing; Long, Jirong; Guo, Xiuqing; Xu, Ming; Spracklen, Cassandra N.; Chen, Yang; Liu, Xuezhen; Zhang, Yan; Khor, Chiea Chuen; Liu, Jianjun; Sun, Liang; Wang, Laiyuan; Gao, Yu-Tang; Hu, Yao; Yu, Kuai; Wang, Yiqin; Cheung, Chloe Yu Yan; Wang, Feijie; Huang, Jianfeng; Fan, Qiao; Cai, Qiuyin; Chen, Shufeng; Shi, Jinxiu; Yang, Xueli; Zhao, Wanting; Sheu, Wayne H.-H.; Cherny, Stacey Shawn; He, Meian; Feranil, Alan B.; Adair, Linda S.; Gordon-Larsen, Penny; Du, Shufa; Varma, Rohit; da Chen, Yii-Der I; Shu, XiaoOu; Lam, Karen Siu Ling; Wong, Tien Yin; Ganesh, Santhi K.; Mo, Zengnan; Hveem, Kristian; Fritsche, Lars; Nielsen, Jonas Bille; Tse, Hung-fat; Huo, Yong; Cheng, Ching-Yu; Chen, Y. Eugene; Zheng, Wei; Tai, E Shyong; Gao, Wei; Lin, Xu; Huang, Wei; Abecasis, Goncalo; Consortium, GLGC; Kathiresan, Sekar; Mohlke, Karen L.; Wu, Tangchun; Sham, Pak Chung; Gu, Dongfeng; Willer, Cristen J

2017-01-01

Most genome-wide association studies have been conducted in European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we examined protein-coding genetic variants in 47,532 East Asian individuals using an exome array. We identified 255 variants at 41 loci reaching chip-wide significance, including 3 novel loci and 14 East Asian-specific coding variant associations. After meta-analysis with > 300,000 European samples, we identified an additional 9 novel loci. The same 16 genes were identified by the protein-altering variants in both East Asians and Europeans, likely pointing to the functional genes. Our data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population-specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci. PMID:29083407
CNV-seq, a new method to detect copy number variation using high-throughput sequencing.

PubMed

Xie, Chao; Tammi, Martti T

2009-03-06

DNA copy number variation (CNV) has been recognized as an important source of genetic variation. Array comparative genomic hybridization (aCGH) is commonly used for CNV detection, but the microarray platform has a number of inherent limitations. Here, we describe a method to detect copy number variation using shotgun sequencing, CNV-seq. The method is based on a robust statistical model that describes the complete analysis procedure and allows the computation of essential confidence values for detection of CNV. Our results show that the number of reads, not the length of the reads, is the key factor determining the resolution of detection. This favors the next-generation sequencing methods that rapidly produce large amounts of short reads. Simulation of various sequencing methods with coverage between 0.1x and 8x shows overall specificity between 91.7% and 99.9%, and sensitivity between 72.2% and 96.5%. We also show the results for assessment of CNV between two individual human genomes.
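The general idea of read-count-based CNV detection described in this record can be sketched as follows: reads from two genomes are counted in fixed windows, and a depth-normalized log2 ratio is computed per window. This is a simplified illustration only; it does not reproduce the CNV-seq statistical model or its confidence values, and the function names, window size and read positions are hypothetical.

```python
import math
from collections import Counter


def window_counts(read_starts, window_size):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // window_size for pos in read_starts)


def log2_ratios(test_starts, ref_starts, window_size):
    """Depth-normalized log2(test/reference) read-count ratio per window."""
    test = window_counts(test_starts, window_size)
    ref = window_counts(ref_starts, window_size)
    # Rescale test counts so both samples have the same total read number.
    scale = len(ref_starts) / len(test_starts)

    ratios = {}
    for window in sorted(set(test) | set(ref)):
        if test[window] == 0 or ref[window] == 0:
            continue  # ratio is undefined when either count is zero
        ratios[window] = math.log2(test[window] * scale / ref[window])
    return ratios


# Hypothetical read start positions on one chromosome.
test_reads = [10, 20, 30, 40, 110, 120]
ref_reads = [15, 25, 115, 125]
print(log2_ratios(test_reads, ref_reads, window_size=100))
```

Because the ratio depends only on how many reads fall in each window, more reads allow smaller windows at the same noise level, which is consistent with the abstract's point that read number rather than read length drives resolution.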
Defining Genomic Changes in Triple-Negative Breast Cancer in Women of African Descent

DTIC Science & Technology

2012-06-01

Triple negative breast cancer • Ethnic disparities • Breast cancer amongst African Americans and Africans • Gene expression profiling • Array... negative cases seen in both African and African - American breast cancer cases. Gene Expression Array Studies The 31 triple negative Kijabe... African - American Adjacent Normal Breast Tissue PI: Pegram &
Detection and validation of single feature polymorphisms using RNA expression data from a rice genome array

USDA-ARS's Scientific Manuscript database

A large number of genetic variations have been identified in rice. Such variations must in many cases control phenotypic differences in abiotic stress tolerance and other traits. A single feature polymorphism (SFP) is an oligonucleotide array-based polymorphism which can be used for identification o...
[Application of array-based comparative genomic hybridization technique in genetic analysis of patients with spontaneous abortion].

PubMed

Chu, Y; Wu, D; Hou, Q F; Huo, X D; Gao, Y; Wang, T; Wang, H D; Yang, Y L; Liao, S X

2016-08-25

To investigate the value of the array-based comparative genomic hybridization (array-CGH) technique for chromosomal analysis of miscarried embryos, and to provide genetic counseling for couples with spontaneous abortion. A total of 382 patients who had a miscarriage were enrolled in this study. All aborted tissues were analyzed with both conventional cytogenetic karyotyping and array-CGH. All of the 382 specimens were successfully analyzed by array-CGH (100.0%, 382/382), and the detection rate of chromosomal aberrations was 46.6% (178/382). However, conventional karyotype analysis was successfully performed in 281 cases (73.6%, 281/382), and 113 (40.2%, 113/281) were found to have chromosomal aberrations. Of the 178 samples identified by array-CGH, 163 samples (91.6%, 163/178) were aneuploidy cases and 15 samples (8.4%, 15/178) were segmental deletion and (or) duplication cases. Four of 10 cases with small segmental deletion and duplication were validated to be inherited from fathers or mothers who were carriers of submicroscopic reciprocal translocation. Of the 113 abnormal karyotypes found by conventional karyotyping, 108 cases (95.6%, 108/113) were aneuploidy and 5 cases (4.4%, 5/113) had chromosome structural aberrations. Most array-CGH results were consistent with conventional karyotyping, but there were 3 cases of discrepancy, which included 2 cases of triploidy and 1 case of low-level mosaicism undetected by array-CGH. Compared with conventional karyotyping, there is an increased detection rate of chromosomal abnormalities when array-CGH is used to analyse the products of conception, primarily because of its success with nonviable tissues. It could be a first-line method to determine the cause of miscarriage with higher accuracy and sensitivity.
SNPchiMp v.3: integrating and standardizing single nucleotide polymorphism data for livestock species.

PubMed

Nicolazzi, Ezequiel L; Caprera, Andrea; Nazzicari, Nelson; Cozzi, Paolo; Strozzi, Francesco; Lawley, Cindy; Pirani, Ali; Soans, Chandrasen; Brew, Fiona; Jorjani, Hossein; Evans, Gary; Simpson, Barry; Tosser-Klopp, Gwenola; Brauning, Rudiger; Williams, John L; Stella, Alessandra

2015-04-10

In recent years, the use of genomic information in livestock species for genetic improvement, association studies and many other fields has become routine. In order to accommodate different market requirements in terms of genotyping cost, manufacturers of single nucleotide polymorphism (SNP) arrays, private companies and international consortia have developed a large number of arrays with different content and different SNP density. The number of currently available SNP arrays differs among species, ranging from one for goats to more than ten for cattle, and the number of arrays available is increasing rapidly. However, there is limited or no effort to standardize and integrate array-specific (e.g. SNP IDs, allele coding) and species-specific (i.e. past and current assemblies) SNP information. Here we present SNPchiMp v.3, a solution to these issues for the six major livestock species (cow, pig, horse, sheep, goat and chicken). Original data was collected directly from SNP array producers and specific international genome consortia, and stored in a MySQL database. The database was then linked to an open-access web tool and to public databases. SNPchiMp v.3 ensures fast access to the database (retrieving within/across SNP array data) and the possibility of annotating SNP array data in a user-friendly fashion. This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need for additional bioinformatics tools or pipelines. In recognition of the open-access use of Ensembl resources, SNPchiMp v.3 was officially credited as an Ensembl E!mpowered tool. Availability at http://bioinformatics.tecnoparco.org/SNPchimp.
Piecewise polynomial representations of genomic tracks.

PubMed

Tarabichi, Maxime; Detours, Vincent; Konopka, Tomasz

2012-01-01

Genomic data from micro-array and sequencing projects consist of associations of measured values to chromosomal coordinates. These associations can be thought of as functions in one dimension and can thus be stored, analyzed, and interpreted as piecewise-polynomial curves. We present a general framework for building piecewise polynomial representations of genome-scale signals and illustrate some of its applications via examples. We show that piecewise constant segmentation, a typical step in copy-number analyses, can be carried out within this framework for both array and (DNA) sequencing data offering advantages over existing methods in each case. Higher-order polynomial curves can be used, for example, to detect trends and/or discontinuities in transcription levels from RNA-seq data. We give a concrete application of piecewise linear functions to diagnose and quantify alignment quality at exon borders (splice sites). Our software (source and object code) for building piecewise polynomial models is available at http://sourceforge.net/projects/locsmoc/.
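To make the notion of piecewise constant segmentation in this record concrete, here is a minimal sketch that fits a two-segment piecewise-constant curve to a one-dimensional signal by choosing the single breakpoint that minimizes the squared error. It is an illustration of the general idea only, not the authors' software or its algorithm; the function name `best_single_breakpoint` and the sample values are hypothetical.

```python
def best_single_breakpoint(values):
    """Fit two constant segments to `values` by least squares.

    Returns (k, left_mean, right_mean), where values[:k] form the left
    segment and values[k:] the right segment.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to place a breakpoint")

    total = sum(values)
    total_sq = sum(v * v for v in values)

    best = None
    left_sum = 0.0
    left_sq = 0.0
    for k in range(1, n):
        # Extend the left segment by one point and update running sums.
        left_sum += values[k - 1]
        left_sq += values[k - 1] ** 2
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        # Sum of squared errors around each segment's own mean.
        sse = (left_sq - left_sum ** 2 / k) + (right_sq - right_sum ** 2 / (n - k))
        if best is None or sse < best[0]:
            best = (sse, k, left_sum / k, right_sum / (n - k))
    return best[1], best[2], best[3]


# Hypothetical log-ratio signal with a step between index 3 and 4.
signal = [0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0]
print(best_single_breakpoint(signal))
```

Applying the same search recursively to each segment would give a multi-segment fit, and replacing the segment means with fitted lines or higher-order polynomials would extend the idea toward the piecewise polynomial curves the abstract describes.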
LS-CAP: an algorithm for identifying cytogenetic aberrations in hepatocellular carcinoma using microarray data.

PubMed

He, Xianmin; Wei, Qing; Sun, Meiqian; Fu, Xuping; Fan, Sichang; Li, Yao

2006-05-01

Biological techniques such as array comparative genomic hybridization (CGH), fluorescent in situ hybridization (FISH) and the Affymetrix single nucleotide polymorphism (SNP) array have been used to detect cytogenetic aberrations. However, on a genomic scale, these techniques are labor intensive and time consuming. Comparative genomic microarray analysis (CGMA) has been used to identify cytogenetic changes in hepatocellular carcinoma (HCC) using gene expression microarray data. However, the CGMA algorithm cannot give precise localization of aberrations, fails to identify small cytogenetic changes, and exhibits false negatives and positives. Locally un-weighted smoothing cytogenetic aberrations prediction (LS-CAP), based on local smoothing and the binomial distribution, can be expected to address these problems. The LS-CAP algorithm was built and used on HCC microarray profiles. Eighteen cytogenetic abnormalities were identified; of these, 5 were reported previously and 12 were proven by CGH studies. LS-CAP effectively reduced the false negatives and positives, and precisely located small fragments with cytogenetic aberrations.
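The abstract states only that LS-CAP is based on local smoothing and the binomial distribution. As a loosely related illustration, and not a reconstruction of LS-CAP itself, the sketch below slides an unweighted window along genes ordered by chromosomal position, counts how many are up-regulated, and computes the binomial upper-tail probability of seeing at least that many under a null of p = 0.5. The function names, the null probability and the example flags are all assumptions made for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def window_tail_probabilities(up_flags, window):
    """Slide an unweighted window over position-ordered genes.

    up_flags: 1 if a gene is up-regulated in tumor versus normal, else 0.
    Returns a list of (start_index, up_count, upper_tail_probability).
    """
    results = []
    for start in range(len(up_flags) - window + 1):
        up_count = sum(up_flags[start:start + window])
        results.append((start, up_count, binomial_upper_tail(up_count, window)))
    return results


# Hypothetical up/down flags for ten neighbouring genes on one chromosome arm.
flags = [0, 1, 0, 1, 1, 1, 1, 1, 0, 0]
for start, up_count, prob in window_tail_probabilities(flags, window=5):
    print(start, up_count, round(prob, 4))
```

A run of windows with small tail probabilities would suggest a region of coordinated up-regulation, the kind of expression signature that expression-based methods use as a proxy for a copy-number gain; a real analysis would also need to correct for the many overlapping windows tested.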
Whole organism lineage tracing by combinatorial and cumulative genome editing

PubMed Central

McKenna, Aaron; Findlay, Gregory M.; Gagnon, James A.; Horwitz, Marshall S.; Schier, Alexander F.; Shendure, Jay

2016-01-01

Multicellular systems develop from single cells through distinct lineages. However, current lineage tracing approaches scale poorly to whole, complex organisms. Here we use genome editing to progressively introduce and accumulate diverse mutations in a DNA barcode over multiple rounds of cell division. The barcode, an array of CRISPR/Cas9 target sites, marks cells and enables the elucidation of lineage relationships via the patterns of mutations shared between cells. In cell culture and zebrafish, we show that rates and patterns of editing are tunable, and that thousands of lineage-informative barcode alleles can be generated. By sampling hundreds of thousands of cells from individual zebrafish, we find that most cells in adult organs derive from relatively few embryonic progenitors. In future analyses, genome editing of synthetic target arrays for lineage tracing (GESTALT) can be used to generate large-scale maps of cell lineage in multicellular systems for normal development and disease. PMID:27229144
Three gangliogliomas: results of GTG-banding, SKY, genome-wide high resolution SNP-array, gene expression and review of the literature.

PubMed

Xu, Li-Xin; Holland, Heidrun; Kirsten, Holger; Ahnert, Peter; Krupp, Wolfgang; Bauer, Manfred; Schober, Ralf; Mueller, Wolf; Fritzsch, Dominik; Meixensberger, Jürgen; Koschny, Ronald

2015-04-01

According to the World Health Organization gangliogliomas are classified as well-differentiated and slowly growing neuroepithelial tumors, composed of neoplastic mature ganglion and glial cells. It is the most frequent tumor entity observed in patients with long-term epilepsy. Comprehensive cytogenetic and molecular cytogenetic data including high-resolution genomic profiling (single nucleotide polymorphism (SNP)-array) of gangliogliomas are scarce but necessary for a better oncological understanding of this tumor entity. For a detailed characterization at the single cell and cell population levels, we analyzed genomic alterations of three gangliogliomas using trypsin-Giemsa banding (GTG-banding) and by spectral karyotyping (SKY) in combination with SNP-array and gene expression array experiments. By GTG and SKY, we could confirm frequently detected chromosomal aberrations (losses within chromosomes 10, 13 and 22; gains within chromosomes 5, 7, 8 and 12), and identify so far unknown genetic aberrations like the unbalanced non-reciprocal translocation t(1;18)(q21;q21). Interestingly, we report on the second so far detected ganglioglioma with ring chromosome 1. Analyses of SNP-array data from two of the tumors and respective germline DNA (peripheral blood) identified few small gains and losses and a number of copy-neutral regions with loss of heterozygosity (LOH) in germline and in tumor tissue. In comparison to germline DNA, tumor tissues did not show substantial regions with significant loss or gain or with newly developed LOH. Gene expression analyses of tumor-specific genes revealed similarities in the profile of the analyzed samples regarding different relevant pathways. Taken together, we describe overlapping but also distinct and novel genetic aberrations of three gangliogliomas. © 2014 Japanese Society of Neuropathology.
SNP-array reveals genome-wide patterns of geographical and potential adaptive divergence across the natural range of Atlantic salmon (Salmo salar).

PubMed

Bourret, Vincent; Kent, Matthew P; Primmer, Craig R; Vasemägi, Anti; Karlsson, Sten; Hindar, Kjetil; McGinnity, Philip; Verspoor, Eric; Bernatchez, Louis; Lien, Sigbjørn

2013-02-01

Atlantic salmon (Salmo salar) is one of the most extensively studied fish species in the world due to its significance in aquaculture, fisheries and ongoing conservation efforts to protect declining populations. Yet, limited genomic resources have hampered our understanding of genetic architecture in the species and the genetic basis of adaptation to the wide range of natural and artificial environments it occupies. In this study, we describe the development of a medium-density Atlantic salmon single nucleotide polymorphism (SNP) array based on expressed sequence tags (ESTs) and genomic sequencing. The array was used in the most extensive assessment of population genetic structure performed to date in this species. A total of 6176 informative SNPs were successfully genotyped in 38 anadromous and freshwater wild populations distributed across the species' natural range. Principal component analysis clearly differentiated European and North American populations, and within Europe, three major regional genetic groups were identified for the first time in a single analysis. We assessed the potential for the array to disentangle neutral and putative adaptive divergence of SNP allele frequencies across populations and among regional groups. In Europe, secondary contact zones were identified between major clusters where endogenous and exogenous barriers could be associated, rendering the interpretation of environmental influence on potentially adaptive divergence equivocal. A small number of markers highly divergent in allele frequencies (outliers) were observed between (multiple) freshwater and anadromous populations, between northern and southern latitudes, and when comparing Baltic populations to all others. We also discuss the potential future applications of the SNP array for conservation, management and aquaculture. © 2012 Blackwell Publishing Ltd.
Human genetics and genomics a decade after the release of the draft sequence of the human genome.

PubMed

Naidoo, Nasheen; Pawitan, Yudi; Soong, Richie; Cooper, David N; Ku, Chee-Seng

2011-10-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade.
Human genetics and genomics a decade after the release of the draft sequence of the human genome

PubMed Central

2011-01-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605
Widespread of horizontal gene transfer in the human genome.

PubMed

Huang, Wenze; Tsai, Lillian; Li, Yulong; Hua, Nan; Sun, Chen; Wei, Chaochun

2017-04-04

A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic materials between different species. Horizontal gene transfer has been found to be prevalent in prokaryotes but very rare in eukaryotes. In this paper, we investigate horizontal gene transfer in the human genome. From the pair-wise alignments between the human genome and 53 vertebrate genomes, 1,467 human genome regions (2.6 M bases) from all chromosomes were found to be more conserved with non-mammals than with most mammals. These human genome regions involve 642 known genes, which are enriched with ion binding. Compared to known horizontal gene transfer regions in the human genome, there were few overlapping regions, which indicated that horizontal gene transfer is more common than we expected in the human genome. Horizontal gene transfer impacts hundreds of human genes, and this study provided insight into potential mechanisms of HGT in the human genome.
Characterizing the cancer genome in lung adenocarcinoma

PubMed Central

Weir, Barbara A.; Woo, Michele S.; Getz, Gad; Perner, Sven; Ding, Li; Beroukhim, Rameen; Lin, William M.; Province, Michael A.; Kraja, Aldi; Johnson, Laura A.; Shah, Kinjal; Sato, Mitsuo; Thomas, Roman K.; Barletta, Justine A.; Borecki, Ingrid B.; Broderick, Stephen; Chang, Andrew C.; Chiang, Derek Y.; Chirieac, Lucian R.; Cho, Jeonghee; Fujii, Yoshitaka; Gazdar, Adi F.; Giordano, Thomas; Greulich, Heidi; Hanna, Megan; Johnson, Bruce E.; Kris, Mark G.; Lash, Alex; Lin, Ling; Lindeman, Neal; Mardis, Elaine R.; McPherson, John D.; Minna, John D.; Morgan, Margaret B.; Nadel, Mark; Orringer, Mark B.; Osborne, John R.; Ozenberger, Brad; Ramos, Alex H.; Robinson, James; Roth, Jack A.; Rusch, Valerie; Sasaki, Hidefumi; Shepherd, Frances; Sougnez, Carrie; Spitz, Margaret R.; Tsao, Ming-Sound; Twomey, David; Verhaak, Roel G. W.; Weinstock, George M.; Wheeler, David A.; Winckler, Wendy; Yoshizawa, Akihiko; Yu, Soyoung; Zakowski, Maureen F.; Zhang, Qunyuan; Beer, David G.; Wistuba, Ignacio I.; Watson, Mark A.; Garraway, Levi A.; Ladanyi, Marc; Travis, William D.; Pao, William; Rubin, Mark A.; Gabriel, Stacey B.; Gibbs, Richard A.; Varmus, Harold E.; Wilson, Richard K.; Lander, Eric S.; Meyerson, Matthew

2008-01-01

Somatic alterations in cellular DNA underlie almost all human cancers. The prospect of targeted therapies and the development of high-resolution, genome-wide approaches are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumors (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57 significantly recurrent events. We find that 26 of 39 autosomal chromosome arms show consistent large-scale copy-number gain or loss, of which only a handful have been linked to a specific gene. We also identify 31 recurrent focal events, including 24 amplifications and 7 homozygous deletions. Only six of these focal events are currently associated with known mutations in lung carcinomas. The most common event, amplification of chromosome 14q13.3, is found in ~12% of samples. On the basis of genomic and functional analyses, we identify NKX2-1 (NK2 homeobox 1, also called TITF1), which lies in the minimal 14q13.3 amplification interval and encodes a lineage-specific transcription factor, as a novel candidate proto-oncogene involved in a significant fraction of lung adenocarcinomas. More generally, our results indicate that many of the genes that are involved in lung adenocarcinoma remain to be discovered. PMID:17982442
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.

PubMed

Hitz, Benjamin C; Rowe, Laurence D; Podduturi, Nikhil R; Glick, David I; Baymuradov, Ulugbek K; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Gabdank, Idan; Narayana, Aditi K; Onate, Kathrina C; Hilton, Jason; Ho, Marcus C; Lee, Brian T; Miyasato, Stuart R; Dreszer, Timothy R; Sloan, Cricket A; Strattan, J Seth; Tanaka, Forrest Y; Hong, Eurie L; Cherry, J Michael

2017-01-01

The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (to store genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package.
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata

PubMed Central

Podduturi, Nikhil R.; Glick, David I.; Baymuradov, Ulugbek K.; Malladi, Venkat S.; Chan, Esther T.; Davidson, Jean M.; Gabdank, Idan; Narayana, Aditi K.; Onate, Kathrina C.; Hilton, Jason; Ho, Marcus C.; Lee, Brian T.; Miyasato, Stuart R.; Dreszer, Timothy R.; Sloan, Cricket A.; Strattan, J. Seth; Tanaka, Forrest Y.; Hong, Eurie L.; Cherry, J. Michael

2017-01-01

The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (to store genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package. PMID:28403240
