mouse reference genome: Topics by Science.gov

Sample records for mouse reference genome

Unexpected effects of different genetic backgrounds on identification of genomic rearrangements via whole-genome next generation sequencing.

PubMed

Chen, Zhangguo; Gowan, Katherine; Leach, Sonia M; Viboolsittiseri, Sawanee S; Mishra, Ameet K; Kadoishi, Tanya; Diener, Katrina; Gao, Bifeng; Jones, Kenneth; Wang, Jing H

2016-10-21

Whole genome next generation sequencing (NGS) is increasingly employed to detect genomic rearrangements in cancer genomes, especially in lymphoid malignancies. We recently established a unique mouse model by specifically deleting a key non-homologous end-joining DNA repair gene, Xrcc4, and a cell cycle checkpoint gene, Trp53, in germinal center B cells. This mouse model spontaneously develops mature B cell lymphomas (termed G1XP lymphomas). Here, we attempt to employ whole genome NGS to identify novel structural rearrangements, in particular inter-chromosomal translocations (CTXs), in these G1XP lymphomas. We sequenced six lymphoma samples, aligned our NGS data with mouse reference genome (in C57BL/6J (B6) background) and identified CTXs using CREST algorithm. Surprisingly, we detected widespread CTXs in both lymphomas and wildtype control samples, majority of which were false positive and attributable to different genetic backgrounds. In addition, we validated our NGS pipeline by sequencing multiple control samples from distinct tissues of different genetic backgrounds of mouse (B6 vs non-B6). Lastly, our studies showed that widespread false positive CTXs can be generated by simply aligning sequences from different genetic backgrounds of mouse. We conclude that mapping and alignment with reference genome might not be a preferred method for analyzing whole-genome NGS data obtained from a genetic background different from reference genome. Given the complex genetic background of different mouse strains or the heterogeneity of cancer genomes in human patients, in order to minimize such systematic artifacts and uncover novel CTXs, a preferred method might be de novo assembly of personalized normal control genome and cancer cell genome, instead of mapping and aligning NGS data to mouse or human reference genome. Thus, our studies have critical impact on the manner of data analysis for cancer genomics.
FANTOM5 CAGE profiles of human and mouse reprocessed for GRCh38 and GRCm38 genome assemblies.

PubMed

Abugessaisa, Imad; Noguchi, Shuhei; Hasegawa, Akira; Harshbarger, Jayson; Kondo, Atsushi; Lizio, Marina; Severin, Jessica; Carninci, Piero; Kawaji, Hideya; Kasukawa, Takeya

2017-08-29

The FANTOM5 consortium described the promoter-level expression atlas of human and mouse by using CAGE (Cap Analysis of Gene Expression) with single molecule sequencing. In the original publications, GRCh37/hg19 and NCBI37/mm9 assemblies were used as the reference genomes of human and mouse respectively; later, the Genome Reference Consortium released newer genome assemblies GRCh38/hg38 and GRCm38/mm10. To increase the utility of the atlas in future research, we reprocessed the data to make them available on the recent genome assemblies. The data include observed frequencies of transcription start sites (TSSs) based on the realignment of CAGE reads, and TSS peaks that are converted from those based on the previous reference. Annotations of the peak names were also updated based on the latest public databases. The reprocessed results enable us to examine frequencies of transcription initiation on the recent genome assemblies and to refer to promoters with updated information consistently across the genome assemblies.
Phylogenomic Insights into Mouse Evolution Using a Pseudoreference Approach

PubMed Central

Sarver, Brice A.J.; Keeble, Sara; Cosart, Ted; Tucker, Priscilla K.; Dean, Matthew D.

2017-01-01

Comparative genomic studies are now possible across a broad range of evolutionary timescales, but the generation and analysis of genomic data across many different species still present a number of challenges. The most sophisticated genotyping and down-stream analytical frameworks are still predominantly based on comparisons to high-quality reference genomes. However, established genomic resources are often limited within a given group of species, necessitating comparisons to divergent reference genomes that could restrict or bias comparisons across a phylogenetic sample. Here, we develop a scalable pseudoreference approach to iteratively incorporate sample-specific variation into a genome reference and reduce the effects of systematic mapping bias in downstream analyses. To characterize this framework, we used targeted capture to sequence whole exomes (∼54 Mbp) in 12 lineages (ten species) of mice spanning the Mus radiation. We generated whole exome pseudoreferences for all species and show that this iterative reference-based approach improved basic genomic analyses that depend on mapping accuracy while preserving the associated annotations of the mouse reference genome. We then use these pseudoreferences to resolve evolutionary relationships among these lineages while accounting for phylogenetic discordance across the genome, contributing an important resource for comparative studies in the mouse system. We also describe patterns of genomic introgression among lineages and compare our results to previous studies. Our general approach can be applied to whole or partitioned genomic data and is easily portable to any system with sufficient genomic resources, providing a useful framework for phylogenomic studies in mice and other taxa. PMID:28338821
Creating reference gene annotation for the mouse C57BL6/J genome assembly.

PubMed

Mudge, Jonathan M; Harrow, Jennifer

2015-10-01

Annotation on the reference genome of the C57BL6/J mouse has been an ongoing project ever since the draft genome was first published. Initially, the principal focus was on the identification of all protein-coding genes, although today the importance of describing long non-coding RNAs, small RNAs, and pseudogenes is recognized. Here, we describe the progress of the GENCODE mouse annotation project, which combines manual annotation from the HAVANA group with Ensembl computational annotation, alongside experimental and in silico validation pipelines from other members of the consortium. We discuss the more recent incorporation of next-generation sequencing datasets into this workflow, including the usage of mass-spectrometry data to potentially identify novel protein-coding genes. Finally, we will outline how the C57BL6/J genebuild can be used to gain insights into the variant sites that distinguish different mouse strains and species.
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2013-10-01

Microarray intensities were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference. This...additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal of these studies is to expand our number of genomic profiles (DNA and...mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset with which to identify key candidate oncogenes, tumor suppressor genes
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2012-10-01

Microarray intensities were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference...identify candidate drug targets of CPC. Task 1: Generation of additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal...of these studies is to expand our number of genomic profiles (DNA and mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2011-10-01

were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference. This analysis highlights...Task 1: Generation of additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal of these studies is to expand our...number of genomic profiles (DNA and mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset with which to identify key candidate
Analysis of Copy Number Variation in the Abp Gene Regions of Two House Mouse Subspecies Suggests Divergence during the Gene Family Expansions

PubMed Central

Pezer, Željka; Chung, Amanda G.; Karn, Robert C.

2017-01-01

The Androgen-binding protein (Abp) gene region of the mouse genome contains 64 genes, some encoding pheromones that influence assortative mating between mice from different subspecies. Using CNVnator and quantitative PCR, we explored copy number variation in this gene family in natural populations of Mus musculus domesticus (Mmd) and Mus musculus musculus (Mmm), two subspecies of house mice that form a narrow hybrid zone in Central Europe. We found that copy number variation in the center of the Abp gene region is very common in wild Mmd, primarily representing the presence/absence of the final duplications described for the mouse genome. Clustering of Mmd individuals based on this variation did not reflect their geographical origin, suggesting no population divergence in the Abp gene cluster. However, copy number variation patterns differ substantially between Mmd and other mouse taxa. Large blocks of Abp genes are absent in Mmm, Mus musculus castaneus and an outgroup, Mus spretus, although with differences in variation and breakpoint locations. Our analysis calls into question the reliance on a reference genome for interpreting the detailed organization of genes in taxa more distant from the Mmd reference genome. The polymorphic nature of the gene family expansion in all four taxa suggests that the number of Abp genes, especially in the central gene region, is not critical to the survival and reproduction of the mouse. However, Abp haplotypes of variable length may serve as a source of raw genetic material for new signals influencing reproductive communication and thus speciation of mice. PMID:28575204
Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation

PubMed Central

Pujar, Shashikant; O’Leary, Nuala A; Farrell, Catherine M; Mudge, Jonathan M; Wallin, Craig; Diekhans, Mark; Barnes, If; Bennett, Ruth; Berry, Andrew E; Cox, Eric; Davidson, Claire; Goldfarb, Tamara; Gonzalez, Jose M; Hunt, Toby; Jackson, John; Joardar, Vinita; Kay, Mike P; Kodali, Vamsi K; McAndrews, Monica; McGarvey, Kelly M; Murphy, Michael; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Seal, Ruth L; Webb, David; Zhu, Sophia; Aken, Bronwen L; Bult, Carol J; Frankish, Adam; Pruitt, Kim D

2018-01-01

The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. PMID:29126148
Using optical mapping data for the improvement of vertebrate genome assemblies.

PubMed

Howe, Kerstin; Wood, Jonathan M D

2015-01-01

Optical mapping is a technology that gathers long-range information on genome sequences similar to ordered restriction digest maps. Because it is not subject to cloning, amplification, hybridisation or sequencing bias, it is ideally suited to the improvement of fragmented genome assemblies that can no longer be improved by classical methods. In addition, its low cost and rapid turnaround make it equally useful during the scaffolding process of de novo assembly from high throughput sequencing reads. We describe how optical mapping has been used in practice to produce high quality vertebrate genome assemblies. In particular, we detail the efforts undertaken by the Genome Reference Consortium (GRC), which maintains the reference genomes for human, mouse, zebrafish and chicken, and uses different optical mapping platforms for genome curation.
Analysis of Copy Number Variation in the Abp Gene Regions of Two House Mouse Subspecies Suggests Divergence during the Gene Family Expansions.

PubMed

Pezer, Željka; Chung, Amanda G; Karn, Robert C; Laukaitis, Christina M

2017-06-01

The Androgen-binding protein (Abp) gene region of the mouse genome contains 64 genes, some encoding pheromones that influence assortative mating between mice from different subspecies. Using CNVnator and quantitative PCR, we explored copy number variation in this gene family in natural populations of Mus musculus domesticus (Mmd) and Mus musculus musculus (Mmm), two subspecies of house mice that form a narrow hybrid zone in Central Europe. We found that copy number variation in the center of the Abp gene region is very common in wild Mmd, primarily representing the presence/absence of the final duplications described for the mouse genome. Clustering of Mmd individuals based on this variation did not reflect their geographical origin, suggesting no population divergence in the Abp gene cluster. However, copy number variation patterns differ substantially between Mmd and other mouse taxa. Large blocks of Abp genes are absent in Mmm, Mus musculus castaneus and an outgroup, Mus spretus, although with differences in variation and breakpoint locations. Our analysis calls into question the reliance on a reference genome for interpreting the detailed organization of genes in taxa more distant from the Mmd reference genome. The polymorphic nature of the gene family expansion in all four taxa suggests that the number of Abp genes, especially in the central gene region, is not critical to the survival and reproduction of the mouse. However, Abp haplotypes of variable length may serve as a source of raw genetic material for new signals influencing reproductive communication and thus speciation of mice. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation.

PubMed

Pujar, Shashikant; O'Leary, Nuala A; Farrell, Catherine M; Loveland, Jane E; Mudge, Jonathan M; Wallin, Craig; Girón, Carlos G; Diekhans, Mark; Barnes, If; Bennett, Ruth; Berry, Andrew E; Cox, Eric; Davidson, Claire; Goldfarb, Tamara; Gonzalez, Jose M; Hunt, Toby; Jackson, John; Joardar, Vinita; Kay, Mike P; Kodali, Vamsi K; Martin, Fergal J; McAndrews, Monica; McGarvey, Kelly M; Murphy, Michael; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Seal, Ruth L; Suner, Marie-Marthe; Webb, David; Zhu, Sophia; Aken, Bronwen L; Bruford, Elspeth A; Bult, Carol J; Frankish, Adam; Murphy, Terence; Pruitt, Kim D

2018-01-04

The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse.

PubMed

Eppig, Janan T

2017-07-01

The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided. © The Author 2017. Published by Oxford University Press.
Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse

PubMed Central

Eppig, Janan T.

2017-01-01

The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided. PMID:28838066
Differential DNA Methylation Analysis without a Reference Genome.

PubMed

Klughammer, Johanna; Datlinger, Paul; Printz, Dieter; Sheffield, Nathan C; Farlik, Matthias; Hadler, Johanna; Fritsch, Gerhard; Bock, Christoph

2015-12-22

Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which we have validated in nine species (human, mouse, rat, cow, dog, chicken, carp, sea bass, and zebrafish). Bioinformatically, we developed the RefFreeDMA software to deduce ad hoc genomes directly from RRBS reads and to pinpoint differentially methylated regions between samples or groups of individuals (http://RefFreeDMA.computational-epigenetics.org). The identified regions are interpreted using motif enrichment analysis and/or cross-mapping to annotated genomes. We validated our method by reference-free analysis of cell-type-specific DNA methylation in the blood of human, cow, and carp. In summary, we present a cost-effective method for epigenome analysis in ecology and evolution, which enables epigenome-wide association studies in natural populations and species without a reference genome. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Mouse ENU Mutagenesis to Understand Immunity to Infection: Methods, Selected Examples, and Perspectives

PubMed Central

Caignard, Grégory; Eva, Megan M.; van Bruggen, Rebekah; Eveleigh, Robert; Bourque, Guillaume; Malo, Danielle; Gros, Philippe; Vidal, Silvia M.

2014-01-01

Infectious diseases are responsible for over 25% of deaths globally, but many more individuals are exposed to deadly pathogens. The outcome of infection results from a set of diverse factors including pathogen virulence factors, the environment, and the genetic make-up of the host. The completion of the human reference genome sequence in 2004 along with technological advances have tremendously accelerated and renovated the tools to study the genetic etiology of infectious diseases in humans and its best characterized mammalian model, the mouse. Advancements in mouse genomic resources have accelerated genome-wide functional approaches, such as gene-driven and phenotype-driven mutagenesis, bringing to the fore the use of mouse models that reproduce accurately many aspects of the pathogenesis of human infectious diseases. Treatment with the mutagen N-ethyl-N-nitrosourea (ENU) has become the most popular phenotype-driven approach. Our team and others have employed mouse ENU mutagenesis to identify host genes that directly impact susceptibility to pathogens of global significance. In this review, we first describe the strategies and tools used in mouse genetics to understand immunity to infection with special emphasis on chemical mutagenesis of the mouse germ-line together with current strategies to efficiently identify functional mutations using next generation sequencing. Then, we highlight illustrative examples of genes, proteins, and cellular signatures that have been revealed by ENU screens and have been shown to be involved in susceptibility or resistance to infectious diseases caused by parasites, bacteria, and viruses. PMID:25268389
The genomic landscape shaped by selection on transposable elements across 18 mouse strains.

PubMed

Nellåker, Christoffer; Keane, Thomas M; Yalcin, Binnaz; Wong, Kim; Agam, Avigail; Belgard, T Grant; Flint, Jonathan; Adams, David J; Frankel, Wayne N; Ponting, Chris P

2012-06-15

Transposable element (TE)-derived sequence dominates the landscape of mammalian genomes and can modulate gene function by dysregulating transcription and translation. Our current knowledge of TEs in laboratory mouse strains is limited primarily to those present in the C57BL/6J reference genome, with most mouse TEs being drawn from three distinct classes, namely short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and the endogenous retrovirus (ERV) superfamily. Despite their high prevalence, the different genomic and gene properties controlling whether TEs are preferentially purged from, or are retained by, genetic drift or positive selection in mammalian genomes remain poorly defined. Using whole genome sequencing data from 13 classical laboratory and 4 wild-derived mouse inbred strains, we developed a comprehensive catalogue of 103,798 polymorphic TE variants. We employ this extensive data set to characterize TE variants across the Mus lineage, and to infer neutral and selective processes that have acted over 2 million years. Our results indicate that the majority of TE variants are introduced through the male germline and that only a minority of TE variants exert detectable changes in gene expression. However, among genes with differential expression across the strains there are twice as many TE variants identified as being putative causal variants as expected. Most TE variants that cause gene expression changes appear to be purged rapidly by purifying selection. Our findings demonstrate that past TE insertions have often been highly deleterious, and help to prioritize TE variants according to their likely contribution to gene expression or phenotype variation.
gQTL: A Web Application for QTL Analysis Using the Collaborative Cross Mouse Genetic Reference Population.

PubMed

Konganti, Kranti; Ehrlich, Andre; Rusyn, Ivan; Threadgill, David W

2018-06-07

Multi-parental recombinant inbred populations, such as the Collaborative Cross (CC) mouse genetic reference population, are increasingly being used for analysis of quantitative trait loci (QTL). However, specialized analytic software for these complex populations is typically built in R and runs only from the command line, which limits the utility of these powerful resources for many users. To overcome these analytic limitations, we developed gQTL, a simple, web-accessible graphical user interface application based on the DOQTL platform in R, to perform QTL mapping using data from CC mice. Copyright © 2018, G3: Genes, Genomes, Genetics.
Genome-wide compendium and functional assessment of in vivo heart enhancers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen

Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function.
Genome-wide compendium and functional assessment of in vivo heart enhancers

DOE PAGES

Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen; ...

2016-10-05

Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function.

Genome-wide compendium and functional assessment of in vivo heart enhancers

PubMed Central

Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen; Fukuda-Yuzawa, Yoko; Osterwalder, Marco; Mannion, Brandon J.; May, Dalit; Spurrell, Cailyn H.; Plajzer-Frick, Ingrid; Pickle, Catherine S.; Lee, Elizabeth; Garvin, Tyler H.; Kato, Momoe; Akiyama, Jennifer A.; Afzal, Veena; Lee, Ah Young; Gorkin, David U.; Ren, Bing; Rubin, Edward M.; Visel, Axel; Pennacchio, Len A.

2016-01-01

Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function. PMID:27703156
Solutions for data integration in functional genomics: a critical assessment and case study.

PubMed

Smedley, Damian; Swertz, Morris A; Wolstencroft, Katy; Proctor, Glenn; Zouberakis, Michael; Bard, Jonathan; Hancock, John M; Schofield, Paul

2008-11-01

The torrent of data emerging from the application of new technologies to functional genomics and systems biology can no longer be contained within the traditional modes of data sharing and publication with the consequence that data is being deposited in, distributed across and disseminated through an increasing number of databases. The resulting fragmentation poses serious problems for the model organism community which increasingly rely on data mining and computational approaches that require gathering of data from a range of sources. In the light of these problems, the European Commission has funded a coordination action, CASIMIR (coordination and sustainability of international mouse informatics resources), with a remit to assess the technical and social aspects of database interoperability that currently prevent the full realization of the potential of data integration in mouse functional genomics. In this article, we assess the current problems with interoperability, with particular reference to mouse functional genomics, and critically review the technologies that can be deployed to overcome them. We describe a typical use-case where an investigator wishes to gather data on variation, genomic context and metabolic pathway involvement for genes discovered in a genome-wide screen. We go on to develop an automated approach involving an in silico experimental workflow tool, Taverna, using web services, BioMart and MOLGENIS technologies for data retrieval. Finally, we focus on the current impediments to adopting such an approach in a wider context, and strategies to overcome them.
Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes).

PubMed

Dessimoz, Christophe; Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro

2011-09-01

Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references.
Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes)

PubMed Central

Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro

2011-01-01

Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references. PMID:21712341
Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner.

PubMed

Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan; Brent, Michael R

2009-07-01

The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary determinant of alignment accuracy, while heuristics that prevent consideration of certain alignments are a primary determinant of runtime and memory usage. Both accuracy and speed are important considerations in choosing an alignment algorithm, but scoring systems have received much less attention than heuristics. We present Pairagon, a pair hidden Markov model based cDNA-to-genome alignment program, as the most accurate aligner for sequences with high- and low-identity levels. We conducted a series of experiments testing alignment accuracy with varying sequence identity. We first created 'perfect' simulated cDNA sequences by splicing the sequences of exons in the reference genome sequences of fly and human. The complete reference genome sequences were then mutated to various degrees using a realistic mutation simulator and the perfect cDNAs were aligned to them using Pairagon and 12 other aligners. To validate these results with natural sequences, we performed cross-species alignment using orthologous transcripts from human, mouse and rat. We found that aligner accuracy is heavily dependent on sequence identity. For sequences with 100% identity, Pairagon achieved accuracy levels of >99.6%, with one quarter of the errors of any other aligner. Furthermore, for human/mouse alignments, which are only 85% identical, Pairagon achieved 87% accuracy, higher than any other aligner. Pairagon source and executables are freely available at http://mblab.wustl.edu/software/pairagon/
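The abstract above reports accuracy as a function of sequence identity (100% for the simulated cDNAs, about 85% for human/mouse). As a minimal sketch of how percent identity can be computed from a gapped pairwise alignment, the hypothetical helper below counts matching columns over all columns where neither sequence has a gap. This is only one common convention, chosen here as an assumption; the abstract does not say which definition the authors used, and the function name `percent_identity` is illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip insertion/deletion columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

With this convention, an alignment of nine ungapped columns containing one mismatch scores 8/9, roughly 88.9% identity.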
Orthology for comparative genomics in the mouse genome database.

PubMed

Dolan, Mary E; Baldarelli, Richard M; Bello, Susan M; Ni, Li; McAndrews, Monica S; Bult, Carol J; Kadin, James A; Richardson, Joel E; Ringwald, Martin; Eppig, Janan T; Blake, Judith A

2015-08-01

The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.
Manual Gene Ontology annotation workflow at the Mouse Genome Informatics Database

PubMed Central

Drabkin, Harold J.; Blake, Judith A.

2012-01-01

The Mouse Genome Database, the Gene Expression Database and the Mouse Tumor Biology database are integrated components of the Mouse Genome Informatics (MGI) resource (http://www.informatics.jax.org). The MGI system presents both a consensus view and an experimental view of the knowledge concerning the genetics and genomics of the laboratory mouse. From genotype to phenotype, this information resource integrates information about genes, sequences, maps, expression analyses, alleles, strains and mutant phenotypes. Comparative mammalian data are also presented particularly in regards to the use of the mouse as a model for the investigation of molecular and genetic components of human diseases. These data are collected from literature curation as well as downloads of large datasets (SwissProt, LocusLink, etc.). MGI is one of the founding members of the Gene Ontology (GO) and uses the GO for functional annotation of genes. Here, we discuss the workflow associated with manual GO annotation at MGI, from literature collection to display of the annotations. Peer-reviewed literature is collected mostly from a set of journals available electronically. Selected articles are entered into a master bibliography and indexed to one of eight areas of interest such as ‘GO’ or ‘homology’ or ‘phenotype’. Each article is then either indexed to a gene already contained in the database or funneled through a separate nomenclature database to add genes. The master bibliography and associated indexing provide information for various curator-reports such as ‘papers selected for GO that refer to genes with NO GO annotation’. Once indexed, curators who have expertise in appropriate disciplines enter pertinent information. MGI makes use of several controlled vocabularies that ensure uniform data encoding, enable robust analysis and support the construction of complex queries. These vocabularies range from pick-lists to structured vocabularies such as the GO. 
All data associations are supported with statements of evidence as well as access to source publications. PMID:23110975
The Mouse Genome Database (MGD): facilitating mouse as a model for human biology and disease.

PubMed

Eppig, Janan T; Blake, Judith A; Bult, Carol J; Kadin, James A; Richardson, Joel E

2015-01-01

The Mouse Genome Database (MGD, http://www.informatics.jax.org) serves the international biomedical research community as the central resource for integrated genomic, genetic and biological data on the laboratory mouse. To facilitate use of mouse as a model in translational studies, MGD maintains a core of high-quality curated data and integrates experimentally and computationally generated data sets. MGD maintains a unified catalog of genes and genome features, including functional RNAs, QTL and phenotypic loci. MGD curates and provides functional and phenotype annotations for mouse genes using the Gene Ontology and Mammalian Phenotype Ontology. MGD integrates phenotype data and associates mouse genotypes to human diseases, providing critical mouse-human relationships and access to repositories holding mouse models. MGD is the authoritative source of nomenclature for genes, genome features, alleles and strains following guidelines of the International Committee on Standardized Genetic Nomenclature for Mice. A new addition to MGD, the Human-Mouse: Disease Connection, allows users to explore gene-phenotype-disease relationships between human and mouse. MGD has also updated search paradigms for phenotypic allele attributes, incorporated incidental mutation data, added a module for display and exploration of genes and microRNA interactions and adopted the JBrowse genome browser. MGD resources are freely available to the scientific community. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Role of VAPB-MSP, a Novel EphA2 RTK Antagonist in Breast Cancer

DTIC Science & Technology

2012-12-01

13 REFERENCES 1. Arriola E, Marchio C, Tan DS, Drury SC, Lambros MB, et al. (2008) Genomic analysis of the HER2/TOP2A amplicon in breast cancer and...proliferation and branching in mouse mammary epithelium. Mol Biol Cell 12: 1445–1455. 27. Arriola E, Marchio C, Tan DS, Drury SC, Lambros MB, et al
Initial sequencing and comparative analysis of the mouse genome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan

2002-12-15

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.
microPIR2: a comprehensive database for human–mouse comparative study of microRNA–promoter interactions

PubMed Central

Piriyapongsa, Jittima; Bootchai, Chaiwat; Ngamphiw, Chumpol; Tongsima, Sissades

2014-01-01

microRNA (miRNA)–promoter interaction resource (microPIR) is a public database containing over 15 million predicted miRNA target sites located within human promoter sequences. These predicted targets are presented along with their related genomic and experimental data, making the microPIR database the most comprehensive repository of miRNA promoter target sites. Here, we describe major updates of the microPIR database including new target predictions in the mouse genome and revised human target predictions. The updated database (microPIR2) now provides ∼80 million human and 40 million mouse predicted target sites. In addition to being a reference database, microPIR2 is a tool for comparative analysis of target sites on the promoters of human–mouse orthologous genes. In particular, this new feature was designed to identify potential miRNA–promoter interactions conserved between species that could be stronger candidates for further experimental validation. We also incorporated additional supporting information to microPIR2 such as nuclear and cytoplasmic localization of miRNAs and miRNA–disease association. Extra search features were also implemented to enable various investigations of targets of interest. Database URL: http://www4a.biotec.or.th/micropir2 PMID:25425035
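The comparative feature described above, finding miRNA–promoter interactions conserved between human and mouse orthologs, can be pictured as a set lookup through an ortholog map. The sketch below is an assumption-laden illustration, not microPIR2's actual method or API: the function name, the `(mirna, gene)` tuple layout and the ortholog dictionary are all hypothetical, and the example identifiers in the test data are invented.

```python
def conserved_interactions(human_hits, mouse_hits, orthologs):
    """Return (mirna, human_gene, mouse_gene) triples predicted in both species.

    human_hits, mouse_hits: iterables of (mirna, gene) pairs (hypothetical layout).
    orthologs: dict mapping a human gene symbol to its mouse ortholog.
    """
    mouse_set = set(mouse_hits)
    conserved = set()
    for mirna, human_gene in human_hits:
        mouse_gene = orthologs.get(human_gene)
        # Keep the pair only if the same miRNA also targets the ortholog's promoter.
        if mouse_gene is not None and (mirna, mouse_gene) in mouse_set:
            conserved.add((mirna, human_gene, mouse_gene))
    return sorted(conserved)
```

A real comparison would also need to decide how miRNAs are matched across species (for example by family or seed sequence), which this sketch leaves out.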
Mouse Genome Database: From sequence to phenotypes and disease models

PubMed Central

Richardson, Joel E.; Kadin, James A.; Smith, Cynthia L.; Blake, Judith A.; Bult, Carol J.

2015-01-01

The Mouse Genome Database (MGD, www.informatics.jax.org) is the international scientific database for genetic, genomic, and biological data on the laboratory mouse to support the research requirements of the biomedical community. To accomplish this goal, MGD provides broad data coverage, serves as the authoritative standard for mouse nomenclature for genes, mutants, and strains, and curates and integrates many types of data from literature and electronic sources. Among the key data sets MGD supports are: the complete catalog of mouse genes and genome features, comparative homology data for mouse and vertebrate genes, the authoritative set of Gene Ontology (GO) annotations for mouse gene functions, a comprehensive catalog of mouse mutations and their phenotypes, and a curated compendium of mouse models of human diseases. Here, we describe the data acquisition process, specifics about MGD's key data areas, methods to access and query MGD data, and outreach and user help facilities. genesis 53:458–473, 2015. © 2015 The Authors. Genesis Published by Wiley Periodicals, Inc. PMID:26150326
Genomic locus modulating corneal thickness in the mouse identifies POU6F2 as a potential risk of developing glaucoma

PubMed Central

Li, Ying; Wang, Jiaxing; Allingham, R. Rand; Hauser, Michael A.; Wiggs, Janey L.; Geisert, Eldon E.

2018-01-01

Central corneal thickness (CCT) is one of the most heritable ocular traits and it is also a phenotypic risk factor for primary open angle glaucoma (POAG). The present study uses the BXD Recombinant Inbred (RI) strains to identify novel quantitative trait loci (QTLs) modulating CCT in the mouse with the potential of identifying a molecular link between CCT and risk of developing POAG. The BXD RI strain set was used to define mammalian genomic loci modulating CCT, with a total of 818 corneas measured from 61 BXD RI strains (between 60–100 days of age). The mice were anesthetized and the eyes were positioned in front of the lens of the Phoenix Micron IV Image-Guided OCT system or the Bioptigen OCT system. CCT data for each strain were averaged and used to map QTLs modulating this phenotype using the bioinformatics tools on GeneNetwork (www.genenetwork.org). The candidate genes and genomic loci identified in the mouse were then directly compared with the summary data from a human POAG genome wide association study (NEIGHBORHOOD) to determine if any genomic elements modulating mouse CCT are also risk factors for POAG. This analysis revealed one significant QTL on Chr 13 and a suggestive QTL on Chr 7. The significant locus on Chr 13 (13 to 19 Mb) was examined further to define candidate genes modulating this eye phenotype. For the Chr 13 QTL in the mouse, only one gene in the region (Pou6f2) contained nonsynonymous SNPs. Of these five nonsynonymous SNPs in Pou6f2, two resulted in changes in the amino acid proline which could result in altered secondary structure affecting protein function. The 7 Mb region under the mouse Chr 13 peak distributes over 2 chromosomes in the human: Chr 1 and Chr 7. These genomic loci were examined in the NEIGHBORHOOD database to determine if they are potential risk factors for human glaucoma identified using meta-data from human GWAS. The top 50 hits all resided within one gene (POU6F2), with the highest significance level of p = 10^-6 for SNP rs76319873. POU6F2 is found in retinal ganglion cells and in corneal limbal stem cells. To test the effect of POU6F2 on CCT we examined the corneas of Pou6f2-null mice, and these corneas were thinner than those of wild-type littermates. In addition, these POU6F2 RGCs die earlier in the DBA/2J model of glaucoma than most RGCs. Using a mouse genetic reference panel, we identified a transcription factor, Pou6f2, that modulates CCT in the mouse. POU6F2 is also found in a subset of retinal ganglion cells and these RGCs are sensitive to injury. PMID:29370175
ARTS: a web-based tool for the set-up of high-throughput genome-wide mapping panels for the SNP genotyping of mouse mutants.

PubMed

Klaften, Matthias; Hrabé de Angelis, Martin

2005-07-01

Genome-wide mapping in the identification of novel candidate genes has always been the standard method in genetics and genomics to correlate a clinically interesting phenotypic trait with a genotype. However, the performance of a mapping experiment using classical microsatellite approaches can be very time consuming. The high-throughput analysis of single-nucleotide polymorphisms (SNPs) has the potential of being the successor of microsatellite analysis routinely used for these mapping approaches, where one of the major obstacles is the design of the appropriate SNP marker set itself. Here we report on ARTS, an advanced retrieval tool for SNPs, which allows researchers to comb freely the public mouse dbSNP database for multiple reference and test strains. Several filters can be applied in order to improve the sensitivity and the specificity of the search results. By employing the panel generator function of this program, it is possible to abbreviate the extraction of reliable sequence data for a large marker panel including several different mouse strains from days to minutes. The concept of ARTS is easily adaptable to other species for which SNP databases are available, making it a versatile tool for the use of SNPs as markers for genotyping. The web interface is accessible at http://andromeda.gsf.de/arts.
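The panel-design problem described above can be sketched as two steps: keep only SNPs whose alleles differ between the reference and test strain, then thin them so that markers are spread along each chromosome. The code below is a hypothetical illustration under those assumptions. It is not ARTS itself, whose filters and panel generator are not detailed in the abstract, and the `Snp` record, function names and the greedy spacing rule are invented for this example.

```python
from typing import Dict, List, NamedTuple


class Snp(NamedTuple):
    snp_id: str
    chrom: str
    pos: int                  # base-pair position on the chromosome
    alleles: Dict[str, str]   # strain name -> allele call


def informative_snps(snps: List[Snp], reference_strain: str, test_strain: str) -> List[Snp]:
    """Keep SNPs typed in both strains whose alleles differ between them."""
    kept = []
    for snp in snps:
        ref = snp.alleles.get(reference_strain)
        test = snp.alleles.get(test_strain)
        if ref and test and ref != test:
            kept.append(snp)
    return kept


def build_panel(snps: List[Snp], reference_strain: str, test_strain: str,
                min_spacing_bp: int) -> List[Snp]:
    """Greedy thinning: walk each chromosome in position order and keep a SNP
    only if it lies at least min_spacing_bp beyond the last one kept."""
    panel = []
    last_kept_pos: Dict[str, int] = {}
    candidates = informative_snps(snps, reference_strain, test_strain)
    for snp in sorted(candidates, key=lambda s: (s.chrom, s.pos)):
        previous = last_kept_pos.get(snp.chrom)
        if previous is None or snp.pos - previous >= min_spacing_bp:
            panel.append(snp)
            last_kept_pos[snp.chrom] = snp.pos
    return panel
```

A production tool would add quality filters (for example, flanking sequence checks or validation status), which are outside the scope of this sketch.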
An analysis of possible off target effects following CAS9/CRISPR targeted deletions of neuropeptide gene enhancers from the mouse genome.

PubMed

Hay, Elizabeth Anne; Khalaf, Abdulla Razak; Marini, Pietro; Brown, Andrew; Heath, Karyn; Sheppard, Darrin; MacKenzie, Alasdair

2017-08-01

We have successfully used comparative genomics to identify putative regulatory elements within the human genome that contribute to the tissue-specific expression of neuropeptides such as galanin and receptors such as CB1. However, a previous inability to rapidly delete these elements from the mouse genome has prevented optimal assessment of their function in vivo. This has been solved using CAS9/CRISPR genome editing technology, which uses a bacterial endonuclease called CAS9 that, in combination with specifically designed guide RNA (gRNA) molecules, cuts specific regions of the mouse genome. However, reports of "off target" effects, whereby the CAS9 endonuclease is able to cut sites other than those targeted, limit the appeal of this technology. We used cytoplasmic microinjection of gRNA and CAS9 mRNA into 1-cell mouse embryos to rapidly generate enhancer knockout mouse lines. The current study describes our analysis of the genomes of these enhancer knockout lines to detect possible off-target effects. Bioinformatic analysis was used to identify the most likely putative off-target sites and to design PCR primers that would amplify these sequences from genomic DNA of founder enhancer deletion mouse lines. Amplified DNA was then sequenced and searched with BLAST against the mouse genome sequence to detect off-target effects. Using this approach we were unable to detect any evidence of off-target effects in the genomes of three founder lines using any of the four gRNAs used in the analysis. This study suggests that the problem of off-target effects in transgenic mice has been exaggerated and that CAS9/CRISPR represents a highly effective and accurate method of deleting putative neuropeptide gene enhancer sequences from the mouse genome. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
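The abstract does not say which bioinformatic tool ranked the putative off-target sites. As a rough illustration of the general idea only, the sketch below scans one strand of a sequence for guide-length windows that are followed by an NGG PAM and differ from the gRNA by at most a few mismatches, then ranks them by mismatch count. The function name, the mismatch threshold and the forward-strand-only scan are assumptions made for brevity; real tools also search the reverse complement and usually weight mismatches by position.

```python
def find_candidate_sites(genome_seq: str, guide: str, max_mismatches: int = 3):
    """Return (position, site, mismatches) for windows resembling the guide.

    Only the forward strand is scanned, and only windows immediately followed
    by an NGG PAM are considered (simplifying assumptions for this sketch).
    """
    genome_seq = genome_seq.upper()
    guide = guide.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome_seq) - n - 3 + 1):
        pam = genome_seq[i + n:i + n + 3]
        if pam[1:] != "GG":
            continue  # no NGG PAM directly 3' of this window
        window = genome_seq[i:i + n]
        mismatches = sum(1 for a, b in zip(window, guide) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    # Fewest mismatches first: these are the most likely off-target candidates.
    hits.sort(key=lambda hit: (hit[2], hit[0]))
    return hits
```

Sites returned with zero mismatches correspond to the intended target; the remaining entries are the candidates one would amplify by PCR and sequence, as the study describes.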
Cross-referencing yeast genetics and mammalian genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hieter, P.; Basset, D.; Boguski, M.

1994-09-01

We have initiated a project that will systematically transfer information about yeast genes onto the genetic maps of mice and human beings. Rapidly expanding human EST data will serve as a source of candidate human homologs that will be repeatedly searched using yeast protein sequence queries. Search results will be automatically reported to participating labs. Human cDNA sequences from which the ESTs are derived will be mapped at high resolution in the human and mouse genomes. The comparative mapping information cross-references the genomic position of novel human cDNAs with functional information known about the cognate yeast genes. This should facilitate the initial identification of genes responsible for mammalian mutant phenotypes, including human disease. In addition, the identification of mammalian homologs of yeast genes provides reagents for determining evolutionary conservation and for performing direct experiments in multicellular eukaryotes to enhance study of the yeast protein's function. For example, ESTs homologous to CDC27 and CDC16 were identified, and the corresponding cDNA clones were obtained from ATCC, completely sequenced, and mapped on human and mouse chromosomes. In addition, the CDC27Hs cDNA has been used to raise antisera to the CDC27Hs protein and used in subcellular localization experiments and functional studies in mammalian cells. We have received funding from the National Center for Human Genome Research to provide a community resource which will establish comprehensive cross-referencing among yeast, human, and mouse loci. The project is set up as a service and information on how to communicate with this effort will be provided.
Discovery of the "RNA continent" through a contrarian's research strategy.

PubMed

Hayashizaki, Yoshihide

2011-01-01

The International Human Genome Sequencing Consortium completed the decoding of the human genome sequence in 2003. Readers will be aware of the paradigm shift which has occurred since then in the field of life science research. At last, mankind has been able to focus on a complete picture of the full extent of the genome, on which is recorded the basic information that controls all life. Meanwhile, another genome project, centered on Japan and known as the mouse genome encyclopedia project, was progressing with participation from around the world. Led by our research group at RIKEN, it was a full-length cDNA project which aimed to decode the whole RNA (transcriptome) using the mouse as a model. The basic information that controls all life is recorded on the genome, but in order to obtain a complete picture of this extensive information, the decoding of the genome alone is far from sufficient. These two genome projects established that the number of letters in the genome, which is the blueprint of life, is finite, that the number of RNA molecules derived from it is also finite, and that the number of protein molecules derived from the RNA is probably finite too. A massive number of combinations is still involved, but we are now able to understand one section of the network formed by these data. Once an object of study has been understood to be finite, establishing an image of the whole is certain to lead us to an understanding of the whole. Omics is an approach that views the information controlling life as finite and seeks to assemble and analyze it as a whole. Here, I would like to present our transcriptome research while making reference to our unique research strategy.
The Mouse Genomes Project: a repository of inbred laboratory mouse strain genomes.

PubMed

Adams, David J; Doran, Anthony G; Lilue, Jingtao; Keane, Thomas M

2015-10-01

The Mouse Genomes Project was initiated in 2009 with the goal of using next-generation sequencing technologies to catalogue molecular variation in the common laboratory mouse strains, and a selected set of wild-derived inbred strains. The initial sequencing and survey of sequence variation in 17 inbred strains was completed in 2011 and included a comprehensive catalogue of single nucleotide polymorphisms, short insertion/deletions, larger structural variants including their fine scale architecture and landscape of transposable element variation, and genomic sites subject to post-transcriptional alteration of RNA. From this beginning, the resource has expanded significantly to include 36 fully sequenced inbred laboratory mouse strains, a refined and updated data processing pipeline, and new variation querying and data visualisation tools which are available on the project's website (http://www.sanger.ac.uk/resources/mouse/genomes/). The focus of the project is now the completion of de novo assembled chromosome sequences and strain-specific gene structures for the core strains. We discuss how the assembled chromosomes will power comparative analysis, data access tools and future directions of mouse genetics.
The UCSC Genome Browser database: extensions and updates 2013.

PubMed

Meyer, Laurence R; Zweig, Ann S; Hinrichs, Angie S; Karolchik, Donna; Kuhn, Robert M; Wong, Matthew; Sloan, Cricket A; Rosenbloom, Kate R; Roe, Greg; Rhead, Brooke; Raney, Brian J; Pohl, Andy; Malladi, Venkat S; Li, Chin H; Lee, Brian T; Learned, Katrina; Kirkup, Vanessa; Hsu, Fan; Heitner, Steve; Harte, Rachel A; Haeussler, Maximilian; Guruvadoo, Luvina; Goldman, Mary; Giardine, Belinda M; Fujita, Pauline A; Dreszer, Timothy R; Diekhans, Mark; Cline, Melissa S; Clawson, Hiram; Barber, Galt P; Haussler, David; Kent, W James

2013-01-01

The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic datasets. As of September 2012, genomic sequence and a basic set of annotation 'tracks' are provided for 63 organisms, including 26 mammals, 13 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms, yeast and sea hare. In the past year 19 new genome assemblies have been added, and we anticipate releasing another 28 in early 2013. Further, a large number of annotation tracks have been either added, updated by contributors or remapped to the latest human reference genome. Among these are an updated UCSC Genes track for human and mouse assemblies. We have also introduced several features to improve usability, including new navigation menus. This article provides an update to the UCSC Genome Browser database, which has been previously featured in the Database issue of this journal.

The Evolutionary Fates of a Large Segmental Duplication in Mouse

PubMed Central

Morgan, Andrew P.; Holt, J. Matthew; McMullan, Rachel C.; Bell, Timothy A.; Clayshulte, Amelia M.-F.; Didion, John P.; Yadgary, Liran; Thybert, David; Odom, Duncan T.; Flicek, Paul; McMillan, Leonard; de Villena, Fernando Pardo-Manuel

2016-01-01

Gene duplication and loss are major sources of genetic polymorphism in populations, and are important forces shaping the evolution of genome content and organization. We have reconstructed the origin and history of a 127-kbp segmental duplication, R2d, in the house mouse (Mus musculus). R2d contains a single protein-coding gene, Cwc22. De novo assembly of both the ancestral (R2d1) and the derived (R2d2) copies reveals that they have been subject to nonallelic gene conversion events spanning tens of kilobases. R2d2 is also a hotspot for structural variation: its diploid copy number ranges from zero in the mouse reference genome to >80 in wild mice sampled from around the globe. Hemizygosity for high copy-number alleles of R2d2 is associated in cis with meiotic drive; suppression of meiotic crossovers; and copy-number instability, with a mutation rate in excess of 1 per 100 transmissions in some laboratory populations. Our results provide a striking example of allelic diversity generated by duplication and demonstrate the value of de novo assembly in a phylogenetic context for understanding the mutational processes affecting duplicate genes. PMID:27371833
Lineage-Specific Biology Revealed by a Finished Genome Assembly of the Mouse

PubMed Central

Hillier, LaDeana W.; Zody, Michael C.; Goldstein, Steve; She, Xinwe; Bult, Carol J.; Agarwala, Richa; Cherry, Joshua L.; DiCuccio, Michael; Hlavina, Wratko; Kapustin, Yuri; Meric, Peter; Maglott, Donna; Birtle, Zoë; Marques, Ana C.; Graves, Tina; Zhou, Shiguo; Teague, Brian; Potamousis, Konstantinos; Churas, Christopher; Place, Michael; Herschleb, Jill; Runnheim, Ron; Forrest, Daniel; Amos-Landgraf, James; Schwartz, David C.; Cheng, Ze; Lindblad-Toh, Kerstin; Eichler, Evan E.; Ponting, Chris P.

2009-01-01

The mouse (Mus musculus) is the premier animal model for understanding human disease and development. Here we show that a comprehensive understanding of mouse biology is only possible with the availability of a finished, high-quality genome assembly. The finished clone-based assembly of the mouse strain C57BL/6J reported here has over 175,000 fewer gaps and over 139 Mb more of novel sequence, compared with the earlier MGSCv3 draft genome assembly. In a comprehensive analysis of this revised genome sequence, we are now able to define 20,210 protein-coding genes, over a thousand more than predicted in the human genome (19,042 genes). In addition, we identified 439 long, non–protein-coding RNAs with evidence for transcribed orthologs in human. We analyzed the complex and repetitive landscape of 267 Mb of sequence that was missing or misassembled in the previously published assembly, and we provide insights into the reasons for its resistance to sequencing and assembly by whole-genome shotgun approaches. Duplicated regions within newly assembled sequence tend to be of more recent ancestry than duplicates in the published draft, correcting our initial understanding of recent evolution on the mouse lineage. These duplicates appear to be largely composed of sequence regions containing transposable elements and duplicated protein-coding genes; of these, some may be fixed in the mouse population, but at least 40% of segmentally duplicated sequences are copy number variable even among laboratory mouse strains. Mouse lineage-specific regions contain 3,767 genes drawn mainly from rapidly-changing gene families associated with reproductive functions. The finished mouse genome assembly, therefore, greatly improves our understanding of rodent-specific biology and allows the delineation of ancestral biological functions that are shared with human from derived functions that are not. PMID:19468303
Mouse Genome Informatics (MGI): Resources for Mining Mouse Genetic, Genomic, and Biological Data in Support of Primary and Translational Research.

PubMed

Eppig, Janan T; Smith, Cynthia L; Blake, Judith A; Ringwald, Martin; Kadin, James A; Richardson, Joel E; Bult, Carol J

2017-01-01

The Mouse Genome Informatics (MGI) resource (www.informatics.jax.org) has existed for over 25 years, and over this time its data content, informatics infrastructure, and user interfaces and tools have undergone dramatic changes (Eppig et al., Mamm Genome 26:272-284, 2015). Change has been driven by scientific methodological advances, rapid improvements in computational software, growth in computer hardware capacity, and the ongoing collaborative nature of the mouse genomics community in building resources and sharing data. Here we present an overview of the current data content of MGI, describe its general organization, and provide examples using simple and complex searches, and tools for mining and retrieving sets of data.
Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man

PubMed Central

Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.

2000-01-01

The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionary conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409
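The ECR search described in this abstract can be illustrated with a sliding-window identity scan over a pairwise alignment. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `find_ecrs`, the window length, and the 70% default threshold are hypothetical, since the abstract does not give the threshold that was used.

```python
def find_ecrs(aln_a, aln_b, window=100, min_identity=0.70):
    """Return merged (start, end) alignment-column intervals (end exclusive)
    in which at least one window reaches min_identity.

    aln_a and aln_b are the two rows of a pairwise alignment:
    equal length, with '-' marking gaps. Window size and threshold
    are illustrative assumptions.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    if n < window:
        return []
    # 1 where the column holds an identical, non-gap pair; 0 otherwise.
    match = [1 if a == b and a != "-" else 0
             for a, b in zip(aln_a.upper(), aln_b.upper())]
    hits = sum(match[:window])
    regions = []
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            hits += match[start + window - 1] - match[start - 1]
        if hits / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or abuts the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Merging overlapping windows reports each conserved stretch once, which keeps the output readable when a long conserved element spans many windows.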
A comparison across non-model animals suggests an optimal sequencing depth for de novo transcriptome assembly

PubMed Central

2013-01-01

Background The lack of genomic resources can present challenges for studies of non-model organisms. Transcriptome sequencing offers an attractive method to gather information about genes and gene expression without the need for a reference genome. However, it is unclear what sequencing depth is adequate to assemble the transcriptome de novo for these purposes. Results We assembled transcriptomes of animals from six different phyla (Annelids, Arthropods, Chordates, Cnidarians, Ctenophores, and Molluscs) at regular increments of reads using Velvet/Oases and Trinity to determine how read count affects the assembly. This included an assembly of mouse heart reads because we could compare those against the reference genome that is available. We found qualitative differences in the assemblies of whole-animals versus tissues. With increasing reads, whole-animal assemblies show rapid increase of transcripts and discovery of conserved genes, while single-tissue assemblies show a slower discovery of conserved genes though the assembled transcripts were often longer. A deeper examination of the mouse assemblies shows that with more reads, assembly errors become more frequent but such errors can be mitigated with more stringent assembly parameters. Conclusions These assembly trends suggest that representative assemblies are generated with as few as 20 million reads for tissue samples and 30 million reads for whole-animals for RNA-level coverage. These depths provide a good balance between coverage and noise. Beyond 60 million reads, the discovery of new genes is low and sequencing errors of highly-expressed genes are likely to accumulate. Finally, siphonophores (polymorphic Cnidarians) are an exception and possibly require alternate assembly strategies. PMID:23496952
A comparison across non-model animals suggests an optimal sequencing depth for de novo transcriptome assembly.

PubMed

Francis, Warren R; Christianson, Lynne M; Kiko, Rainer; Powers, Meghan L; Shaner, Nathan C; Haddock, Steven H D

2013-03-12

The lack of genomic resources can present challenges for studies of non-model organisms. Transcriptome sequencing offers an attractive method to gather information about genes and gene expression without the need for a reference genome. However, it is unclear what sequencing depth is adequate to assemble the transcriptome de novo for these purposes. We assembled transcriptomes of animals from six different phyla (Annelids, Arthropods, Chordates, Cnidarians, Ctenophores, and Molluscs) at regular increments of reads using Velvet/Oases and Trinity to determine how read count affects the assembly. This included an assembly of mouse heart reads because we could compare those against the reference genome that is available. We found qualitative differences in the assemblies of whole-animals versus tissues. With increasing reads, whole-animal assemblies show rapid increase of transcripts and discovery of conserved genes, while single-tissue assemblies show a slower discovery of conserved genes though the assembled transcripts were often longer. A deeper examination of the mouse assemblies shows that with more reads, assembly errors become more frequent but such errors can be mitigated with more stringent assembly parameters. These assembly trends suggest that representative assemblies are generated with as few as 20 million reads for tissue samples and 30 million reads for whole-animals for RNA-level coverage. These depths provide a good balance between coverage and noise. Beyond 60 million reads, the discovery of new genes is low and sequencing errors of highly-expressed genes are likely to accumulate. Finally, siphonophores (polymorphic Cnidarians) are an exception and possibly require alternate assembly strategies.
Genome characterization of the selected long- and short-sleep mouse lines.

PubMed

Dowell, Robin; Odell, Aaron; Richmond, Phillip; Malmer, Daniel; Halper-Stromberg, Eitan; Bennett, Beth; Larson, Colin; Leach, Sonia; Radcliffe, Richard A

2016-12-01

The Inbred Long- and Short-Sleep (ILS, ISS) mouse lines were selected for differences in acute ethanol sensitivity using the loss of righting response (LORR) as the selection trait. The lines show an over tenfold difference in LORR and, along with a recombinant inbred panel derived from them (the LXS), have been widely used to dissect the genetic underpinnings of acute ethanol sensitivity. Here we have sequenced the genomes of the ILS and ISS to investigate the DNA variants that contribute to their sensitivity difference. We identified ~2.7 million high-confidence SNPs and small indels and ~7000 structural variants between the lines; variants were found to occur in 6382 annotated genes. Using a hidden Markov model, we were able to reconstruct the genome-wide ancestry patterns of the eight inbred progenitor strains from which the ILS and ISS were derived, and found that quantitative trait loci that have been mapped for LORR were slightly enriched for DNA variants. Finally, by mapping and quantifying RNA-seq reads from the ILS and ISS to their strain-specific genomes rather than to the reference genome, we found a substantial improvement in a differential expression analysis between the lines. This work will help in identifying and characterizing the DNA sequence variants that contribute to the difference in ethanol sensitivity between the ILS and ISS and will also aid in accurate quantification of RNA-seq data generated from the LXS RIs.
Cross-species comparison of aCGH data from mouse and human BRCA1- and BRCA2-mutated breast cancers

PubMed Central

2010-01-01

Background Genomic gains and losses are a result of genomic instability in many types of cancers. BRCA1- and BRCA2-mutated breast cancers are associated with increased amounts of chromosomal aberrations, presumably due to their functions in genome repair. Some of these genomic aberrations may harbor genes whose absence or overexpression may give rise to cellular growth advantage. So far, it has not been easy to identify the driver genes underlying gains and losses. A powerful approach to identify these driver genes could be a cross-species comparison of array comparative genomic hybridization (aCGH) data from cognate mouse and human tumors. Orthologous regions of mouse and human tumors that are commonly gained or lost might represent essential genomic regions selected for gain or loss during tumor development. Methods To identify genomic regions that are associated with BRCA1- and BRCA2-mutated breast cancers we compared aCGH data from 130 mouse Brca1Δ/Δ;p53Δ/Δ, Brca2Δ/Δ;p53Δ/Δ and p53Δ/Δ mammary tumor groups with 103 human BRCA1-mutated, BRCA2-mutated and non-hereditary breast cancers. Results Our genome-wide cross-species analysis yielded a complete collection of loci and genes that are commonly gained or lost in mouse and human breast cancer. Principal common CNAs were the well-known MYC-associated gain and RB1/INTS6-associated loss that occurred in all mouse and human tumor groups, and the AURKA-associated gain that occurred in BRCA2-related tumors from both species. However, there were also important differences between tumor profiles of both species, such as the prominent gain on chromosome 10 in mouse Brca2Δ/Δ;p53Δ/Δ tumors and the PIK3CA-associated 3q gain in human BRCA1-mutated tumors, which occurred in tumors from one species but not in tumors from the other species. This disparity in recurrent aberrations in mouse and human tumors might be due to differences in tumor cell type or genomic organization between both species.
Conclusions The selection of the oncogenome during mouse and human breast tumor development is markedly different, apart from the MYC gain and RB1-associated loss. These differences should be kept in mind when using mouse models for preclinical studies. PMID:20735817
The Mouse Lemur, a Genetic Model Organism for Primate Biology, Behavior, and Health

PubMed Central

Ezran, Camille; Karanewsky, Caitlin J.; Pendleton, Jozeph L.; Sholtz, Alex; Krasnow, Maya R.; Willick, Jason; Razafindrakoto, Andriamahery; Zohdy, Sarah; Albertelli, Megan A.; Krasnow, Mark A.

2017-01-01

Systematic genetic studies of a handful of diverse organisms over the past 50 years have transformed our understanding of biology. However, many aspects of primate biology, behavior, and disease are absent or poorly modeled in any of the current genetic model organisms including mice. We surveyed the animal kingdom to find other animals with advantages similar to mice that might better exemplify primate biology, and identified mouse lemurs (Microcebus spp.) as the outstanding candidate. Mouse lemurs are prosimian primates, roughly half the genetic distance between mice and humans. They are the smallest, fastest developing, and among the most prolific and abundant primates in the world, distributed throughout the island of Madagascar, many in separate breeding populations due to habitat destruction. Their physiology, behavior, and phylogeny have been studied for decades in laboratory colonies in Europe and in field studies in Malagasy rainforests, and a high quality reference genome sequence has recently been completed. To initiate a classical genetic approach, we developed a deep phenotyping protocol and have screened hundreds of laboratory and wild mouse lemurs for interesting phenotypes and begun mapping the underlying mutations, in collaboration with leading mouse lemur biologists. We also seek to establish a mouse lemur gene “knockout” library by sequencing the genomes of thousands of mouse lemurs to identify null alleles in most genes from the large pool of natural genetic variants. As part of this effort, we have begun a citizen science project in which students across Madagascar explore the remarkable biology around their schools, including longitudinal studies of the local mouse lemurs. We hope this work spawns a new model organism and cultivates a deep genetic understanding of primate biology and health. 
We also hope it establishes a new and ethical method of genetics that bridges biological, behavioral, medical, and conservation disciplines, while providing an example of how hands-on science education can help transform developing countries. PMID:28592502
The Mouse Lemur, a Genetic Model Organism for Primate Biology, Behavior, and Health.

PubMed

Ezran, Camille; Karanewsky, Caitlin J; Pendleton, Jozeph L; Sholtz, Alex; Krasnow, Maya R; Willick, Jason; Razafindrakoto, Andriamahery; Zohdy, Sarah; Albertelli, Megan A; Krasnow, Mark A

2017-06-01

Systematic genetic studies of a handful of diverse organisms over the past 50 years have transformed our understanding of biology. However, many aspects of primate biology, behavior, and disease are absent or poorly modeled in any of the current genetic model organisms including mice. We surveyed the animal kingdom to find other animals with advantages similar to mice that might better exemplify primate biology, and identified mouse lemurs (Microcebus spp.) as the outstanding candidate. Mouse lemurs are prosimian primates, roughly half the genetic distance between mice and humans. They are the smallest, fastest developing, and among the most prolific and abundant primates in the world, distributed throughout the island of Madagascar, many in separate breeding populations due to habitat destruction. Their physiology, behavior, and phylogeny have been studied for decades in laboratory colonies in Europe and in field studies in Malagasy rainforests, and a high quality reference genome sequence has recently been completed. To initiate a classical genetic approach, we developed a deep phenotyping protocol and have screened hundreds of laboratory and wild mouse lemurs for interesting phenotypes and begun mapping the underlying mutations, in collaboration with leading mouse lemur biologists. We also seek to establish a mouse lemur gene "knockout" library by sequencing the genomes of thousands of mouse lemurs to identify null alleles in most genes from the large pool of natural genetic variants. As part of this effort, we have begun a citizen science project in which students across Madagascar explore the remarkable biology around their schools, including longitudinal studies of the local mouse lemurs. We hope this work spawns a new model organism and cultivates a deep genetic understanding of primate biology and health.
We also hope it establishes a new and ethical method of genetics that bridges biological, behavioral, medical, and conservation disciplines, while providing an example of how hands-on science education can help transform developing countries. Copyright © 2017 by the Genetics Society of America.
Strategies and tools for whole genome alignments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas

2002-11-25

The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.
Dissection of Host Susceptibility to Bacterial Infections and Its Toxins.

PubMed

Nashef, Aysar; Agbaria, Mahmoud; Shusterman, Ariel; Lorè, Nicola Ivan; Bragonzi, Alessandra; Wiess, Ervin; Houri-Haddad, Yael; Iraqi, Fuad A

2017-01-01

Infection is one of the leading causes of human mortality and morbidity. Exposure to microbial agents is obviously required. However, non-microbial environmental and host factors also play a key role in the onset, development and outcome of infectious disease, resulting in large clinical variability between individuals in a population infected with the same microbe. Controlled and standardized investigations of the genetics of susceptibility to infectious disease are almost impossible to perform in humans, whereas mouse models allow application of powerful genomic techniques to identify and validate causative genes underlying human diseases with complex etiologies. Most current animal models used in genetic mapping of complex trait diseases have limited genetic diversity. This limitation impedes the ability to create an integrated network using genetic interactions, epigenetics, environmental factors, microbiota, and other phenotypes. A novel mouse genetic reference population (GRP) for high-resolution mapping and subsequent identification of genes underlying QTL, namely the Collaborative Cross (CC), was recently developed. In this chapter, we discuss a variety of approaches using CC mice for mapping genes underlying quantitative trait loci (QTL) to dissect the host response to polygenic traits, including infectious disease caused by bacterial agents and their toxins.
C57BL/6N mutation in Cytoplasmic FMR interacting protein 2 regulates cocaine response

PubMed Central

Kumar, Vivek; Kim, Kyungin; Joseph, Chryshanthi; Kourrich, Saïd; Yoo, Seung Hee; Huang, Hung Chung; Vitaterna, Martha H.; de Villena, Fernando Pardo-Manuel; Churchill, Gary; Bonci, Antonello; Takahashi, Joseph S.

2015-01-01

The inbred mouse C57BL/6J is the reference strain for genome sequence and for most behavioral and physiological phenotypes. However the International Knockout Mouse Consortium uses an embryonic stem cell line derived from a related C57BL/6N substrain. We found that C57BL/6N has lower acute and sensitized response to cocaine and methamphetamine. We mapped a single causative locus and identified a non-synonymous mutation of serine to phenylalanine (S968F) in Cytoplasmic FMR interacting protein 2 (Cyfip2) as the causative variant. The S968F mutation destabilizes CYFIP2 and deletion of the C57BL/6N mutant allele leads to acute and sensitized cocaine response phenotypes. We propose CYFIP2 is a key regulator of cocaine response in mammals and present a framework to utilize mouse substrains to discover novel genes and alleles regulating behavior. PMID:24357318
Cyberinfrastructure for the digital brain: spatial standards for integrating rodent brain atlases

PubMed Central

Zaslavsky, Ilya; Baldock, Richard A.; Boline, Jyl

2014-01-01

Biomedical research entails capture and analysis of massive data volumes and new discoveries arise from data-integration and mining. This is only possible if data can be mapped onto a common framework such as the genome for genomic data. In neuroscience, the framework is intrinsically spatial and based on a number of paper atlases. This cannot meet today's data-intensive analysis and integration challenges. A scalable and extensible software infrastructure that is standards based but open for novel data and resources, is required for integrating information such as signal distributions, gene-expression, neuronal connectivity, electrophysiology, anatomy, and developmental processes. Therefore, the International Neuroinformatics Coordinating Facility (INCF) initiated the development of a spatial framework for neuroscience data integration with an associated Digital Atlasing Infrastructure (DAI). A prototype implementation of this infrastructure for the rodent brain is reported here. The infrastructure is based on a collection of reference spaces to which data is mapped at the required resolution, such as the Waxholm Space (WHS), a 3D reconstruction of the brain generated using high-resolution, multi-channel microMRI. The core standards of the digital atlasing service-oriented infrastructure include Waxholm Markup Language (WaxML): XML schema expressing a uniform information model for key elements such as coordinate systems, transformations, points of interest (POI)s, labels, and annotations; and Atlas Web Services: interfaces for querying and updating atlas data. The services return WaxML-encoded documents with information about capabilities, spatial reference systems (SRSs) and structures, and execute coordinate transformations and POI-based requests. Key elements of INCF-DAI cyberinfrastructure have been prototyped for both mouse and rat brain atlas sources, including the Allen Mouse Brain Atlas, UCSD Cell-Centered Database, and Edinburgh Mouse Atlas Project. 
PMID:25309417
Cyberinfrastructure for the digital brain: spatial standards for integrating rodent brain atlases.

PubMed

Zaslavsky, Ilya; Baldock, Richard A; Boline, Jyl

2014-01-01

Biomedical research entails capture and analysis of massive data volumes and new discoveries arise from data-integration and mining. This is only possible if data can be mapped onto a common framework such as the genome for genomic data. In neuroscience, the framework is intrinsically spatial and based on a number of paper atlases. This cannot meet today's data-intensive analysis and integration challenges. A scalable and extensible software infrastructure that is standards based but open for novel data and resources, is required for integrating information such as signal distributions, gene-expression, neuronal connectivity, electrophysiology, anatomy, and developmental processes. Therefore, the International Neuroinformatics Coordinating Facility (INCF) initiated the development of a spatial framework for neuroscience data integration with an associated Digital Atlasing Infrastructure (DAI). A prototype implementation of this infrastructure for the rodent brain is reported here. The infrastructure is based on a collection of reference spaces to which data is mapped at the required resolution, such as the Waxholm Space (WHS), a 3D reconstruction of the brain generated using high-resolution, multi-channel microMRI. The core standards of the digital atlasing service-oriented infrastructure include Waxholm Markup Language (WaxML): XML schema expressing a uniform information model for key elements such as coordinate systems, transformations, points of interest (POI)s, labels, and annotations; and Atlas Web Services: interfaces for querying and updating atlas data. The services return WaxML-encoded documents with information about capabilities, spatial reference systems (SRSs) and structures, and execute coordinate transformations and POI-based requests. Key elements of INCF-DAI cyberinfrastructure have been prototyped for both mouse and rat brain atlas sources, including the Allen Mouse Brain Atlas, UCSD Cell-Centered Database, and Edinburgh Mouse Atlas Project.
Generation of Knock-in Mouse by Genome Editing.

PubMed

Fujii, Wataru

2017-01-01

Knock-in mice are useful for evaluating endogenous gene expressions and functions in vivo. Instead of the conventional gene-targeting method using embryonic stem cells, an exogenous DNA sequence can be inserted into the target locus in the zygote using genome editing technology. In this chapter, I describe the generation of epitope-tagged mice using engineered endonuclease and single-stranded oligodeoxynucleotide through the mouse zygote as an example of how to generate a knock-in mouse by genome editing.
Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data

PubMed Central

Kosugi, Shunichi; Natsume, Satoshi; Yoshida, Kentaro; MacLean, Daniel; Cano, Liliana; Kamoun, Sophien; Terauchi, Ryohei

2013-01-01

Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/. PMID:24116042
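The general idea of discarding reads that carry too many mismatches against the reference can be sketched as follows. This is a simplified illustration, not Coval's implementation: it assumes ungapped alignments, ignores base qualities and allele frequencies, and the function names and the default limit of two mismatches are hypothetical.

```python
def count_mismatches(read, ref, pos):
    """Count mismatches for an ungapped alignment of `read` at 0-based `pos` in `ref`.

    Positions where either base is 'N' are not counted (an assumption
    made for this sketch).
    """
    segment = ref[pos:pos + len(read)]
    if len(segment) < len(read):
        raise ValueError("read extends past the end of the reference")
    return sum(1 for r, s in zip(read.upper(), segment.upper())
               if r != s and r != "N" and s != "N")


def filter_alignments(alignments, ref, max_mismatches=2):
    """Keep (name, pos, read) tuples with at most `max_mismatches` mismatches."""
    return [aln for aln in alignments
            if count_mismatches(aln[2], ref, aln[1]) <= max_mismatches]


# Toy data: r2 carries three mismatches and is dropped.
reference = "ACGTACGTACGT"
reads = [("r1", 0, "ACGTAC"), ("r2", 2, "GTTTTT"), ("r3", 4, "ACGAAC")]
kept = filter_alignments(reads, reference)
```

A real pipeline would read alignments from SAM/BAM records and account for indels; the fixed threshold here only shows the filtering step in isolation.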
Fine-scale maps of recombination rates and hotspots in the mouse genome.

PubMed

Brunschwig, Hadassa; Levi, Liat; Ben-David, Eyal; Williams, Robert W; Yakir, Benjamin; Shifman, Sagiv

2012-07-01

Recombination events are not uniformly distributed and often cluster in narrow regions known as recombination hotspots. Several studies using different approaches have dramatically advanced our understanding of recombination hotspot regulation. Population genetic data have been used to map and quantify hotspots in the human genome. Genetic variation in recombination rates and hotspots usage have been explored in human pedigrees, mouse intercrosses, and by sperm typing. These studies pointed to the central role of the PRDM9 gene in hotspot modulation. In this study, we used single nucleotide polymorphisms (SNPs) from whole-genome resequencing and genotyping studies of mouse inbred strains to estimate recombination rates across the mouse genome and identified 47,068 historical hotspots--an average of over 2477 per chromosome. We show by simulation that inbred mouse strains can be used to identify positions of historical hotspots. Recombination hotspots were found to be enriched for the predicted binding sequences for different alleles of the PRDM9 protein. Recombination rates were on average lower near transcription start sites (TSS). Comparing the inferred historical recombination hotspots with the recent genome-wide mapping of double-strand breaks (DSBs) in mouse sperm revealed a significant overlap, especially toward the telomeres. Our results suggest that inbred strains can be used to characterize and study the dynamics of historical recombination hotspots. They also strengthen previous findings on mouse recombination hotspots, and specifically the impact of sequence variants in Prdm9.
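The per-chromosome average quoted in this abstract can be checked with a short calculation. The divisor is an assumption: the abstract does not say which chromosomes were counted, and the figure of "over 2477" is reproduced only if the total is divided across the 19 mouse autosomes.

```python
total_hotspots = 47_068  # reported in the abstract
autosomes = 19           # assumption: the 19 mouse autosomes, sex chromosomes excluded

mean_per_chromosome = total_hotspots / autosomes
print(f"{mean_per_chromosome:.1f}")  # prints 2477.3

# For comparison, including chromosome X (20 chromosomes) gives a lower mean.
mean_with_x = total_hotspots / 20
print(f"{mean_with_x:.1f}")  # prints 2353.4
```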
FULL-GENOME ANALYSIS OF ALTERNATIVE SPLICING IN MOUSE LIVER AFTER HEPATOTOXICANT EXPOSURE

EPA Science Inventory

Alternative splicing plays a role in determining gene function and protein diversity. We have employed whole genome exon profiling using Affymetrix Mouse Exon 1.0 ST arrays to understand the significance of alternative splicing on a genome-wide scale in response to multiple toxic...
Efficient analysis of mouse genome sequences reveal many nonsense variants

PubMed Central

Steeland, Sophie; Timmermans, Steven; Van Ryckeghem, Sara; Hulpiau, Paco; Saeys, Yvan; Van Montagu, Marc; Vandenbroucke, Roosmarijn E.; Libert, Claude

2016-01-01

Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting gene variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1. PMID:27147605
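The in silico step described in this abstract (substitute strain-specific bases into the reference coding sequence, translate both versions, and compare the proteins) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' tool: it handles SNPs only (no indels), assumes sequences contain only A, C, G and T, uses the standard genetic code, and all function names are hypothetical.

```python
# Standard genetic code, built in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO))


def translate(cds):
    """Translate a coding sequence up to and including the first stop codon ('*')."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def apply_snps(cds, snps):
    """Return `cds` with substitutions applied; `snps` maps 0-based position -> alternate base."""
    seq = list(cds)
    for pos, alt in snps.items():
        seq[pos] = alt
    return "".join(seq)


def gained_stop(ref_cds, snps):
    """Return the 1-based codon index of a premature stop created by the SNPs, or None."""
    ref_protein = translate(ref_cds)
    alt_protein = translate(apply_snps(ref_cds, snps))
    if alt_protein.endswith("*") and len(alt_protein) < len(ref_protein):
        return len(alt_protein)
    return None


# Toy reference CDS: ATG CAA GGA TGG TAA -> M Q G W *
ref = "ATGCAAGGATGGTAA"
nonsense = gained_stop(ref, {3: "T"})    # CAA -> TAA creates a stop at codon 2
synonymous = gained_stop(ref, {8: "C"})  # GGA -> GGC, still glycine
```

Comparing translated proteins, instead of inspecting each SNP in isolation, also catches cases where two variants in the same codon combine to create a stop.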

A vast genomic deletion in the C56BL/6 genome affects different genes within the Ifi200 cluster on chromosome 1 and mediates obesity and insulin resistance.

PubMed

Vogel, Heike; Jähnert, Markus; Stadion, Mandy; Matzke, Daniela; Scherneck, Stephan; Schürmann, Annette

2017-02-15

Obesity, the excessive accumulation of body fat, is a highly heritable and genetically heterogeneous disorder. The complex, polygenic basis of the disease, consisting of a network of different gene variants, is still not completely known. In the current study we generated a BAC library of the obesity-prone NZO strain to clarify the genomic alteration within the gene cluster Ifi200 on chr.1 including Ifi202b, an obesity gene that, in contrast to NZO, is not expressed in the lean B6 mouse. With the PacBio sequencing data of NZO BAC clones we identified a deletion spanning approximately 261.8 kb in the B6 reference genome. The deletion affects different members of the Ifi200 gene family and also includes the original first exon and 5'-regulatory parts of the Ifi202b gene, which suggests that it is the relevant cause of the gene's expression deficiency in B6. In addition, the generation and characterization of congenic mice carrying the critical fragment on the B6 background demonstrate its crucial role for obesity and insulin resistance. Our data reveal the reconstruction of a complex genomic region on mouse chr.1 resulting from deletions and duplications of Ifi200 genes and suggest that this region is relevant for the development of obesity. The results further demonstrate the complexity of the disease and highlight the importance of studying rare genetic variants, as they can be causal for large effects.
Identification of an active ID-like group of SINEs in the mouse

PubMed Central

Kass, David H; Jamison, Nicole

2007-01-01

The mouse genome consists of five known families of SINEs: B1, B2, B4/RSINE, ID, and MIR. Using RT-PCR we identified a germ-line transcript that demonstrates 92.7% sequence identity to ID (excluding primer sequence), yet a BLAST search identified numerous matches of 100% sequence identity. We analyzed four of these elements for their presence in orthologous genes in strains and subspecies of M. musculus as well as other species of Mus using a PCR-based assay. All four analyzed elements were identified either only in M. musculus or exclusively in both M. musculus and M. domesticus, indicative of recent integrations. In conjunction with the identification of transcripts, we present an active ID-like group of elements that is not derived from the proposed BC1 master gene of ID elements. A BLAST of the rat genome indicated that these elements were not in the rat. Therefore, this family of SINEs has recently evolved, and since it has thus far been observed mainly in M. musculus, we refer to this family as MMIDL. PMID:17572061
Identification of an active ID-like group of SINEs in the mouse.

PubMed

Kass, David H; Jamison, Nicole

2007-09-01

The mouse genome consists of five known families of SINEs: B1, B2, B4/RSINE, ID, and MIR. Using RT-PCR we identified a germ-line transcript that demonstrates 92.7% sequence identity to ID (excluding primer sequence), yet a BLAST search identified numerous matches of 100% sequence identity. We analyzed four of these elements for their presence in orthologous genes in strains and subspecies of Mus musculus as well as other species of Mus using a PCR-based assay. All four analyzed elements were identified either only in M. musculus or exclusively in both M. musculus and M. domesticus, indicative of recent integrations. In conjunction with the identification of transcripts, we present an active ID-like group of elements that is not derived from the proposed BC1 master gene of ID elements. A BLAST of the rat genome indicated that these elements were not in the rat. Therefore, this family of SINEs has recently evolved, and since it has thus far been observed mainly in M. musculus, we refer to this family as MMIDL.
Ensembl regulation resources

PubMed Central

Zerbino, Daniel R.; Johnson, Nathan; Juetteman, Thomas; Sheppard, Dan; Wilder, Steven P.; Lavidas, Ilias; Nuhn, Michael; Perry, Emily; Raffaillac-Desfosses, Quentin; Sobral, Daniel; Keefe, Damian; Gräf, Stefan; Ahmed, Ikhlak; Kinsella, Rhoda; Pritchard, Bethan; Brent, Simon; Amode, Ridwan; Parker, Anne; Trevanion, Steven; Birney, Ewan; Dunham, Ian; Flicek, Paul

2016-01-01

New experimental techniques in epigenomics allow researchers to assay a diversity of highly dynamic features such as histone marks, DNA modifications or chromatin structure. The study of their fluctuations should provide insights into gene expression regulation, cell differentiation and disease. The Ensembl project collects and maintains the Ensembl regulation data resources on epigenetic marks, transcription factor binding and DNA methylation for human and mouse, as well as microarray probe mappings and annotations for a variety of chordate genomes. From this data, we produce a functional annotation of the regulatory elements along the human and mouse genomes with plans to expand to other species as data becomes available. Starting from well-studied cell lines, we will progressively expand our library of measurements to a greater variety of samples. Ensembl’s regulation resources provide a central and easy-to-query repository for reference epigenomes. As with all Ensembl data, it is freely available at http://www.ensembl.org, from the Perl and REST APIs and from the public Ensembl MySQL database server at ensembldb.ensembl.org. Database URL: http://www.ensembl.org PMID:26888907
A critical assessment of Mus musculus gene function prediction using integrated genomic evidence

PubMed Central

Peña-Castillo, Lourdes; Tasan, Murat; Myers, Chad L; Lee, Hyunju; Joshi, Trupti; Zhang, Chao; Guan, Yuanfang; Leone, Michele; Pagnani, Andrea; Kim, Wan Kyu; Krumpelman, Chase; Tian, Weidong; Obozinski, Guillaume; Qi, Yanjun; Mostafavi, Sara; Lin, Guan Ning; Berriz, Gabriel F; Gibbons, Francis D; Lanckriet, Gert; Qiu, Jian; Grant, Charles; Barutcuoglu, Zafer; Hill, David P; Warde-Farley, David; Grouios, Chris; Ray, Debajyoti; Blake, Judith A; Deng, Minghua; Jordan, Michael I; Noble, William S; Morris, Quaid; Klein-Seetharaman, Judith; Bar-Joseph, Ziv; Chen, Ting; Sun, Fengzhu; Troyanskaya, Olga G; Marcotte, Edward M; Xu, Dong; Hughes, Timothy R; Roth, Frederick P

2008-01-01

Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated. Results: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%. Conclusion: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allows predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized. PMID:18613946
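The "precision at a recall rate of 20%" figure quoted above can be made concrete with a small sketch. It assumes predictions for a single GO term are given as (score, is_true) pairs; the function name and the toy data are illustrative and do not come from the study.

```python
def precision_at_recall(scored_labels, target_recall):
    """Precision of the shortest top-ranked prefix whose recall reaches target_recall.

    scored_labels: iterable of (score, is_true) pairs for one GO term.
    """
    ranked = sorted(scored_labels, key=lambda p: p[0], reverse=True)
    total_pos = sum(1 for _, is_true in ranked if is_true)
    if total_pos == 0:
        raise ValueError("no positive labels; recall is undefined")
    true_pos = 0
    for n, (_, is_true) in enumerate(ranked, start=1):
        if is_true:
            true_pos += 1
        if true_pos / total_pos >= target_recall:
            return true_pos / n
    # Only reached if target_recall exceeds 1.0.
    return true_pos / len(ranked)


# Hypothetical ranked predictions for one term: 5 true annotations among 7 genes.
preds = [(0.9, True), (0.8, False), (0.7, True), (0.6, False),
         (0.5, True), (0.4, True), (0.3, True)]
```

With these toy scores, 20% recall is reached after the first prediction, while 40% recall requires the top three, one of which is a false positive.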
Manipulation of the mouse genome: a multiple impact resource for drug discovery and development.

PubMed

Prosser, Haydn; Rastan, Sohaila

2003-05-01

Few would deny that the pharmaceutical industry's investment in genomics throughout the 1990s has yet to deliver in terms of drugs on the market. The reasons are complex and beyond the scope of this review. The unique ability to manipulate the mouse genome, however, has already had a positive impact on all stages of the drug discovery process and, increasingly, on the drug development process too. We give an overview of some recent applications of so-called 'transgenic' mouse technology in pharmaceutical research and development. We show how genetic manipulation in the mouse can be employed at multiple points in the drug discovery and development process, providing new solutions to old problems.
Chip Based Magnetic Imager for Molecular Profiling of Ovarian Cancer Cells

DTIC Science & Technology

2016-12-01

2015) Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell 160:1246-1260. PMC4380877, PMID:25748654. Acknowledgement of... Weissleder R, Lee H, Zhang F, Sharp PA (2015) Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell 160:1246-1260. 5. Im H, Shao H... Lett 32(10):1229–1231.
18th International Mouse Genome Conference

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lossie, Amy C.; Meehan, Thomas P.; Castillo, Andrew

2005-07-01

The 18th International Mouse Genome Conference was held in Seattle, WA, US on October 18-22, 2004. The meeting was partially supported by the Department of Energy, Grant No. DE-FG02-04ER63851. Abstracts can be seen at imgs.org, and the summary of the meeting was published in Mammalian Genome, Vol 16, Number 7, Pages 471-475.
Number and location of mouse mammary tumor virus proviral DNA in mouse DNA of normal tissue and of mammary tumors.

PubMed Central

Groner, B; Hynes, N E

1980-01-01

The Southern DNA filter transfer technique was used to characterize the genomic location of the mouse mammary tumor proviral DNA in different inbred strains of mice. Two of the strains (C3H and CBA) arose from a cross of a Bagg albino (BALB/c) mouse and a DBA mouse. The mouse mammary tumor virus-containing restriction enzyme DNA fragments of these strains had similar patterns, suggesting that the proviruses of these mice are in similar genomic locations. Conversely, the pattern arising from the DNA of the GR mouse, a strain genetically unrelated to the others, appeared different, suggesting that its mouse mammary tumor proviruses are located in different genomic sites. The structure of another gene, that coding for beta-globin, was also compared. The mouse strains which we studied can be categorized into two classes, expressing either one or two beta-globin proteins. The macroenvironment of the beta-globin gene appeared similar among the mouse strains belonging to one genetic class. Female mice of the C3H strain exogenously transmit mouse mammary tumor virus via the milk, and their offspring have a high incidence of mammary tumor occurrence. DNA isolated from individual mammary tumors taken from C3H mice or from BALB/c mice foster nursed on C3H mothers was analyzed by the DNA filter transfer technique. Additional mouse mammary tumor virus-containing fragments were found in the DNA isolated from each mammary tumor. These proviral sequences were integrated into different genomic sites in each tumor. PMID:6245257
Human, Mouse, and Rat Genome Large-Scale Rearrangements: Stability Versus Speciation

PubMed Central

Zhao, Shaying; Shetty, Jyoti; Hou, Lihua; Delcher, Arthur; Zhu, Baoli; Osoegawa, Kazutoyo; de Jong, Pieter; Nierman, William C.; Strausberg, Robert L.; Fraser, Claire M.

2004-01-01

Using paired-end sequences from bacterial artificial chromosomes, we have constructed high-resolution synteny and rearrangement breakpoint maps among human, mouse, and rat genomes. Among the >300 syntenic blocks identified are segments of over 40 Mb without any detected interspecies rearrangements, as well as regions with frequently broken synteny and extensive rearrangements. As closely related species, mouse and rat share the majority of the breakpoints and often have the same types of rearrangements when compared with the human genome. However, the breakpoints not shared between them indicate that mouse rearrangements are more often interchromosomal, whereas intrachromosomal rearrangements are more prominent in rat. Centromeres may have played a significant role in reorganizing a number of chromosomes in all three species. The comparison of the three species indicates that genome rearrangements follow a path that accommodates a delicate balance between maintaining a basic structure underlying all mammalian species and permitting variations that are necessary for speciation. PMID:15364903
Comparative analysis of genome maintenance genes in naked mole rat, mouse, and human.

PubMed

MacRae, Sheila L; Zhang, Quanwei; Lemetre, Christophe; Seim, Inge; Calder, Robert B; Hoeijmakers, Jan; Suh, Yousin; Gladyshev, Vadim N; Seluanov, Andrei; Gorbunova, Vera; Vijg, Jan; Zhang, Zhengdong D

2015-04-01

Genome maintenance (GM) is an essential defense system against aging and cancer, as both are characterized by increased genome instability. Here, we compared the copy number variation and mutation rate of 518 GM-associated genes in the naked mole rat (NMR), mouse, and human genomes. GM genes appeared to be strongly conserved, with copy number variation in only four genes. Interestingly, we found NMR to have a higher copy number of CEBPG, a regulator of DNA repair, and TINF2, a protector of telomere integrity. NMR, as well as human, was also found to have a lower rate of germline nucleotide substitution than the mouse. Together, the data suggest that the long-lived NMR, as well as human, has more robust GM than mouse and identifies new targets for the analysis of the exceptional longevity of the NMR. © 2015 The Authors. Aging Cell published by the Anatomical Society and John Wiley & Sons Ltd.
Initial sequence and comparative analysis of the cat genome

PubMed Central

Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

2007-01-01

The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172
Partnering for functional genomics research conference: Abstracts of poster presentations

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

1998-06-01

This report contains abstracts of poster presentations presented at the Functional Genomics Research Conference held April 16-17, 1998 in Oak Ridge, Tennessee. Attention is focused on the following areas: mouse mutagenesis and genomics; phenotype screening; gene expression analysis; DNA analysis technology development; bioinformatics; comparative analyses of mouse, human, and yeast sequences; and pilot projects to evaluate methodologies.
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots.

PubMed

Wu, Min; Kwoh, Chee-Keong; Przytycka, Teresa M; Li, Jing; Zheng, Jie

2012-06-21

The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots was identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, and is thus a trans-acting regulator of recombination hotspots. However, this pair of cis- and trans-regulators covers only a fraction of hotspots, so other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach to newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis we observed that the top genes are enriched with the function of histone modification, highlighting the epigenetic regulatory mechanisms of recombination hotspots.
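One common way to quantify "enrichment of binding sites in hotspots" is a fold enrichment together with a hypergeometric upper-tail probability. The abstract does not say which statistic the authors used, so the sketch below is an assumption throughout; the function names and the counts are hypothetical.

```python
from math import comb


def fold_enrichment(k, K, n, N):
    """Fraction of hotspots with a binding site divided by the fraction over all regions.

    k: hotspot regions containing a site    n: total hotspot regions
    K: all regions containing a site        N: total regions considered
    """
    return (k / n) / (K / N)


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n regions are drawn at random from N, K of which carry a site."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


# Hypothetical counts: 20 regions, 8 carry a site, 5 are hotspots, 4 of those carry a site.
fe = fold_enrichment(4, 8, 5, 20)
p = hypergeom_upper_tail(4, 8, 5, 20)
```

Ranking DNA-binding proteins by such a score would give a candidate list of trans-regulators; any real analysis would also need multiple-testing correction, which is omitted here.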
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots

PubMed Central

2012-01-01

The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots was identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, and is thus a trans-acting regulator of recombination hotspots. However, this pair of cis- and trans-regulators covers only a fraction of hotspots, so other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach to newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis we observed that the top genes are enriched with the function of histone modification, highlighting the epigenetic regulatory mechanisms of recombination hotspots. PMID:22759569
Sequencing, Annotation and Analysis of the Syrian Hamster (Mesocricetus auratus) Transcriptome

PubMed Central

Tchitchek, Nicolas; Safronetz, David; Rasmussen, Angela L.; Martens, Craig; Virtaneva, Kimmo; Porcella, Stephen F.; Feldmann, Heinz

2014-01-01

Background: The Syrian hamster (golden hamster, Mesocricetus auratus) is gaining importance as a new experimental animal model for multiple pathogens, including emerging zoonotic diseases such as Ebola. Nevertheless there are currently no publicly available transcriptome reference sequences or genome for this species. Results: A cDNA library derived from mRNA and snRNA isolated and pooled from the brains, lungs, spleens, kidneys, livers, and hearts of three adult female Syrian hamsters was sequenced. Sequence reads were assembled into 62,482 contigs and 111,796 reads remained unassembled (singletons). This combined contig/singleton dataset, designated as the Syrian hamster transcriptome, represents a total of 60,117,204 nucleotides. Our Mesocricetus auratus Syrian hamster transcriptome mapped to 11,648 mouse transcripts representing 9,562 distinct genes, and mapped to a similar number of transcripts and genes in the rat. We identified 214 quasi-complete transcripts based on mouse annotations. Canonical pathways involved in a broad spectrum of fundamental biological processes were significantly represented in the library. The Syrian hamster transcriptome was aligned to the current release of the Chinese hamster ovary (CHO) cell transcriptome and genome to improve the genomic annotation of this species. Finally, our Syrian hamster transcriptome was aligned against 14 other rodents, primate and laurasiatheria species to gain insights about the genetic relatedness and placement of this species. Conclusions: This Syrian hamster transcriptome dataset significantly improves our knowledge of the Syrian hamster's transcriptome, especially towards its future use in infectious disease research. Moreover, this library is an important resource for the wider scientific community to help improve genome annotation of the Syrian hamster and other closely related species. Furthermore, these data provide the basis for development of expression microarrays that can be used in functional genomics studies. PMID:25398096
A Comparative Encyclopedia of DNA Elements in the Mouse Genome

PubMed Central

Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D.; Shen, Yin; Pervouchine, Dmitri D.; Djebali, Sarah; Thurman, Bob; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K.; Williams, Brian A.; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M. A.; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T.; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Keller, Cheryl A.; Morrissey, Christapher S.; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S.; Cayting, Philip; Kawli, Trupti; Boyle, Alan P.; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S.; Cline, Melissa S.; Erickson, Drew T.; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A.; Rosenbloom, Kate R.; de Sousa, Beatriz Lacerda; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W. James; Santos, Miguel Ramalho; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J.; Wilken, Matthew S.; Reh, Thomas A.; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P.; Neph, Shane; Humbert, Richard; Hansen, R. Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E.; Orkin, Stuart H.; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J.; Blobel, Gerd A.; Good, Peter J.; Lowdon, Rebecca F.; Adams, Leslie B.; Zhou, Xiao-Qiao; Pazin, Michael J.; Feingold, Elise A.; Wold, Barbara; Taylor, James; Kellis, Manolis; Mortazavi, Ali; Weissman, Sherman M.; Stamatoyannopoulos, John; Snyder, Michael P.; Guigo, Roderic; Gingeras, Thomas R.; Gilbert, David M.; Hardison, Ross C.; Beer, Michael A.; Ren, Bing

2014-01-01

Summary: As the premier model organism in biomedical research, the laboratory mouse shares the majority of protein-coding genes with humans, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications, and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of other sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases. PMID:25409824
A comparative encyclopedia of DNA elements in the mouse genome.

PubMed

Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D; Shen, Yin; Pervouchine, Dmitri D; Djebali, Sarah; Thurman, Robert E; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K; Williams, Brian A; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M A; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D; Bansal, Mukul S; Kellis, Manolis; Keller, Cheryl A; Morrissey, Christapher S; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S; Cayting, Philip; Kawli, Trupti; Boyle, Alan P; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S; Cline, Melissa S; Erickson, Drew T; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A; Rosenbloom, Kate R; Lacerda de Sousa, Beatriz; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W James; Ramalho Santos, Miguel; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J; Wilken, Matthew S; Reh, Thomas A; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P; Neph, Shane; Humbert, Richard; Hansen, R Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E; Orkin, Stuart H; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J; Blobel, Gerd A; Cao, Xiaoyi; Zhong, Sheng; Wang, Ting; Good, Peter J; Lowdon, Rebecca F; Adams, Leslie B; Zhou, Xiao-Qiao; Pazin, Michael J; Feingold, Elise A; Wold, Barbara; Taylor, James; Mortazavi, Ali; Weissman, Sherman M; Stamatoyannopoulos, John A; Snyder, Michael P; Guigo, Roderic; Gingeras, Thomas R; Gilbert, David M; Hardison, Ross C; Beer, Michael A; Ren, Bing

2014-11-20

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
Motif mismatches in microsatellites: insights from genome-wide investigation among 20 insect species.

PubMed

Behura, Susanta K; Severson, David W

2015-02-01

We present a detailed genome-wide comparative study of motif mismatches of microsatellites among 20 insect species representing five taxonomic orders. The results show that varying proportions (∼15-46%) of microsatellites identified in these species are imperfect in motif structure, and that they also vary in chromosomal distribution within genomes. It was observed that the genomic abundance of imperfect repeats is significantly associated with the length and number of motif mismatches of microsatellites. Furthermore, microsatellites with a higher number of mismatches tend to have lower abundance in the genome, suggesting that sequence heterogeneity of repeat motifs is a key determinant of genomic abundance of microsatellites. This relationship seems to be a general feature of microsatellites even in unrelated species such as yeast, roundworm, mouse and human. We provide a mechanistic explanation of the evolutionary link between motif heterogeneity and genomic abundance of microsatellites by examining the patterns of motif mismatches and allele sequences of single-nucleotide polymorphisms identified within microsatellite loci. Using Drosophila Reference Genetic Panel data, we further show that pattern of allelic variation modulates motif heterogeneity of microsatellites, and provide estimates of allele age of specific imperfect microsatellites found within protein-coding genes. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
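To make "motif mismatches" concrete, the sketch below counts positions where a repeat tract departs from a perfect tiling of its motif. It assumes substitutions only and a tract that starts in phase with the motif; the function names and example tracts are illustrative and are not taken from the study.

```python
def motif_mismatches(tract, motif):
    """Count positions where the tract differs from a perfect tiling of the motif."""
    if not motif:
        raise ValueError("motif must be non-empty")
    return sum(
        1 for i, base in enumerate(tract) if base != motif[i % len(motif)]
    )


def is_imperfect(tract, motif):
    """A microsatellite is called imperfect here if it has at least one mismatch."""
    return motif_mismatches(tract, motif) > 0


# Hypothetical dinucleotide repeats built on the motif "AC".
perfect = "ACACACAC"
imperfect = "ACACGCAC"  # one substitution at position 4 (0-based)
```

Tabulating tract length against mismatch count across a genome would then allow the kind of association with genomic abundance that the abstract reports; indels within repeats would need an alignment-based count instead.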
YAC cloning Mus musculus telomeric DNA: physical, genetic, in situ and STS markers for the distal telomere of chromosome 10.

PubMed

Kipling, D; Wilson, H E; Thomson, E J; Cooke, H J

1995-06-01

Three Mus musculus DBA/2 YAC libraries were constructed using a half-YAC telomere cloning vector. This functional complementation approach yields libraries which include terminal restriction fragments of the mouse genome. Screening all three libraries led to the isolation of 32 independent clones which carry linear YACs containing the mouse terminal repeat sequence, (TTAGGG)n. These YACs provide a resource to isolate regions of the mouse genome close to chromosome termini and excluded from existing conventional YAC libraries. To demonstrate their utility, a hybridization probe was isolated from Mtel-1, the first (TTAGGG)n-containing YAC isolated. This probe detects an approximately 70 kb KpnI fragment in the mouse genome which is sensitive to pretreatment with BAL31 exonuclease. A PCR-based genetic marker generated from the sequence of this probe maps 4.4 cM from the most distal anchor locus on chromosome 10 in the EUCIB interspecific backcross. STS primers for this locus, D10Hgu1, were used to isolate YAC 110F4 from a commercially available mouse YAC library. Fluorescence in situ hybridization demonstrates that YAC 110F4 hybridizes to the distal telomere of chromosome 10. Clones in this collection of telomere YACs therefore partially overlap clones in conventional YAC libraries, and thus the previously unavailable terminal regions of the mouse genome can now be linked with the developing mouse STS YAC contig. Genetic markers such as D10Hgu1 allow the ends of the mouse genetic map to be defined, thus closing the map.

Recombination rate variation in mice from an isolated island

PubMed Central

Wang, Richard J.; Gray, Melissa M.; Parmenter, Michelle D.; Broman, Karl W.; Payseur, Bret A.

2016-01-01

Recombination rate is a heritable trait that varies among individuals. Despite the major impact of recombination rate on patterns of genetic diversity and the efficacy of selection, natural variation in this phenotype remains poorly characterized. We present a comparison of genetic maps, sampling 1,212 meioses, from a unique population of wild house mice (Mus musculus domesticus) that recently colonized remote Gough Island. Crosses to a mainland reference strain (WSB/EiJ) reveal pervasive variation in recombination rate among Gough Island mice, including sub-chromosomal intervals spanning up to 28% of the genome. In spite of this high level of polymorphism, the genome-wide recombination rate does not significantly vary. In general, we find that recombination rate varies more when measured in smaller genomic intervals. Using the current standard genetic map of the laboratory mouse to polarize intervals with divergent recombination rates, we infer that the majority of evolutionary change occurred in one of the two tested lines of Gough Island mice. Our results confirm that natural populations harbor a high level of recombination rate polymorphism and highlight the disparities in recombination rate evolution across genomic scales. PMID:27864900
Towards precision medicine-based therapies for glioblastoma: interrogating human disease genomics and mouse phenotypes.

PubMed

Chen, Yang; Gao, Zhen; Wang, Bingcheng; Xu, Rong

2016-08-22

Glioblastoma (GBM) is the most common and aggressive brain tumor. It has a poor prognosis even with optimal radio- and chemotherapies. Since GBM is highly heterogeneous, drugs that target the specific molecular profiles of individual tumors may achieve maximized efficacy. Currently, the Cancer Genome Atlas (TCGA) projects have identified hundreds of GBM-associated genes. We develop a drug repositioning approach combining disease genomics and mouse phenotype data towards predicting targeted therapies for GBM. We first identified disease-specific mouse phenotypes using the most recently discovered GBM genes. Then we systematically searched all FDA-approved drugs for candidates that share similar mouse phenotype profiles with GBM. We evaluated the ranks for approved and novel GBM drugs, and compared them with an existing approach, which also uses the mouse phenotype data but not the disease genomics data. We achieved significantly higher ranks for the approved and novel GBM drugs than the earlier approach. For all positive examples of GBM drugs, we achieved a median rank of 9.2. Of the top predictions, 45.6% have been demonstrated effective in inhibiting the growth of human GBM cells. We developed a computational drug repositioning approach based on both genomic and phenotypic data. Our approach prioritized existing GBM drugs and outperformed a recent approach. Overall, our approach shows potential in discovering new targeted therapies for GBM.
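
The idea of ranking drugs by how closely their mouse phenotype profiles resemble the disease profile can be sketched as below. The abstract does not say which similarity measure was used, so cosine similarity is an assumption here, and the phenotype terms, weights, and drug names (`drug_A`, `drug_B`) are invented for illustration.

```python
import math


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse phenotype-weight profiles."""
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_drugs(disease_profile: dict, drug_profiles: dict) -> list:
    """Return (drug, score) pairs sorted from most to least similar."""
    scored = [(name, cosine(disease_profile, profile))
              for name, profile in drug_profiles.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical profiles: phenotype term -> weight.
disease = {"increased tumor incidence": 2.0, "abnormal astrocyte morphology": 1.0}
drugs = {
    "drug_A": {"increased tumor incidence": 1.0},
    "drug_B": {"abnormal gait": 1.0},
}
ranking = rank_drugs(disease, drugs)
```

In this toy case `drug_A` shares a phenotype with the disease profile and ranks first, while `drug_B` shares none and scores zero.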
Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data.

PubMed

Chung, Dongjun; Kuan, Pei Fen; Li, Bo; Sanalkumar, Rajendran; Liang, Kun; Bresnick, Emery H; Dewey, Colin; Keleş, Sündüz

2011-07-01

Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
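
Allocating multi-reads as fractional counts can be illustrated with the minimal sketch below: each read contributes a total count of one, split across its candidate loci in proportion to a per-alignment weight. The weights here are simply supplied as inputs. How the authors derive them is not described in the abstract, so this normalisation is an assumption, and `allocate_reads` and the locus labels are hypothetical names.

```python
from collections import defaultdict


def allocate_reads(alignments: dict) -> dict:
    """Distribute each read over its candidate loci as fractional counts.

    alignments maps read_id -> list of (locus, weight) pairs. A uni-read has
    one pair; a multi-read has several. Each read sums to a count of 1.
    """
    counts = defaultdict(float)
    for hits in alignments.values():
        total = sum(weight for _, weight in hits)
        if total <= 0:
            continue  # skip reads with no usable alignment weight
        for locus, weight in hits:
            counts[locus] += weight / total
    return dict(counts)


reads = {
    "read1": [("chr1:100", 1.0)],                     # uni-read
    "read2": [("chr1:100", 3.0), ("chr7:500", 1.0)],  # multi-read, two loci
}
counts = allocate_reads(reads)
```

Because every read sums to one, the total of all fractional counts equals the number of allocated reads, so sequencing depth increases without any read being counted twice.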
A genome survey sequencing of the Java mouse deer (Tragulus javanicus) adds new aspects to the evolution of lineage specific retrotransposons in Ruminantia (Cetartiodactyla).

PubMed

Gallus, S; Kumar, V; Bertelsen, M F; Janke, A; Nilsson, M A

2015-10-25

Ruminantia, the ruminating, hoofed mammals (cow, deer, giraffe and allies) are an unranked artiodactylan clade. Around 50-60 million years ago the BovB retrotransposon entered the ancestral ruminantian genome through horizontal gene transfer. A survey genome screen using 454-pyrosequencing of the Java mouse deer (Tragulus javanicus) and the lesser kudu (Tragelaphus imberbis) was done to investigate and to compare the landscape of transposable elements within Ruminantia. The family Tragulidae (mouse deer) is the only representative of Tragulina and phylogenetically important, because it represents the earliest divergence in Ruminantia. The data analyses show that, relative to other ruminantian species, the lesser kudu genome has seen an expansion of BovB Long INterspersed Elements (LINEs) and BovB related Short INterspersed Elements (SINEs) like BOVA2. In comparison the genome of Java mouse deer has fewer BovB elements than other ruminants, especially Bovinae, and has in addition a novel CHR-3 SINE most likely propagated by LINE-1. By contrast the other ruminants have low amounts of CHR SINEs but high numbers of actively propagating BovB-derived and BovB-propagated SINEs. The survey sequencing data suggest that the transposable element landscape in mouse deer (Tragulina) is unique among Ruminantia, suggesting a lineage specific evolutionary trajectory that does not involve BovB mediated retrotransposition. This shows that the genomic landscape of mobile genetic elements can rapidly change in any lineage. Copyright © 2015 Elsevier B.V. All rights reserved.
Characterization of Aeromonas hydrophila wound pathotypes by comparative genomic and functional analyses of virulence genes.

PubMed

Grim, Christopher J; Kozlova, Elena V; Sha, Jian; Fitts, Eric C; van Lier, Christina J; Kirtley, Michelle L; Joseph, Sandeep J; Read, Timothy D; Burd, Eileen M; Tall, Ben D; Joseph, Sam W; Horneman, Amy J; Chopra, Ashok K; Shak, Joshua R

2013-04-23

Aeromonas hydrophila has increasingly been implicated as a virulent and antibiotic-resistant etiologic agent in various human diseases. In a previously published case report, we described a subject with a polymicrobial wound infection that included a persistent and aggressive strain of A. hydrophila (E1), as well as a more antibiotic-resistant strain of A. hydrophila (E2). To better understand the differences between pathogenic and environmental strains of A. hydrophila, we conducted comparative genomic and functional analyses of virulence-associated genes of these two wound isolates (E1 and E2), the environmental type strain A. hydrophila ATCC 7966(T), and four other isolates belonging to A. aquariorum, A. veronii, A. salmonicida, and A. caviae. Full-genome sequencing of strains E1 and E2 revealed extensive differences between the two and strain ATCC 7966(T). The more persistent wound infection strain, E1, harbored coding sequences for a cytotoxic enterotoxin (Act), a type 3 secretion system (T3SS), flagella, hemolysins, and a homolog of exotoxin A found in Pseudomonas aeruginosa. Corresponding phenotypic analyses with A. hydrophila ATCC 7966(T) and SSU as reference strains demonstrated the functionality of these virulence genes, with strain E1 displaying enhanced swimming and swarming motility, lateral flagella on electron microscopy, the presence of T3SS effector AexU, and enhanced lethality in a mouse model of Aeromonas infection. By combining sequence-based analysis and functional assays, we characterized an A. hydrophila pathotype, exemplified by strain E1, that exhibited increased virulence in a mouse model of infection, likely because of encapsulation, enhanced motility, toxin secretion, and cellular toxicity. Aeromonas hydrophila is a common aquatic bacterium that has increasingly been implicated in serious human infections. 
While many determinants of virulence have been identified in Aeromonas, rapid identification of pathogenic versus nonpathogenic strains remains a challenge for this genus, as it is for other opportunistic pathogens. This paper demonstrates, by using whole-genome sequencing of clinical Aeromonas strains, followed by corresponding virulence assays, that comparative genomics can be used to identify a virulent subtype of A. hydrophila that is aggressive during human infection and more lethal in a mouse model of infection. This aggressive pathotype contained genes for toxin production, toxin secretion, and bacterial motility that likely enabled its pathogenicity. Our results highlight the potential of whole-genome sequencing to transform microbial diagnostics; with further advances in rapid sequencing and annotation, genomic analysis will be able to provide timely information on the identities and virulence potential of clinically isolated microorganisms.
Insights from Human/Mouse genome comparisons

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pennacchio, Len A.

2003-03-30

Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestry of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.
Genome Editing in Mice Using TALE Nucleases.

PubMed

Wefers, Benedikt; Brandl, Christina; Ortiz, Oskar; Wurst, Wolfgang; Kühn, Ralf

2016-01-01

Gene engineering for generating targeted mouse mutants is a key technology for biomedical research. Using TALENs as sequence-specific nucleases to induce targeted double-strand breaks, the mouse genome can be directly modified in zygotes in a single step without the need for embryonic stem cells. By embryo microinjection of TALEN mRNAs and targeting vectors, knockout and knock-in alleles can be generated fast and efficiently. In this chapter we provide protocols for the application of TALENs in mouse zygotes.
Murine endogenous retroviruses

PubMed Central

2016-01-01

Up to 10% of the mouse genome is comprised of endogenous retrovirus (ERV) sequences, and most represent the remains of ancient germ line infections. Our knowledge of the three distinct classes of ERVs is inversely correlated with their copy number, and their characterization has benefited from the availability of divergent wild mouse species and subspecies, and from ongoing analysis of the Mus genome sequence. In contrast to human ERVs, which are nearly all extinct, active mouse ERVs can still be found in all three ERV classes. The distribution and diversity of ERVs has been shaped by host-virus interactions over the course of evolution, but ERVs have also been pivotal in shaping the mouse genome by altering host genes through insertional mutagenesis, by adding novel regulatory and coding sequences, and by their co-option by host cells as retroviral resistance genes. We review mechanisms by which an adaptive coexistence has evolved. (Part of a Multi-author Review) PMID:18818872
Mitochondrial genome-maintaining activity of mouse mitochondrial transcription factor A and its transcript isoform in Saccharomyces cerevisiae.

PubMed

Yoon, Young Geol; Koob, Michael D; Yoo, Young Hyun

2011-09-15

Mitochondrial transcription factor A (Tfam) binds to and organizes the mitochondrial DNA (mtDNA) genome into a mitochondrial nucleoid (mt-nucleoid) structure, which is necessary for mtDNA transcription and maintenance. Here, we demonstrate the mtDNA-organizing activity of mouse Tfam and its transcript isoform (Tfam(iso)), which has a smaller high-mobility group (HMG)-box1 domain, using a yeast model system that contains a deletion of the yeast homolog of mouse Tfam protein, Abf2p. When the mouse Tfam genes were introduced into the ABF2 locus of the yeast genome, the corresponding mouse proteins, Tfam and Tfam(iso), could functionally replace the yeast Abf2p and support mtDNA maintenance and mitochondrial biogenesis in yeast. Growth properties, mtDNA content and mitochondrial protein levels of genes encoded in the mtDNA were comparable in the strains expressing mouse proteins and the wild-type yeast strain, indicating that the proteins have robust mtDNA-maintaining and -expressing function in yeast mitochondria. These results imply that the mtDNA-organizing activities of the mouse mt-nucleoid proteins are structurally and evolutionarily conserved; thus, they can maintain the mtDNA of distantly related and distinctively different species, such as yeast. Copyright © 2011 Elsevier B.V. All rights reserved.
Modelling Human Regulatory Variation in Mouse: Finding the Function in Genome-Wide Association Studies and Whole-Genome Sequencing

PubMed Central

Schmouth, Jean-François; Bonaguro, Russell J.; Corso-Diaz, Ximena; Simpson, Elizabeth M.

2012-01-01

An increasing body of literature from genome-wide association studies and human whole-genome sequencing highlights the identification of large numbers of candidate regulatory variants of potential therapeutic interest in numerous diseases. Our relatively poor understanding of the functions of non-coding genomic sequence, and the slow and laborious process of experimental validation of the functional significance of human regulatory variants, limits our ability to fully benefit from this information in our efforts to comprehend human disease. Humanized mouse models (HuMMs), in which human genes are introduced into the mouse, suggest an approach to this problem. In the past, HuMMs have been used successfully to study human disease variants; e.g., the complex genetic condition arising from Down syndrome, common monogenic disorders such as Huntington disease and β-thalassemia, and cancer susceptibility genes such as BRCA1. In this commentary, we highlight a novel method for high-throughput single-copy site-specific generation of HuMMs entitled High-throughput Human Genes on the X Chromosome (HuGX). This method can be applied to most human genes for which a bacterial artificial chromosome (BAC) construct can be derived and a mouse-null allele exists. This strategy comprises (1) the use of recombineering technology to create a human variant–harbouring BAC, (2) knock-in of this BAC into the mouse genome using Hprt docking technology, and (3) allele comparison by interspecies complementation. We demonstrate the throughput of the HuGX method by generating a series of seven different alleles for the human NR2E1 gene at Hprt. In future challenges, we consider the current limitations of experimental approaches and call for a concerted effort by the genetics community, for both human and mouse, to solve the challenge of the functional analysis of human regulatory variation. PMID:22396661
Genomic landscapes of endogenous retroviruses unveil intricate genetics of conventional and genetically-engineered laboratory mouse strains.

PubMed

Lee, Kang-Hoon; Lim, Debora; Chiu, Sophia; Greenhalgh, David; Cho, Kiho

2016-04-01

Laboratory strains of mice, both conventional and genetically engineered, have been introduced as critical components of a broad range of studies investigating normal and disease biology. Currently, the genetic identity of laboratory mice is primarily confirmed by surveying polymorphisms in selected sets of "conventional" genes and/or microsatellites in the absence of a single completely sequenced mouse genome. First, we examined variations in the genomic landscapes of transposable repetitive elements, named the TREome, in conventional and genetically engineered mouse strains using murine leukemia virus-type endogenous retroviruses (MLV-ERVs) as a probe. A survey of the genomes from 56 conventional strains revealed strain-specific TREome landscapes, and certain families (e.g., C57BL) of strains were discernible with defined patterns. Interestingly, the TREome landscapes of C3H/HeJ (toll-like receptor-4 [TLR4] mutant) inbred mice were different from its control C3H/HeOuJ (TLR4 wild-type) strain. In addition, a CD14 knock-out strain had a distinct TREome landscape compared to its control/backcross C57BL/6J strain. Second, an examination of superantigen (SAg, a "TREome gene") coding sequences of mouse mammary tumor virus-type ERVs in the genomes of the 46 conventional strains revealed a high diversity, suggesting a potential role of SAgs in strain-specific immune phenotypes. The findings from this study indicate that unexplored and intricate genomic variations exist in laboratory mouse strains, both conventional and genetically engineered. The TREome-based high-resolution genetics surveillance system for laboratory mice would contribute to efficient study design with quality control and accurate data interpretation. This genetics system can be easily adapted to other species ranging from plants to humans. Copyright © 2016 Elsevier Inc. All rights reserved.
An ethnically relevant consensus Korean reference genome is a step towards personal reference genomes

PubMed Central

Cho, Yun Sung; Kim, Hyunho; Kim, Hak-Min; Jho, Sungwoong; Jun, JeHoon; Lee, Yong Joo; Chae, Kyun Shik; Kim, Chang Geun; Kim, Sangsoo; Eriksson, Anders; Edwards, Jeremy S.; Lee, Semin; Kim, Byung Chul; Manica, Andrea; Oh, Tae-Kwang; Church, George M.; Bhak, Jong

2016-01-01

Human genomes are routinely compared against a universal reference. However, this strategy could miss population-specific and personal genomic variations, which may be detected more efficiently using an ethnically relevant or personal reference. Here we report a hybrid assembly of a Korean reference genome (KOREF) for constructing personal and ethnic references by combining sequencing and mapping methods. We also build its consensus variome reference, providing information on millions of variants from 40 additional ethnically homogeneous genomes from the Korean Personal Genome Project. We find that the ethnically relevant consensus reference can be beneficial for efficient variant detection. Systematic comparison of human assemblies shows the importance of assembly quality, suggesting the necessity of new technologies to comprehensively map ethnic and personal genomic structure variations. In the era of large-scale population genome projects, the leveraging of ethnicity-specific genome assemblies as well as the human reference genome will accelerate mapping all human genome diversity. PMID:27882922
Adaptive Mutations in Influenza A/California/07/2009 Enhance Polymerase Activity and Infectious Virion Production.

PubMed

Slaine, Patrick D; MacRae, Cara; Kleer, Mariel; Lamoureux, Emily; McAlpine, Sarah; Warhuus, Michelle; Comeau, André M; McCormick, Craig; Hatchette, Todd; Khaperskyy, Denys A

2018-05-18

Mice are not natural hosts for influenza A viruses (IAVs), but they are useful models for studying antiviral immune responses and pathogenesis. Serial passage of IAV in mice invariably causes the emergence of adaptive mutations and increased virulence. Here, we report the adaptation of IAV reference strain A/California/07/2009(H1N1) (also known as CA/07) in outbred Swiss Webster mice. Serial passage led to increased virulence and lung titers, and dissemination of the virus to brains. We adapted a deep-sequencing protocol to identify and enumerate adaptive mutations across all genome segments. Among mutations that emerged during mouse-adaptation, we focused on amino acid substitutions in polymerase subunits: polymerase basic-1 (PB1) T156A and F740L and polymerase acidic (PA) E349G. These mutations were evaluated singly and in combination in minigenome replicon assays, which revealed that PA E349G increased polymerase activity. By selectively engineering three PB1 and PA mutations into the parental CA/07 strain, we demonstrated that these mutations in polymerase subunits decreased the production of defective viral genome segments with internal deletions and dramatically increased the release of infectious virions from mouse cells. Together, these findings increase our understanding of the contribution of polymerase subunits to successful host adaptation.
Optimizing mouse models of neurodegenerative disorders: are therapeutics in sight?

PubMed

Lutz, Cathleen M; Osborne, Melissa A

2013-01-01

The genomic and biologic conservation between mice and humans, along with our increasing ability to manipulate the mouse genome, places the mouse as a premier model for deciphering disease mechanisms and testing potential new therapies. Despite these advantages, mouse models of neurodegenerative disease are sometimes difficult to generate and can present challenges that must be carefully addressed when used for preclinical studies. For those models that do exist, the standardization and optimization of the models is a critical step in ensuring success in both basic research and preclinical use. This review looks back on the history of model development for neurodegenerative diseases and highlights the key strategies that have been learned in order to improve the design, development and use of mouse models in the study of neurodegenerative disease.
Integrating text mining into the MGI biocuration workflow

PubMed Central

Dowell, K.G.; McAndrews-Hill, M.S.; Hill, D.P.; Drabkin, H.J.; Blake, J.A.

2009-01-01

A major challenge for functional and comparative genomics resource development is the extraction of data from the biomedical literature. Although text mining for biological data is an active research field, few applications have been integrated into production literature curation systems such as those of the model organism databases (MODs). Not only are most available biological natural language (bioNLP) and information retrieval and extraction solutions difficult to adapt to existing MOD curation workflows, but many also have high error rates or are unable to process documents available in those formats preferred by scientific journals. In September 2008, Mouse Genome Informatics (MGI) at The Jackson Laboratory initiated a search for dictionary-based text mining tools that we could integrate into our biocuration workflow. MGI has rigorous document triage and annotation procedures designed to identify appropriate articles about mouse genetics and genome biology. We currently screen ∼1000 journal articles a month for Gene Ontology terms, gene mapping, gene expression, phenotype data and other key biological information. Although we do not foresee that curation tasks will ever be fully automated, we are eager to implement named entity recognition (NER) tools for gene tagging that can help streamline our curation workflow and simplify gene indexing tasks within the MGI system. Gene indexing is an MGI-specific curation function that involves identifying which mouse genes are being studied in an article, then associating the appropriate gene symbols with the article reference number in the MGI database. Here, we discuss our search process, performance metrics and success criteria, and how we identified a short list of potential text mining tools for further evaluation. We provide an overview of our pilot projects with NCBO's Open Biomedical Annotator and Fraunhofer SCAI's ProMiner. 
In doing so, we prove the potential for the further incorporation of semi-automated processes into the curation of the biomedical literature. PMID:20157492
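
A dictionary-based gene tagger of the general kind discussed above can be sketched in a few lines. This is not how NCBO's Open Biomedical Annotator or ProMiner works internally. It is a minimal exact-match illustration, and the function name `tag_genes`, the toy dictionary, and the placeholder identifiers are all assumptions.

```python
import re


def tag_genes(text: str, dictionary: dict) -> list:
    """Return (symbol, identifier, offset) for each dictionary symbol in text.

    Matching is exact and case-sensitive on word boundaries; a production
    tagger would also need synonym handling and disambiguation.
    """
    if not dictionary:
        return []
    # Try longer symbols first so a short symbol cannot pre-empt a longer one.
    symbols = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(s) for s in symbols) + r")\b")
    return [(m.group(1), dictionary[m.group(1)], m.start())
            for m in pattern.finditer(text)]


# Toy dictionary: gene symbol -> placeholder identifier (not real accession IDs).
toy_dictionary = {"Xrcc4": "ID-1", "Trp53": "ID-2", "Tfam": "ID-3"}
sentence = "Deletion of Xrcc4 and Trp53 in germinal center B cells."
hits = tag_genes(sentence, toy_dictionary)
```

Each hit could then be associated with the article's reference number, which corresponds to the gene indexing step the abstract describes.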
Integrating text mining into the MGI biocuration workflow.

PubMed

Dowell, K G; McAndrews-Hill, M S; Hill, D P; Drabkin, H J; Blake, J A

2009-01-01

A major challenge for functional and comparative genomics resource development is the extraction of data from the biomedical literature. Although text mining for biological data is an active research field, few applications have been integrated into production literature curation systems such as those of the model organism databases (MODs). Not only are most available biological natural language (bioNLP) and information retrieval and extraction solutions difficult to adapt to existing MOD curation workflows, but many also have high error rates or are unable to process documents available in those formats preferred by scientific journals. In September 2008, Mouse Genome Informatics (MGI) at The Jackson Laboratory initiated a search for dictionary-based text mining tools that we could integrate into our biocuration workflow. MGI has rigorous document triage and annotation procedures designed to identify appropriate articles about mouse genetics and genome biology. We currently screen approximately 1000 journal articles a month for Gene Ontology terms, gene mapping, gene expression, phenotype data and other key biological information. Although we do not foresee that curation tasks will ever be fully automated, we are eager to implement named entity recognition (NER) tools for gene tagging that can help streamline our curation workflow and simplify gene indexing tasks within the MGI system. Gene indexing is an MGI-specific curation function that involves identifying which mouse genes are being studied in an article, then associating the appropriate gene symbols with the article reference number in the MGI database. Here, we discuss our search process, performance metrics and success criteria, and how we identified a short list of potential text mining tools for further evaluation. We provide an overview of our pilot projects with NCBO's Open Biomedical Annotator and Fraunhofer SCAI's ProMiner. 
In doing so, we prove the potential for the further incorporation of semi-automated processes into the curation of the biomedical literature.
Mouse Genome Informatics (MGI) Is the International Resource for Information on the Laboratory Mouse.

PubMed

Law, MeiYee; Shaw, David R

2018-01-01

Mouse Genome Informatics (MGI, http://www.informatics.jax.org/ ) web resources provide free access to meticulously curated information about the laboratory mouse. MGI's primary goal is to help researchers investigate the genetic foundations of human diseases by translating information from mouse phenotypes and disease models studies to human systems. MGI provides comprehensive phenotypes for over 50,000 mutant alleles in mice and provides experimental model descriptions for over 1500 human diseases. Curated data from scientific publications are integrated with those from high-throughput phenotyping and gene expression centers. Data are standardized using defined, hierarchical vocabularies such as the Mammalian Phenotype (MP) Ontology, Mouse Developmental Anatomy and the Gene Ontologies (GO). This chapter introduces you to Gene and Allele Detail pages and provides step-by-step instructions for simple searches and those that take advantage of the breadth of MGI data integration.
Structure and polymorphism of the mouse myelin/oligodendrocyte glycoprotein gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Daubas, P.; Pham-Dinh, D.; Dautigny, A.

1994-09-01

The authors have isolated and characterized genomic clones containing the mouse myelin/oligodendrocyte glycoprotein (MOG) gene. It spans a region of 12.5 kb and consists of eight exons. Its exon-intron structure differs from that of classical MHC-class I genes, with which it is linked in the mouse genome. Nucleotide sequencing of the 5′ flanking region reveals that it contains several putative protein-binding sites, some of them in common with other myelin gene promoters. One intragenic polymorphism has been identified: it consists of a GA repeat, defining at least three alleles in mouse inbred strains, and is easily detectable using the polymerase chain reaction method.
Smooth Muscle Cell Genome Browser: Enabling the Identification of Novel Serum Response Factor Target Genes

PubMed Central

Lee, Moon Young; Park, Chanjae; Berent, Robyn M.; Park, Paul J.; Fuchs, Robert; Syn, Hannah; Chin, Albert; Townsend, Jared; Benson, Craig C.; Redelman, Doug; Shen, Tsai-wei; Park, Jong Kun; Miano, Joseph M.; Sanders, Kenton M.; Ro, Seungil

2015-01-01

Genome-scale expression data on the absolute numbers of gene isoforms offers essential clues in cellular functions and biological processes. Smooth muscle cells (SMCs) perform a unique contractile function through expression of specific genes controlled by serum response factor (SRF), a transcription factor that binds to DNA sites known as the CArG boxes. To identify SRF-regulated genes specifically expressed in SMCs, we isolated SMC populations from mouse small intestine and colon, obtained their transcriptomes, and constructed an interactive SMC genome and CArGome browser. To our knowledge, this is the first online resource that provides a comprehensive library of all genetic transcripts expressed in primary SMCs. The browser also serves as the first genome-wide map of SRF binding sites. The browser analysis revealed novel SMC-specific transcriptional variants and SRF target genes, which provided new and unique insights into the cellular and biological functions of the cells in gastrointestinal (GI) physiology. The SRF target genes in SMCs, which were discovered in silico, were confirmed by proteomic analysis of SMC-specific Srf knockout mice. Our genome browser offers a new perspective into the alternative expression of genes in the context of SRF binding sites in SMCs and provides a valuable reference for future functional studies. PMID:26241044
Ensembl 2002: accommodating comparative genomics.

PubMed

Clamp, M; Andrews, D; Barker, D; Bevan, P; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Hubbard, T; Kasprzyk, A; Keefe, D; Lehvaslaiho, H; Iyer, V; Melsopp, C; Mongin, E; Pettett, R; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Birney, E

2003-01-01

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation, and installations exist around the world at both companies and academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation.

Comparisons of Native and Chimeric Shiga Toxins Indicate that the Binding Subunit Dictates Degree of Toxicity

DTIC Science & Technology

2014-03-17

to the original BXD panel as BXD strains 43-103 (218). The genomes of both founder strains, B6 (308) and D2 (47; 307), have been sequenced and 1.8... sequencing of the DBA/2J mouse genome. BMC.Bioinformatics. 11 :07 308. Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, et al. 2002. Initial... sequencing and comparative analysis of the mouse genome. Nature 420:520-62 309. Weeratna RD, Doyle MP. 1991. Detection and production of verotoxin 1
Persistence of Cytosine Methylation of DNA following Fertilisation in the Mouse

PubMed Central

Li, Yan; O'Neill, Chris

2012-01-01

Normal development of the mammalian embryo requires epigenetic reprogramming of the genome. The level of cytosine methylation of CpG-rich (5meC) regions of the genome is a major epigenetic regulator and active global demethylation of 5meC throughout the genome is reported to occur within the first cell-cycle following fertilization. An enzyme or mechanism capable of catalysing such rapid global demethylation has not been identified. The mouse is a widely used model for studying developmental epigenetics. We have reassessed the evidence for this phenomenon of genome-wide demethylation following fertilisation in the mouse. We found when using conventional methods of immunolocalization that 5meC showed a progressive acid-resistant antigenic masking during zygotic maturation which gave the appearance of demethylation. Changing the unmasking strategy by also performing tryptic digestion revealed a persistence of a methylated state. Analysis of methyl binding domain 1 protein (MBD1) binding confirmed that the genome remained methylated following fertilisation. The maintenance of this methylated state over the first several cell-cycles required the actions of DNA methyltransferase activity. The study shows that any 5meC remodelling that occurs during early development is not explained by a global active loss of 5meC staining during the cleavage stage of development and global loss of methylation following fertilization is not a major component of epigenetic reprogramming in the mouse zygote. PMID:22292019
Divergence of Mammalian Higher Order Chromatin Structure Is Associated with Developmental Loci

PubMed Central

Chambers, Emily V.; Bickmore, Wendy A.; Semple, Colin A.

2013-01-01

Several recent studies have examined different aspects of mammalian higher order chromatin structure – replication timing, lamina association and Hi-C inter-locus interactions — and have suggested that most of these features of genome organisation are conserved over evolution. However, the extent of evolutionary divergence in higher order structure has not been rigorously measured across the mammalian genome, and until now little has been known about the characteristics of any divergent loci present. Here, we generate a dataset combining multiple measurements of chromatin structure and organisation over many embryonic cell types for both human and mouse that, for the first time, allows a comprehensive assessment of the extent of structural divergence between mammalian genomes. Comparison of orthologous regions confirms that all measurable facets of higher order structure are conserved between human and mouse, across the vast majority of the detectably orthologous genome. This broad similarity is observed in spite of many loci possessing cell type specific structures. However, we also identify hundreds of regions (from 100 Kb to 2.7 Mb in size) showing consistent evidence of divergence between these species, constituting at least 10% of the orthologous mammalian genome and encompassing many hundreds of human and mouse genes. These regions show unusual shifts in human GC content, are unevenly distributed across both genomes, and are enriched in human subtelomeric regions. Divergent regions are also relatively enriched for genes showing divergent expression patterns between human and mouse ES cells, implying these regions cause divergent regulation. Particular divergent loci are strikingly enriched in genes implicated in vertebrate development, suggesting important roles for structural divergence in the evolution of mammalian developmental programmes. 
These data suggest that, though relatively rare in the mammalian genome, divergence in higher order chromatin structure has played important roles during evolution. PMID:23592965
Recombination rate variation in mice from an isolated island.

PubMed

Wang, Richard J; Gray, Melissa M; Parmenter, Michelle D; Broman, Karl W; Payseur, Bret A

2017-01-01

Recombination rate is a heritable trait that varies among individuals. Despite the major impact of recombination rate on patterns of genetic diversity and the efficacy of selection, natural variation in this phenotype remains poorly characterized. We present a comparison of genetic maps, sampling 1212 meioses, from a unique population of wild house mice (Mus musculus domesticus) that recently colonized remote Gough Island. Crosses to a mainland reference strain (WSB/EiJ) reveal pervasive variation in recombination rate among Gough Island mice, including subchromosomal intervals spanning up to 28% of the genome. In spite of this high level of polymorphism, the genomewide recombination rate does not significantly vary. In general, we find that recombination rate varies more when measured in smaller genomic intervals. Using the current standard genetic map of the laboratory mouse to polarize intervals with divergent recombination rates, we infer that the majority of evolutionary change occurred in one of the two tested lines of Gough Island mice. Our results confirm that natural populations harbour a high level of recombination rate polymorphism and highlight the disparities in recombination rate evolution across genomic scales. © 2016 John Wiley & Sons Ltd.
A multiplicity of factors contributes to selective RNA polymerase III occupancy of a subset of RNA polymerase III genes in mouse liver

PubMed Central

Canella, Donatella; Bernasconi, David; Gilardi, Federica; Le Martelot, Gwendal; Migliavacca, Eugenia; Praz, Viviane; Cousin, Pascal; Delorenzi, Mauro; Hernandez, Nouria; Deplancke, Bart; Desvergne, Béatrice; Guex, Nicolas; Herr, Winship; Naef, Felix; Rougemont, Jacques; Schibler, Ueli; Andersin, Teemu; Gos, Pascal; Lammers, Fabienne; Raghav, Sunil; Fabbretti, Roberto; Fortier, Arnaud; Long, Li; Vlegel, Volker; Xenarios, Ioannis; David, Fabrice; Jarosz, Yohan; Kuznetsov, Dmitry; Liechti, Robin; Martin, Olivier; Ross, Frederick; Sinclair, Lucas; Cajan, Julia; Krier, Irina; Leleu, Marion; Molina, Nacho; Naldi, Aurélien; Rey, Guillaume; Symul, Laura

2012-01-01

The genomic loci occupied by RNA polymerase (RNAP) III have been characterized in human culture cells by genome-wide chromatin immunoprecipitations, followed by deep sequencing (ChIP-seq). These studies have shown that only ∼40% of the annotated 622 human tRNA genes and pseudogenes are occupied by RNAP-III, and that these genes are often in open chromatin regions rich in active RNAP-II transcription units. We have used ChIP-seq to characterize RNAP-III-occupied loci in a differentiated tissue, the mouse liver. Our studies define the mouse liver RNAP-III-occupied loci including a conserved mammalian interspersed repeat (MIR) as a potential regulator of an RNAP-III subunit-encoding gene. They reveal that synteny relationships can be established between a number of human and mouse RNAP-III genes, and that the expression levels of these genes are significantly linked. They establish that variations within the A and B promoter boxes, as well as the strength of the terminator sequence, can strongly affect RNAP-III occupancy of tRNA genes. They reveal correlations with various genomic features that explain the observed variation of 81% of tRNA scores. In mouse liver, loci represented in the NCBI37/mm9 genome assembly that are clearly occupied by RNAP-III comprise 50 Rn5s (5S RNA) genes, 14 known non-tRNA RNAP-III genes, nine Rn4.5s (4.5S RNA) genes, and 29 SINEs. Moreover, out of the 433 annotated tRNA genes, half are occupied by RNAP-III. Transfer RNA gene expression levels reflect both an underlying genomic organization conserved in dividing human culture cells and resting mouse liver cells, and the particular promoter and terminator strengths of individual genes. PMID:22287103
Mutational landscape of a chemically-induced mouse model of liver cancer.

PubMed

Connor, Frances; Rayner, Tim F; Aitken, Sarah J; Feig, Christine; Lukk, Margus; Santoyo-Lopez, Javier; Odom, Duncan T

2018-06-26

Carcinogen-induced mouse models of liver cancer are used extensively to study pathogenesis of the disease and have a critical role in validating candidate therapeutics. These models can recapitulate molecular and histological features of human disease. However, it is not known if the genomic alterations driving these mouse tumour genomes are comparable to those found in human tumours. Here, we provide a detailed genomic characterisation of tumours from a commonly used mouse model of hepatocellular carcinoma (HCC). We analysed whole exome sequences of liver tumours arising in mice exposed to diethylnitrosamine (DEN). DEN-initiated tumours had a high, uniform number of somatic single nucleotide variants (SNVs), with few insertions, deletions or copy number alterations, consistent with the known genotoxic action of DEN. Exposure of hepatocytes to DEN left a reproducible mutational imprint in resulting tumour exomes which we could computationally reconstruct using six known COSMIC mutational signatures. The tumours carried a high diversity of low-incidence, non-synonymous point mutations in many oncogenes and tumour suppressors, reflecting the stochastic introduction of SNVs into the hepatocyte genome by the carcinogen. We identified four recurrently mutated genes that were putative oncogenic drivers of HCC in this model. Every neoplasm carried activating hotspot mutations either in codon 61 of Hras, in codon 584 of Braf or in codon 254 of Egfr. Truncating mutations of Apc occurred in 21% of neoplasms, which were exclusively carcinomas supporting a role for deregulation of Wnt/β-catenin signalling in cancer progression. Our study provides detailed insight into the mutational landscape of tumours arising in a commonly-used carcinogen model of HCC, facilitating the future use of this model to understand the human disease. Mouse models are widely used to study the biology of cancer and to test potential therapies. 
Here, we have described the mutational landscape of tumours arising in a carcinogen-induced mouse model of liver cancer. Since cancer is a disease caused by genomic alterations, information about the patterns and types of mutations in the tumours in this mouse model should facilitate its use to study human liver cancer. Copyright © 2018 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
Human vs. Mouse Eosinophils: “That which we call an eosinophil, by any other name would stain as red”

PubMed Central

Lee, James J.; Jacobsen, Elizabeth A.; Ochkur, Sergei I; McGarry, Michael P.; Condjella, Rachel M.; Doyle, Alfred D.; Luo, Huijun; Zellner, Katie R.; Protheroe, Cheryl A.; Willetts, Lian; LeSuer, William E.; Colbert, Dana C.; Helmers, Richard A.; Lacy, Paige; Moqbel, Redwan; Lee, Nancy A.

2012-01-01

The respective life histories of humans and mice are well defined and describe a unique story of evolutionary conservation extending from sequence identity within the genome to the underpinnings of biochemical, cellular, and physiological pathways. As a consequence, the hematopoietic lineages of both species are invariantly maintained, each with identifiable eosinophils. This canonical presence nonetheless does not preclude disparities between human and mouse eosinophils and/or their effector functions. Indeed, many books and reviews dogmatically highlight differences, providing a rationale to discount the use of mouse models of human eosinophilic diseases. We suggest that this perspective is parochial and ignores the wealth of available studies and the consensus of the literature that overwhelming similarities (and not differences) exist between human and mouse eosinophils. The goal of this review is to summarize this literature and in some cases provide the experimental details, comparing and contrasting eosinophils and eosinophil effector functions in humans vs. mice. In particular, our review will provide a summation and an easy to use reference guide to important studies demonstrating that while differences exist, more often than not their consequences are unknown and do not necessarily reflect inherent disparities in eosinophil function, but instead, species-specific variations. The conclusion from this overview is that despite nominal differences, the vast similarities between human and mouse eosinophils provide important insights as to their roles in health and disease and, in turn, demonstrate the unique utility of mouse-based studies with an expectation of valid extrapolation to the understanding and treatment of patients. PMID:22935586
Identification of structural variation in mouse genomes.

PubMed

Keane, Thomas M; Wong, Kim; Adams, David J; Flint, Jonathan; Reymond, Alexandre; Yalcin, Binnaz

2014-01-01

Structural variation is variation in the structure of DNA regions that affects DNA sequence length and/or orientation. It generally includes deletions, insertions, copy-number gains, inversions, and transposable elements. Traditionally, the identification of structural variation in genomes has been challenging. However, with the recent advances in high-throughput DNA sequencing and paired-end mapping (PEM) methods, the ability to identify structural variants and their association with human diseases has improved considerably. In this review, we describe our current knowledge of structural variation in the mouse, one of the prime model systems for studying human diseases and mammalian biology. We further present the evolutionary implications of structural variation on transposable elements. We conclude with future directions on the study of structural variation in mouse genomes that will increase our understanding of molecular architecture and functional consequences of structural variation.
Astonishing advances in mouse genetic tools for biomedical research.

PubMed

Kaczmarczyk, Lech; Jackson, Walker S

2015-01-01

The humble house mouse has long been a workhorse model system in biomedical research. The technology for introducing site-specific genome modifications led to Nobel Prizes for its pioneers and opened a new era of mouse genetics. However, this technology was very time-consuming and technically demanding. As a result, many investigators continued to employ easier genome manipulation methods, though resulting models can suffer from overlooked or underestimated consequences. Another breakthrough, invaluable for the molecular dissection of disease mechanisms, was the invention of high-throughput methods to measure the expression of a plethora of genes in parallel. However, the use of samples containing material from multiple cell types could obfuscate data, and thus interpretations. In this review we highlight some important issues in experimental approaches using mouse models for biomedical research. We then discuss recent technological advances in mouse genetics that are revolutionising human disease research. Mouse genomes are now easily manipulated at precise locations thanks to guided endonucleases, such as transcription activator-like effector nucleases (TALENs) or the CRISPR/Cas9 system, both also having the potential to turn the dream of human gene therapy into reality. Newly developed methods of cell type-specific isolation of transcriptomes from crude tissue homogenates, followed by detection with next generation sequencing (NGS), are vastly improving gene regulation studies. Taken together, these amazing tools simplify the creation of much more accurate mouse models of human disease, and enable the extraction of hitherto unobtainable data.
Sequence and Characterization of the Ig Heavy Chain Constant and Partial Variable Region of the Mouse Strain 129S1

PubMed Central

Retter, Ida; Chevillard, Christophe; Scharfe, Maren; Conrad, Ansgar; Hafner, Martin; Im, Tschong-Hun; Ludewig, Monika; Nordsiek, Gabriele; Severitt, Simone; Thies, Stephanie; Mauhar, America; Blöcker, Helmut; Müller, Werner; Riblet, Roy

2009-01-01

Although the entire mouse genome has been sequenced, there remain challenges concerning the elucidation of particular complex and polymorphic genomic loci. In the murine Igh locus, different haplotypes exist in different inbred mouse strains. For example, the Ighb haplotype sequence of the Mouse Genome Project strain C57BL/6 differs considerably from the Igha haplotype of BALB/c, which has been widely used in the analyses of Ab responses. We have sequenced and annotated the 3′ half of the Igha locus of 129S1/SvImJ, covering the CH region and approximately half of the VH region. This sequence comprises 128 VH genes, of which 49 are judged to be functional. The comparison of the Igha sequence with the homologous Ighb region from C57BL/6 revealed two major expansions in the germline repertoire of Igha. In addition, we found smaller haplotype-specific differences like the duplication of five VH genes in the Igha locus. We generated a VH allele table by comparing the individual VH genes of both haplotypes. Surprisingly, the number and position of DH genes in the 129S1 strain differs not only from the sequence of C57BL/6 but also from the map published for BALB/c. Taken together, the contiguous genomic sequence of the 3′ part of the Igha locus allows a detailed view of the recent evolution of this highly dynamic locus in the mouse. PMID:17675503
A Mouse Geneticist’s Practical Guide to CRISPR Applications

PubMed Central

Singh, Priti; Schimenti, John C.; Bolcun-Filas, Ewelina

2015-01-01

CRISPR/Cas9 system of RNA-guided genome editing is revolutionizing genetics research in a wide spectrum of organisms. Even for the laboratory mouse, a model that has thrived under the benefits of embryonic stem (ES) cell knockout capabilities for nearly three decades, CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas9 technology enables one to manipulate the genome with unprecedented simplicity and speed. It allows generation of null, conditional, precisely mutated, reporter, or tagged alleles in mice. Moreover, it holds promise for other applications beyond genome editing. The crux of this system is the efficient and targeted introduction of DNA breaks that are repaired by any of several pathways in a predictable but not entirely controllable manner. Thus, further optimizations and improvements are being developed. Here, we summarize current applications and provide a practical guide to use the CRISPR/Cas9 system for mouse mutagenesis, based on published reports and our own experiences. We discuss critical points and suggest technical improvements to increase efficiency of RNA-guided genome editing in mouse embryos and address practical problems such as mosaicism in founders, which complicates genotyping and phenotyping. We describe a next-generation sequencing strategy for simultaneous characterization of on- and off-target editing in mice derived from multiple CRISPR experiments. Additionally, we report evidence that elevated frequency of precise, homology-directed editing can be achieved by transient inhibition of the Ligase IV-dependent nonhomologous end-joining pathway in one-celled mouse embryos. PMID:25271304
Dissection of complex adult traits in a mouse synthetic population.

PubMed

Burke, David T; Kozloff, Kenneth M; Chen, Shu; West, Joshua L; Wilkowski, Jodi M; Goldstein, Steven A; Miller, Richard A; Galecki, Andrzej T

2012-08-01

Finding the causative genetic variations that underlie complex adult traits is a significant experimental challenge. The unbiased search strategy of genome-wide association (GWAS) has been used extensively in recent human population studies. These efforts, however, typically find only a minor fraction of the genetic loci that are predicted to affect variation. As an experimental model for the analysis of adult polygenic traits, we measured a mouse population for multiple phenotypes and conducted a genome-wide search for effector loci. Complex adult phenotypes, related to body size and bone structure, were measured as component phenotypes, and each subphenotype was associated with a genomic spectrum of candidate effector loci. The strategy successfully detected several loci for the phenotypes, at genome-wide significance, using a single, modest-sized population (N = 505). The effector loci each explain 2%-10% of the measured trait variation and, taken together, the loci can account for over 25% of a trait's total population variation. A replicate population (N = 378) was used to confirm initially observed loci for one trait (femur length), and, when the two groups were merged, the combined population demonstrated increased power to detect loci. In contrast to human population studies, our mouse genome-wide searches find loci that individually explain a larger fraction of the observed variation. Also, the additive effects of our detected mouse loci more closely match the predicted genetic component of variation. The genetic loci discovered are logical candidates for components of the genetic networks having evolutionary conservation with human biology.
Mouse Tumor Biology (MTB): a database of mouse models for human cancer.

PubMed

Bult, Carol J; Krupke, Debra M; Begley, Dale A; Richardson, Joel E; Neuhauser, Steven B; Sundberg, John P; Eppig, Janan T

2015-01-01

The Mouse Tumor Biology (MTB; http://tumor.informatics.jax.org) database is a unique online compendium of mouse models for human cancer. MTB provides online access to expertly curated information on diverse mouse models for human cancer and interfaces for searching and visualizing data associated with these models. The information in MTB is designed to facilitate the selection of strains for cancer research and is a platform for mining data on tumor development and patterns of metastases. MTB curators acquire data through manual curation of peer-reviewed scientific literature and from direct submissions by researchers. Data in MTB are also obtained from other bioinformatics resources including PathBase, the Gene Expression Omnibus and ArrayExpress. Recent enhancements to MTB improve the association between mouse models and human genes commonly mutated in a variety of cancers as identified in large-scale cancer genomics studies, provide new interfaces for exploring regions of the mouse genome associated with cancer phenotypes and incorporate data and information related to Patient-Derived Xenograft models of human cancers. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome engineering in cattle: recent technological advancements.

PubMed

Wang, Zhongde

2015-02-01

Great strides in technological advancements have been made in the past decade in cattle genome engineering. First, the success of cloning cattle by somatic cell nuclear transfer (SCNT) or chromatin transfer (CT) is a significant advancement that has made obsolete the need for using embryonic stem (ES) cells to conduct cell-mediated genome engineering, whereby site-specific genetic modifications can be conducted in bovine somatic cells via DNA homologous recombination (HR) and whereby genetically engineered cattle can subsequently be produced by animal cloning from the genetically modified cells. With this approach, a chosen bovine genomic locus can be precisely modified in somatic cells, such as to knock out (KO) or knock in (KI) a gene via HR, a gene-targeting strategy that had almost exclusively been used in mouse ES cells. Furthermore, by the creative application of embryonic cloning to rejuvenate somatic cells, cattle genome can be sequentially modified in the same line of somatic cells and complex genetic modifications have been achieved in cattle. Very recently, the development of designer nucleases-such as zinc finger nucleases (ZFNs) and transcription activator-like effector nuclease (TALENs), and clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9)-has enabled highly efficient and more facile genome engineering in cattle. Most notably, by employing such designer nucleases, genomes can be engineered at single-nucleotide precision; this process is now often referred to as genome or gene editing. The above achievements are a drastic departure from the traditional methods of creating genetically modified cattle, where foreign DNAs are randomly integrated into the animal genome, most often along with the integrations of bacterial or viral DNAs. 
Here, I review the most recent technological developments in cattle genome engineering by highlighting some of the major achievements in creating genetically engineered cattle for agricultural and biomedical applications.
The genome architecture of the Collaborative Cross mouse genetic reference population.

PubMed

2012-02-01

The Collaborative Cross Consortium reports here on the development of a unique genetic resource population. The Collaborative Cross (CC) is a multiparental recombinant inbred panel derived from eight laboratory mouse inbred strains. Breeding of the CC lines was initiated at multiple international sites using mice from The Jackson Laboratory. Currently, this innovative project is breeding independent CC lines at the University of North Carolina (UNC), at Tel Aviv University (TAU), and at Geniad in Western Australia (GND). These institutions aim to make publicly available the completed CC lines and their genotypes and sequence information. We genotyped, and report here, results from 458 extant lines from UNC, TAU, and GND using a custom genotyping array with 7500 SNPs designed to be maximally informative in the CC and used a novel algorithm to infer inherited haplotypes directly from hybridization intensity patterns. We identified lines with breeding errors and cousin lines generated by splitting incipient lines into two or more cousin lines at early generations of inbreeding. We then characterized the genome architecture of 350 genetically independent CC lines. Results showed that founder haplotypes are inherited at the expected frequency, although we also consistently observed highly significant transmission ratio distortion at specific loci across all three populations. On chromosome 2, there is significant overrepresentation of WSB/EiJ alleles, and on chromosome X, there is a large deficit of CC lines with CAST/EiJ alleles. Linkage disequilibrium decays as expected and we saw no evidence of gametic disequilibrium in the CC population as a whole or in random subsets of the population. Gametic equilibrium in the CC population is in marked contrast to the gametic disequilibrium present in a large panel of classical inbred strains. Finally, we discuss access to the CC population and to the associated raw data describing the genetic structure of individual lines. 
Integration of rich phenotypic and genomic data over time and across a wide variety of fields will be vital to delivering on one of the key attributes of the CC, a common genetic reference platform for identifying causative variants and genetic networks determining traits in mammals.
Mouse Models for Drug Discovery. Can New Tools and Technology Improve Translational Power?

PubMed Central

Zuberi, Aamir; Lutz, Cathleen

2016-01-01

The use of mouse models in biomedical research and preclinical drug evaluation is on the rise. The advent of new molecular genome-altering technologies such as CRISPR/Cas9 allows for genetic mutations to be introduced into the germ line of a mouse faster and less expensively than previous methods. In addition, the rapid progress in the development and use of somatic transgenesis using viral vectors, as well as manipulations of gene expression with siRNAs and antisense oligonucleotides, allows for even greater exploration into genomics and systems biology. These technological advances come at a time when cost reductions in genome sequencing have led to the identification of pathogenic mutations in patient populations, providing unprecedented opportunities in the use of mice to model human disease. The ease of genetic engineering in mice also offers a potential paradigm shift in resource sharing and the speed by which models are made available in the public domain. Predictively, the knowledge alone that a model can be quickly remade will provide relief to resources encumbered by licensing and Material Transfer Agreements. For decades, mouse strains have provided an exquisite experimental tool to study the pathophysiology of the disease and assess therapeutic options in a genetically defined system. However, a major limitation of the mouse has been the limited genetic diversity associated with common laboratory mice. This has been overcome with the recent development of the Collaborative Cross and Diversity Outbred mice. These strains provide new tools capable of replicating genetic diversity approaching that found in human populations. The Collaborative Cross and Diversity Outbred strains thus provide a means to observe and characterize toxicity or efficacy of new therapeutic drugs for a given population. 
The combination of traditional and contemporary mouse genome editing tools, along with the addition of genetic diversity in new modeling systems, is synergistic and serves to make the mouse a better model for biomedical research, enhancing the potential for preclinical drug discovery and personalized medicine. PMID:28053071
The mutational landscape of MYCN, Lin28b and ALK F1174L driven murine neuroblastoma mimics human disease.

PubMed

De Wilde, Bram; Beckers, Anneleen; Lindner, Sven; Kristina, Althoff; De Preter, Katleen; Depuydt, Pauline; Mestdagh, Pieter; Sante, Tom; Lefever, Steve; Hertwig, Falk; Peng, Zhiyu; Shi, Le-Ming; Lee, Sangkyun; Vandermarliere, Elien; Martens, Lennart; Menten, Björn; Schramm, Alexander; Fischer, Matthias; Schulte, Johannes; Vandesompele, Jo; Speleman, Frank

2018-02-02

Genetically engineered mouse models have proven to be essential tools for unraveling fundamental aspects of cancer biology and for testing novel therapeutic strategies. To optimally serve these goals, it is essential that the mouse model faithfully recapitulates the human disease. Recently, novel mouse models for neuroblastoma have been developed. Here, we report on the further genomic characterization through exome sequencing and DNA copy number analysis of four of the currently available murine neuroblastoma model systems (ALK, Th-MYCN, Dbh-MYCN and Lin28b). The murine tumors revealed a low number of genomic alterations, in keeping with human neuroblastoma, and a positive correlation of the number of genetic lesions with the time to onset of tumor formation was observed. Gene copy number alterations are the hallmark of both murine and human disease and frequently affect syntenic genomic regions. Despite low mutational load, the genes mutated in murine disease were found to be enriched for genes mutated in human disease. Taken together, our study further supports the validity of the tested mouse models for mechanistic and preclinical studies of human neuroblastoma.
A novel humanized mouse model of Huntington disease for preclinical development of therapeutics targeting mutant huntingtin alleles.

PubMed

Southwell, Amber L; Skotte, Niels H; Villanueva, Erika B; Østergaard, Michael E; Gu, Xiaofeng; Kordasiewicz, Holly B; Kay, Chris; Cheung, Daphne; Xie, Yuanyun; Waltl, Sabine; Dal Cengio, Louisa; Findlay-Black, Hailey; Doty, Crystal N; Petoukhov, Eugenia; Iworima, Diepiriye; Slama, Ramy; Ooi, Jolene; Pouladi, Mahmoud A; Yang, X William; Swayze, Eric E; Seth, Punit P; Hayden, Michael R

2017-03-15

Huntington disease (HD) is a neurodegenerative disease caused by a mutation in the huntingtin (HTT) gene. HTT is a large protein, interacts with many partners and is involved in many cellular pathways, which are perturbed in HD. Therapies targeting HTT directly are likely to provide the most global benefit. Thus there is a need for preclinical models of HD recapitulating human HTT genetics. We previously generated a humanized mouse model of HD, Hu97/18, by intercrossing BACHD and YAC18 mice with knockout of the endogenous mouse HD homolog (Hdh). Hu97/18 mice recapitulate the genetics of HD, having two full-length, genomic human HTT transgenes heterozygous for the HD mutation and polymorphisms associated with HD in populations of Caucasian descent. We have now generated a companion model, Hu128/21, by intercrossing YAC128 and BAC21 mice on the Hdh-/- background. Hu128/21 mice have two full-length, genomic human HTT transgenes heterozygous for the HD mutation and polymorphisms associated with HD in populations of East Asian descent and in a minority of patients from other ethnic groups. Hu128/21 mice display a wide variety of HD-like phenotypes that are similar to YAC128 mice. Additionally, both transgenes in Hu128/21 mice match the human HTT exon 1 reference sequence. Conversely, the BACHD transgene carries a floxed, synthetic exon 1 sequence. Hu128/21 mice will be useful for investigations of human HTT that cannot be addressed in Hu97/18 mice, for developing therapies targeted to exon 1, and for preclinical screening of personalized HTT lowering therapies in HD patients of East Asian descent. © The Author 2017. Published by Oxford University Press.

Hypothalamic transcriptomes of 99 mouse strains reveal trans eQTL hotspots, splicing QTLs and novel non-coding genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hasin-Brumshtein, Yehudit; Khan, Arshad H.; Hormozdiari, Farhad

2016-09-13

Previous studies had shown that the integration of genome wide expression profiles, in metabolic tissues, with genetic and phenotypic variance, provided valuable insight into the underlying molecular mechanisms. We used RNA-Seq to characterize hypothalamic transcriptome in 99 inbred strains of mice from the Hybrid Mouse Diversity Panel (HMDP), a reference resource population for cardiovascular and metabolic traits. We report numerous novel transcripts supported by proteomic analyses, as well as novel non coding RNAs. High resolution genetic mapping of transcript levels in HMDP reveals both local and trans expression Quantitative Trait Loci (eQTLs), demonstrating 2 trans eQTL 'hotspots' associated with expression of hundreds of genes. We also report thousands of alternative splicing events regulated by genetic variants. Finally, comparison with about 150 metabolic and cardiovascular traits revealed many highly significant associations. Our data provide a rich resource for understanding the many physiologic functions mediated by the hypothalamus and their genetic regulation.
Adaptive Mutations in Influenza A/California/07/2009 Enhance Polymerase Activity and Infectious Virion Production

PubMed Central

Slaine, Patrick D.; MacRae, Cara; Kleer, Mariel; Lamoureux, Emily; McAlpine, Sarah; Warhuus, Michelle; Comeau, André M.; Hatchette, Todd

2018-01-01

Mice are not natural hosts for influenza A viruses (IAVs), but they are useful models for studying antiviral immune responses and pathogenesis. Serial passage of IAV in mice invariably causes the emergence of adaptive mutations and increased virulence. Here, we report the adaptation of IAV reference strain A/California/07/2009(H1N1) (also known as CA/07) in outbred Swiss Webster mice. Serial passage led to increased virulence and lung titers, and dissemination of the virus to brains. We adapted a deep-sequencing protocol to identify and enumerate adaptive mutations across all genome segments. Among mutations that emerged during mouse-adaptation, we focused on amino acid substitutions in polymerase subunits: polymerase basic-1 (PB1) T156A and F740L and polymerase acidic (PA) E349G. These mutations were evaluated singly and in combination in minigenome replicon assays, which revealed that PA E349G increased polymerase activity. By selectively engineering three PB1 and PA mutations into the parental CA/07 strain, we demonstrated that these mutations in polymerase subunits decreased the production of defective viral genome segments with internal deletions and dramatically increased the release of infectious virions from mouse cells. Together, these findings increase our understanding of the contribution of polymerase subunits to successful host adaptation. PMID:29783694
Novel promoters and coding first exons in DLG2 linked to developmental disorders and intellectual disability.

PubMed

Reggiani, Claudio; Coppens, Sandra; Sekhara, Tayeb; Dimov, Ivan; Pichon, Bruno; Lufin, Nicolas; Addor, Marie-Claude; Belligni, Elga Fabia; Digilio, Maria Cristina; Faletra, Flavio; Ferrero, Giovanni Battista; Gerard, Marion; Isidor, Bertrand; Joss, Shelagh; Niel-Bütschi, Florence; Perrone, Maria Dolores; Petit, Florence; Renieri, Alessandra; Romana, Serge; Topa, Alexandra; Vermeesch, Joris Robert; Lenaerts, Tom; Casimir, Georges; Abramowicz, Marc; Bontempi, Gianluca; Vilain, Catheline; Deconinck, Nicolas; Smits, Guillaume

2017-07-19

Tissue-specific integrative omics has the potential to reveal new genic elements important for developmental disorders. Two pediatric patients with global developmental delay and intellectual disability phenotype underwent array-CGH genetic testing, both showing a partial deletion of the DLG2 gene. From independent human and murine omics datasets, we combined copy number variations, histone modifications, developmental tissue-specific regulation, and protein data to explore the molecular mechanism at play. Integrating genomics, transcriptomics, and epigenomics data, we describe two novel DLG2 promoters and coding first exons expressed in human fetal brain. Their murine conservation and protein-level evidence allowed us to produce new DLG2 gene models for human and mouse. These new genic elements are deleted in 90% of 29 patients (public and in-house) showing partial deletion of the DLG2 gene. The patients' clinical characteristics expand the neurodevelopmental phenotypic spectrum linked to DLG2 gene disruption to cognitive and behavioral categories. While protein-coding genes are regarded as well known, our work shows that integration of multiple omics datasets can unveil novel coding elements. From a clinical perspective, our work demonstrates that two new DLG2 promoters and exons are crucial for the neurodevelopmental phenotypes associated with this gene. In addition, our work brings evidence for the lack of cross-annotation in human versus mouse reference genomes and nucleotide versus protein databases.
Reliability, robustness, and reproducibility in mouse behavioral phenotyping: a cross-laboratory study

PubMed Central

Mandillo, Silvia; Tucci, Valter; Hölter, Sabine M.; Meziane, Hamid; Banchaabouchi, Mumna Al; Kallnik, Magdalena; Lad, Heena V.; Nolan, Patrick M.; Ouagazzal, Abdel-Mouttalib; Coghill, Emma L.; Gale, Karin; Golini, Elisabetta; Jacquot, Sylvie; Krezel, Wojtek; Parker, Andy; Riet, Fabrice; Schneider, Ilka; Marazziti, Daniela; Auwerx, Johan; Brown, Steve D. M.; Chambon, Pierre; Rosenthal, Nadia; Tocchini-Valentini, Glauco; Wurst, Wolfgang

2008-01-01

Establishing standard operating procedures (SOPs) as tools for the analysis of behavioral phenotypes is fundamental to mouse functional genomics. It is essential that the tests designed provide reliable measures of the process under investigation but most importantly that these are reproducible across both time and laboratories. For this reason, we devised and tested a set of SOPs to investigate mouse behavior. Five research centers were involved across France, Germany, Italy, and the UK in this study, as part of the EUMORPHIA program. All the procedures underwent a cross-validation experimental study to investigate the robustness of the designed protocols. Four inbred reference strains (C57BL/6J, C3HeB/FeJ, BALB/cByJ, 129S2/SvPas), reflecting their use as common background strains in mutagenesis programs, were analyzed to validate these tests. We demonstrate that the operating procedures employed, which include open field, SHIRPA, grip-strength, rotarod, Y-maze, prepulse inhibition of acoustic startle response, and tail flick tests, generated reproducible results between laboratories for a number of the test output parameters. However, we also identified several uncontrolled variables that constitute confounding factors in behavioral phenotyping. The EUMORPHIA SOPs described here are an important start-point for the ongoing development of increasingly robust phenotyping platforms and their application in large-scale, multicentre mouse phenotyping programs. PMID:18505770
Principles of regulatory information conservation between mouse and human.

PubMed

Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; Wu, Weisheng; Cayting, Philip; Boyle, Alan P; Sundaram, Vasavi; Xing, Xiaoyun; Dogan, Nergiz; Li, Jingjing; Euskirchen, Ghia; Lin, Shin; Lin, Yiing; Visel, Axel; Kawli, Trupti; Yang, Xinqiong; Patacsil, Dorrelyn; Keller, Cheryl A; Giardine, Belinda; Kundaje, Anshul; Wang, Ting; Pennacchio, Len A; Weng, Zhiping; Hardison, Ross C; Snyder, Michael P

2014-11-20

To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
10. international mouse genome conference

DOE Office of Scientific and Technical Information (OSTI.GOV)

Meisler, M.H.

Ten years after hosting the First International Mammalian Genome Conference in Paris in 1986, Dr. Jean-Louis Guenet presided over the Tenth Conference at the Pasteur Institute, October 7-10, 1996. The 1986 conference was a satellite to the Human Gene Mapping Workshop and had approximately 50 attendees. The 1996 meeting was attended by 300 scientists from around the world. In the interim, the number of mapped loci in the mouse increased from 1,000 to over 20,000. This report contains a listing of the program and its participants, and two articles that review the meeting and the role of the laboratory mouse in the Human Genome project. More than 200 papers were presented at the conference covering the following topics: International mouse chromosome committee meetings; Mutant generation and identification; Physical and genetic maps; New technology and resources; Chromatin structure and gene regulation; Rat and hamster genetic maps; Informatics and databases; and Quantitative trait analysis.
Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication.

PubMed

Montague, Michael J; Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L; Searle, Steven M J; Minx, Patrick; Hillier, LaDeana W; Koboldt, Daniel C; Davis, Brian W; Driscoll, Carlos A; Barr, Christina S; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W C; Hahn, Matthew W; Menotti-Raymond, Marilyn; O'Brien, Stephen J; Wilson, Richard K; Lyons, Leslie A; Murphy, William J; Warren, Wesley C

2014-12-02

Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae.
Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication

PubMed Central

Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L.; Searle, Steven M. J.; Minx, Patrick; Hillier, LaDeana W.; Koboldt, Daniel C.; Davis, Brian W.; Driscoll, Carlos A.; Barr, Christina S.; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W. C.; Hahn, Matthew W.; Menotti-Raymond, Marilyn; O’Brien, Stephen J.; Wilson, Richard K.; Lyons, Leslie A.; Murphy, William J.; Warren, Wesley C.

2014-01-01

Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae. PMID:25385592
Beyond 'knock-out' mice: new perspectives for the programmed modification of the mammalian genome.

PubMed

Cohen-Tannoudji, M; Babinet, C

1998-10-01

The emergence of gene inactivation by homologous recombination methodology in embryonic stem cells has revolutionized the field of mouse genetics. Indeed, the availability of a rapidly growing number of mouse null mutants has represented an invaluable source of knowledge on mammalian development, cellular biology and physiology and has provided many models for human inherited diseases. In recent years, improvements of the original 'knock-out' strategy, as well as the exploitation of exogenous enzymatic systems that are active in the recombination process, have considerably extended the range of genetic manipulations that can be produced. For example, it is now possible to create a mouse bearing a targeted point mutation as the unique change in its entire genome therefore allowing very fine dissection of gene function in vivo. Chromosome alterations such as large deletions, inversions or translocations can also be designed and will facilitate the global functional analysis of the mouse genome. This will extend the possibilities of creating models of human pathologies that frequently originate from various chromosomal disorders. Finally, the advent of methods allowing conditional gene targeting will open the way for the analysis of the consequence of a particular mutation in a defined organ and at a specific time during the life of a mouse.
Targeting vector construction through recombineering.

PubMed

Malureanu, Liviu A

2011-01-01

Gene targeting in mouse embryonic stem cells is an essential, yet still very expensive and highly time-consuming, tool and method to study gene function at the organismal level or to create mouse models of human diseases. Conventional cloning-based methods have been largely used for generating targeting vectors, but are hampered by a number of limiting factors, including the variety and location of restriction enzymes in the gene locus of interest, the specific PCR amplification of repetitive DNA sequences, and cloning of large DNA fragments. Recombineering is a technique that exploits the highly efficient homologous recombination function encoded by λ phage in Escherichia coli. Bacteriophage-based recombination can recombine homologous sequences as short as 30-50 bases, allowing manipulations such as insertion, deletion, or mutation of virtually any genomic region. The large availability of mouse genomic bacterial artificial chromosome (BAC) libraries covering most of the genome facilitates the retrieval of genomic DNA sequences from the bacterial chromosomes through recombineering. This chapter describes a successfully applied protocol and aims to be a detailed guide through the steps of generation of targeting vectors through recombineering.
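As a rough illustration of the short homology requirement mentioned above, the following Python sketch pulls two flanking homology arms out of a locus sequence. It is only an assumption-laden toy: the function name `homology_arms`, the default arm length of 50 bases (the upper end of the 30-50 range quoted in the abstract), and the example sequence are all hypothetical, and it does not reproduce the protocol described in the chapter.

```python
def homology_arms(locus_seq, start, end, arm_len=50):
    """Return (upstream_arm, downstream_arm) flanking locus_seq[start:end].

    start/end are 0-based, end-exclusive coordinates of the region to be
    replaced or retrieved. arm_len=50 is an assumed default, not a value
    prescribed by the source.
    """
    if not 0 <= start <= end <= len(locus_seq):
        raise ValueError("region lies outside the locus sequence")
    if start < arm_len or len(locus_seq) - end < arm_len:
        raise ValueError("not enough flanking sequence for the requested arm length")
    upstream = locus_seq[start - arm_len:start]
    downstream = locus_seq[end:end + arm_len]
    return upstream, downstream


if __name__ == "__main__":
    # Hypothetical locus: 50 A's, a 4-base target, then 50 G's.
    locus = "A" * 50 + "CCCC" + "G" * 50
    up, down = homology_arms(locus, 50, 54)
    print(len(up), len(down))
```

In practice such arms would be appended to primers or oligonucleotides; that step is outside the scope of this sketch.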
The truth about mouse, human, worms and yeast

PubMed Central

2004-01-01

Genome comparisons are behind the powerful new annotation methods being developed to find all human genes, as well as genes from other genomes. Genomes are now frequently being studied in pairs to provide cross-comparison datasets. This 'Noah's Ark' approach often reveals unsuspected genes and may support the deletion of false-positive predictions. Joining mouse and human as the cross-comparison dataset for the first two mammals are: two Drosophila species, D. melanogaster and D. pseudoobscura; two sea squirts, Ciona intestinalis and Ciona savignyi; four yeast (Saccharomyces) species; two nematodes, Caenorhabditis elegans and Caenorhabditis briggsae; and two pufferfish (Takifugu rubripes and Tetraodon nigroviridis). Even genomes like yeast and C. elegans, which have been known for more than five years, are now being significantly improved. Methods developed for yeast or nematodes will now be applied to mouse and human, and soon to additional mammals such as rat and dog, to identify all the mammalian protein-coding genes. Current large disparities between human Unigene predictions (127,835 genes) and gene-scanning methods (45,000 genes) still need to be resolved. This will be the challenge during the next few years. PMID:15601543
The truth about mouse, human, worms and yeast.

PubMed

Nelson, David R; Nebert, Daniel W

2004-01-01

Genome comparisons are behind the powerful new annotation methods being developed to find all human genes, as well as genes from other genomes. Genomes are now frequently being studied in pairs to provide cross-comparison datasets. This 'Noah's Ark' approach often reveals unsuspected genes and may support the deletion of false-positive predictions. Joining mouse and human as the cross-comparison dataset for the first two mammals are: two Drosophila species, D. melanogaster and D. pseudoobscura; two sea squirts, Ciona intestinalis and Ciona savignyi; four yeast (Saccharomyces) species; two nematodes, Caenorhabditis elegans and Caenorhabditis briggsae; and two pufferfish (Takifugu rubripes and Tetraodon nigroviridis). Even genomes like yeast and C. elegans, which have been known for more than five years, are now being significantly improved. Methods developed for yeast or nematodes will now be applied to mouse and human, and soon to additional mammals such as rat and dog, to identify all the mammalian protein-coding genes. Current large disparities between human Unigene predictions (127,835 genes) and gene-scanning methods (45,000 genes) still need to be resolved. This will be the challenge during the next few years.
Identification of high-efficiency 3'GG gRNA motifs in indexed FASTA files with ngg2.

PubMed

Roberson, Elisha D O

CRISPR/Cas9 is emerging as one of the most-used methods of genome modification in organisms ranging from bacteria to human cells. However, the efficiency of editing varies tremendously site-to-site. A recent report identified a novel motif, called the 3'GG motif, which substantially increases the efficiency of editing at all sites tested in C. elegans. Furthermore, the authors of that report highlighted that previously published gRNAs with high editing efficiency also had this motif. I designed a Python command-line tool, ngg2, to identify 3'GG gRNA sites from indexed FASTA files. As a proof-of-concept, I screened for these motifs in six model genomes: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens. I also scanned the genomes of pig (Sus scrofa) and African elephant (Loxodonta africana) to demonstrate the utility in non-model organisms. I identified more than 60 million single match 3'GG motifs in these genomes. Greater than 61% of all protein coding genes in the reference genomes had at least one unique 3'GG gRNA site overlapping an exon. In particular, more than 96% of mouse and 93% of human protein coding genes have at least one unique, overlapping 3'GG gRNA. These identified sites can be used as a starting point in gRNA selection, and the ngg2 tool provides an important ability to identify 3'GG editing sites in any species with an available genome sequence.
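The kind of scan described in this abstract can be illustrated with a minimal Python sketch. It is not the ngg2 implementation: it assumes that a 3'GG site consists of 18 arbitrary bases followed by GG and then an NGG PAM, it searches both strands of a plain string rather than an indexed FASTA file, and it skips the uniqueness check that the abstract mentions. The names `SITE`, `reverse_complement`, and `find_3gg_sites` are hypothetical.

```python
import re

# Assumed site layout: 18 arbitrary bases + "GG" (the 3'GG motif) + "NGG" PAM.
# Wrapping the pattern in a lookahead lets overlapping sites be reported.
SITE = re.compile(r"(?=([ACGT]{18}GG)[ACGT]GG)")
SITE_LEN = 23  # 20-base protospacer plus 3-base PAM

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_3gg_sites(seq):
    """Return (strand, start, protospacer) tuples for candidate 3'GG sites.

    start is the 0-based forward-strand coordinate of the leftmost base of
    the 23-base site, whichever strand the site lies on.
    """
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in SITE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in SITE.finditer(rc):
        # Map the reverse-strand match back to forward-strand coordinates.
        start = len(seq) - (m.start() + SITE_LEN)
        hits.append(("-", start, m.group(1)))
    return hits


if __name__ == "__main__":
    example = "A" * 18 + "GG" + "TGG"  # made-up sequence with one site
    print(find_3gg_sites(example))
```

A real screen would additionally need to confirm that each candidate protospacer occurs only once in the genome, which is the "single match" property reported above.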
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments

PubMed Central

Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic

2001-01-01

Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Efficient genome editing of differentiated renal epithelial cells.

PubMed

Hofherr, Alexis; Busch, Tilman; Huber, Nora; Nold, Andreas; Bohn, Albert; Viau, Amandine; Bienaimé, Frank; Kuehn, E Wolfgang; Arnold, Sebastian J; Köttgen, Michael

2017-02-01

Recent advances in genome editing technologies have enabled the rapid and precise manipulation of genomes, including the targeted introduction, alteration, and removal of genomic sequences. However, respective methods have been described mainly in non-differentiated or haploid cell types. Genome editing of well-differentiated renal epithelial cells has been hampered by a range of technological issues, including optimal design, efficient expression of multiple genome editing constructs, attainable mutation rates, and best screening strategies. Here, we present an easily implementable workflow for the rapid generation of targeted heterozygous and homozygous genomic sequence alterations in renal cells using transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeat (CRISPR) system. We demonstrate the versatility of established protocols by generating novel cellular models for studying autosomal dominant polycystic kidney disease (ADPKD). Furthermore, we show that cell culture-validated genetic modifications can be readily applied to mouse embryonic stem cells (mESCs) for the generation of corresponding mouse models. The described procedure for efficient genome editing can be applied to any cell type to study physiological and pathophysiological functions in the context of precisely engineered genotypes.
Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies.

PubMed

Card, Daren C; Schield, Drew R; Reyes-Velasco, Jacobo; Fujita, Matthew K; Andrew, Audra L; Oyler-McCance, Sara J; Fike, Jennifer A; Tomback, Diana F; Ruggiero, Robert P; Castoe, Todd A

2014-01-01

As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5-5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.
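The coverage range quoted in this abstract follows from simple arithmetic: mean sequencing depth is roughly the total number of sequenced bases divided by the genome size. The sketch below shows that calculation under stated assumptions: a uniform read length, a hypothetical 1 Gb genome, and made-up read counts (the abstract gives neither read counts nor genome sizes). The function names are illustrative only.

```python
import math


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Approximate mean depth: total sequenced bases / genome size.

    n_reads counts individual reads, so a paired-end fragment contributes two.
    """
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage, read_length_bp, genome_size_bp):
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    genome = 1_000_000_000  # hypothetical 1 Gb genome
    print(mean_coverage(40_000_000, 100, genome))  # 40 M reads of 100 bp
    print(reads_needed(3.5, 100, genome))
```

This ignores unmappable reads, duplicates, and uneven coverage, all of which lower the effective depth in a real project.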
Two low coverage bird genomes and a comparison of reference-guided versus de novo genome assemblies

USGS Publications Warehouse

Card, Daren C.; Schield, Drew R.; Reyes-Velasco, Jacobo; Fujita, Matthew K.; Andrew, Audra L.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Tomback, Diana F.; Ruggiero, Robert P.; Castoe, Todd A.

2014-01-01

As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (~3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.
Characterization of Aeromonas hydrophila Wound Pathotypes by Comparative Genomic and Functional Analyses of Virulence Genes

PubMed Central

Grim, Christopher J.; Kozlova, Elena V.; Sha, Jian; Fitts, Eric C.; van Lier, Christina J.; Kirtley, Michelle L.; Joseph, Sandeep J.; Read, Timothy D.; Burd, Eileen M.; Tall, Ben D.; Joseph, Sam W.; Horneman, Amy J.; Chopra, Ashok K.; Shak, Joshua R.

2013-01-01

Aeromonas hydrophila has increasingly been implicated as a virulent and antibiotic-resistant etiologic agent in various human diseases. In a previously published case report, we described a subject with a polymicrobial wound infection that included a persistent and aggressive strain of A. hydrophila (E1), as well as a more antibiotic-resistant strain of A. hydrophila (E2). To better understand the differences between pathogenic and environmental strains of A. hydrophila, we conducted comparative genomic and functional analyses of virulence-associated genes of these two wound isolates (E1 and E2), the environmental type strain A. hydrophila ATCC 7966T, and four other isolates belonging to A. aquariorum, A. veronii, A. salmonicida, and A. caviae. Full-genome sequencing of strains E1 and E2 revealed extensive differences between the two and strain ATCC 7966T. The more persistent wound infection strain, E1, harbored coding sequences for a cytotoxic enterotoxin (Act), a type 3 secretion system (T3SS), flagella, hemolysins, and a homolog of exotoxin A found in Pseudomonas aeruginosa. Corresponding phenotypic analyses with A. hydrophila ATCC 7966T and SSU as reference strains demonstrated the functionality of these virulence genes, with strain E1 displaying enhanced swimming and swarming motility, lateral flagella on electron microscopy, the presence of T3SS effector AexU, and enhanced lethality in a mouse model of Aeromonas infection. By combining sequence-based analysis and functional assays, we characterized an A. hydrophila pathotype, exemplified by strain E1, that exhibited increased virulence in a mouse model of infection, likely because of encapsulation, enhanced motility, toxin secretion, and cellular toxicity. PMID:23611906
Revealing the missing expressed genes beyond the human reference genome by RNA-Seq.

PubMed

Chen, Geng; Li, Ruiyuan; Shi, Leming; Qi, Junyi; Hu, Pengzhan; Luo, Jian; Liu, Mingyao; Shi, Tieliu

2011-12-02

The complete and accurate human reference genome is important for functional genomics research. Therefore, the incomplete reference genome and individual specific sequences have significant effects on various studies. We used two RNA-Seq datasets from human brain tissues and 10 mixed cell lines to investigate the completeness of the human reference genome. First, we demonstrated that in previously identified ~5 Mb Asian and ~5 Mb African novel sequences that are absent from the human reference genome of NCBI build 36, ~211 kb and ~201 kb of them could be transcribed, respectively. Our results suggest that many of those transcribed regions are not specific to Asian and African, but also present in Caucasian. Then, we found that the expressions of 104 RefSeq genes that are unalignable to NCBI build 37 in brain and cell lines are higher than 0.1 RPKM. 55 of them are conserved across human, chimpanzee and macaque, suggesting that there are still a significant number of functional human genes absent from the human reference genome. Moreover, we identified hundreds of novel transcript contigs that cannot be aligned to NCBI build 37, RefSeq genes and EST sequences. Some of those novel transcript contigs are also conserved among human, chimpanzee and macaque. By positioning those contigs onto the human genome, we identified several large deletions in the reference genome. Several conserved novel transcript contigs were further validated by RT-PCR. Our findings demonstrate that a significant number of genes are still absent from the incomplete human reference genome, highlighting the importance of further refining the human reference genome and curating those missing genes. Our study also shows the importance of de novo transcriptome assembly. The comparative approach between reference genome and other related human genomes based on the transcriptome provides an alternative way to refine the human reference genome.
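The 0.1 RPKM cutoff mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard definition of RPKM (reads per kilobase of transcript per million mapped reads); the function names and the example numbers are hypothetical and are not taken from the study.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = 1e9 * reads_on_transcript / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Cutoff quoted in the abstract; treating it as a strict "greater than" is an assumption.
EXPRESSION_CUTOFF = 0.1


def is_expressed(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> bool:
    """Return True when the transcript's RPKM exceeds the cutoff."""
    return rpkm(read_count, transcript_length_bp, total_mapped_reads) > EXPRESSION_CUTOFF


# Hypothetical example: 50 reads on a 2,000 bp transcript in a 25-million-read library.
example_value = rpkm(50, 2000, 25_000_000)  # 1.0
```

Under these assumptions, a transcript with 50 reads in that library would count as expressed, whereas the same transcript with 2 reads (0.04 RPKM) would not.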
The Role of Retrotransposons in Gene Family Expansions in the Human and Mouse Genomes

PubMed Central

Janoušek, Václav; Laukaitis, Christina M.; Yanchukov, Alexey

2016-01-01

Abstract Retrotransposons comprise a large portion of mammalian genomes. They contribute to structural changes and more importantly to gene regulation. The expansion and diversification of gene families have been implicated as sources of evolutionary novelties. Given the roles retrotransposons play in genomes, their contribution to the evolution of gene families warrants further exploration. In this study, we found a significant association between two major retrotransposon classes, LINEs and LTRs, and lineage-specific gene family expansions in both the human and mouse genomes. The distribution and diversity differ between LINEs and LTRs, suggesting that each has a distinct involvement in gene family expansion. LTRs are associated with open chromatin sites surrounding the gene families, supporting their involvement in gene regulation, whereas LINEs may play a structural role promoting gene duplication. Our findings also suggest that gene family expansions, especially in the mouse genome, undergo two phases. The first phase is characterized by elevated deposition of LTRs and their utilization in reshaping gene regulatory networks. The second phase is characterized by rapid gene family expansion due to continuous accumulation of LINEs and it appears that, in some instances at least, this could become a runaway process. We provide an example in which this has happened and we present a simulation supporting the possibility of the runaway process. Altogether we provide evidence of the contribution of retrotransposons to the expansion and evolution of gene families. Our findings emphasize the putative importance of these elements in diversification and adaptation in the human and mouse lineages. PMID:27503295

Genome Wide Analysis of Inbred Mouse Lines Identifies a Locus Containing Ppar-γ as Contributing to Enhanced Malaria Survival

PubMed Central

Henson, Kerstin; Luzader, Angelina; Lindstrom, Merle; Spooner, Muriel; Steffy, Brian M.; Suzuki, Oscar; Janse, Chris; Waters, Andrew P.; Zhou, Yingyao; Wiltshire, Tim; Winzeler, Elizabeth A.

2010-01-01

The genetic background of a patient determines in part if a person develops a mild form of malaria and recovers, or develops a severe form and dies. We have used a mouse model to detect genes involved in the resistance or susceptibility to Plasmodium berghei malaria infection. To this end we first characterized 32 different mouse strains infected with P. berghei and identified survival as the best trait to discriminate between the strains. We found a locus on chromosome 6 by linking the survival phenotypes of the mouse strains to their genetic variations using genome wide analyses such as haplotype associated mapping and the efficient mixed-model for association. This new locus involved in malaria resistance contains only two genes and confirms the importance of Ppar-γ in malaria infection. PMID:20531941
Human versus mouse eosinophils: "that which we call an eosinophil, by any other name would stain as red".

PubMed

Lee, James J; Jacobsen, Elizabeth A; Ochkur, Sergei I; McGarry, Michael P; Condjella, Rachel M; Doyle, Alfred D; Luo, Huijun; Zellner, Katie R; Protheroe, Cheryl A; Willetts, Lian; Lesuer, William E; Colbert, Dana C; Helmers, Richard A; Lacy, Paige; Moqbel, Redwan; Lee, Nancy A

2012-09-01

The respective life histories of human subjects and mice are well defined and describe a unique story of evolutionary conservation extending from sequence identity within the genome to the underpinnings of biochemical, cellular, and physiologic pathways. As a consequence, the hematopoietic lineages of both species are invariantly maintained, each with identifiable eosinophils. This canonical presence nonetheless does not preclude disparities between human and mouse eosinophils, their effector functions, or both. Indeed, many books and reviews dogmatically highlight differences, providing a rationale to discount the use of mouse models of human eosinophilic diseases. We suggest that this perspective is parochial and ignores the wealth of available studies and the consensus of the literature that overwhelming similarities (and not differences) exist between human and mouse eosinophils. The goal of this review is to summarize this literature and in some cases provide experimental details comparing and contrasting eosinophils and eosinophil effector functions in human subjects versus mice. In particular, our review will provide a summation and an easy-to-use reference guide to important studies demonstrating that although differences exist, more often than not, their consequences are unknown and do not necessarily reflect inherent disparities in eosinophil function but instead species-specific variations. The conclusion from this overview is that despite nominal differences, the vast similarities between human and mouse eosinophils provide important insights as to their roles in health and disease and, in turn, demonstrate the unique utility of mouse-based studies with an expectation of valid extrapolation to the understanding and treatment of patients. Copyright © 2012 American Academy of Allergy, Asthma & Immunology. Published by Mosby, Inc. All rights reserved.
Editing of mouse and human immunoglobulin genes by CRISPR-Cas9 system.

PubMed

Cheong, Taek-Chin; Compagno, Mara; Chiarle, Roberto

2016-03-09

Applications of the CRISPR-Cas9 system to edit the genome have widely expanded to include DNA gene knock-out, deletions, chromosomal rearrangements, RNA editing and genome-wide screenings. Here we show the application of CRISPR-Cas9 technology to edit the mouse and human immunoglobulin (Ig) genes. By delivering Cas9 and guide-RNA (gRNA) with retro- or lenti-virus to IgM(+) mouse B cells and hybridomas, we induce class-switch recombination (CSR) of the IgH chain to the desired subclass. Similarly, we induce CSR in all human B cell lines tested with high efficiency to targeted IgH subclass. Finally, we engineer mouse hybridomas to secrete Fab' fragments instead of the whole Ig. Our results indicate that Ig genes in mouse and human cells can be edited to obtain any desired IgH switching helpful to study the biology of normal and lymphoma B cells. We also propose applications that could transform the technology of antibody production.
Functional cis-regulatory modules encoded by mouse-specific endogenous retrovirus

PubMed Central

Sundaram, Vasavi; Choudhary, Mayank N. K.; Pehrsson, Erica; Xing, Xiaoyun; Fiore, Christopher; Pandey, Manishi; Maricque, Brett; Udawatta, Methma; Ngo, Duc; Chen, Yujie; Paguntalan, Asia; Ray, Tammy; Hughes, Ava; Cohen, Barak A.; Wang, Ting

2017-01-01

Cis-regulatory modules contain multiple transcription factor (TF)-binding sites and integrate the effects of each TF to control gene expression in specific cellular contexts. Transposable elements (TEs) are uniquely equipped to deposit their regulatory sequences across a genome, which could also contain cis-regulatory modules that coordinate the control of multiple genes with the same regulatory logic. We provide the first evidence of mouse-specific TEs that encode a module of TF-binding sites in mouse embryonic stem cells (ESCs). The majority (77%) of the individual TEs tested exhibited enhancer activity in mouse ESCs. By mutating individual TF-binding sites within the TE, we identified a module of TF-binding motifs that cooperatively enhanced gene expression. Interestingly, we also observed the same motif module in the in silico constructed ancestral TE that also acted cooperatively to enhance gene expression. Our results suggest that ancestral TE insertions might have brought in cis-regulatory modules into the mouse genome. PMID:28348391
Equalizer reduces SNP bias in Affymetrix microarrays.

PubMed

Quigley, David

2015-07-30

Gene expression microarrays measure the levels of messenger ribonucleic acid (mRNA) in a sample using probe sequences that hybridize with transcribed regions. These probe sequences are designed using a reference genome for the relevant species. However, most model organisms and all humans have genomes that deviate from their reference. These variations, which include single nucleotide polymorphisms, insertions of additional nucleotides, and nucleotide deletions, can affect the microarray's performance. Genetic experiments comparing individuals bearing different population-associated single nucleotide polymorphisms that intersect microarray probes are therefore subject to systemic bias, as the reduction in binding efficiency due to a technical artifact is confounded with genetic differences between parental strains. This problem has been recognized for some time, and earlier methods of compensation have attempted to identify probes affected by genome variants using statistical models. These methods may require replicate microarray measurement of gene expression in the relevant tissue in inbred parental samples, which are not always available in model organisms and are never available in humans. By using sequence information for the genomes of organisms under investigation, potentially problematic probes can now be identified a priori. However, there is no published software tool that makes it easy to eliminate these probes from an annotation. I present equalizer, a software package that uses genome variant data to modify annotation files for the commonly used Affymetrix IVT and Gene/Exon platforms. These files can be used by any microarray normalization method for subsequent analysis. I demonstrate how use of equalizer on experiments mapping germline influence on gene expression in a genetic cross between two divergent mouse species and in human samples significantly reduces probe hybridization-induced bias, reducing false positive and false negative findings. 
The equalizer package reduces probe hybridization bias from experiments performed on the Affymetrix microarray platform, allowing accurate assessment of germline influence on gene expression.
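The idea of identifying potentially problematic probes a priori from variant data can be sketched as follows. This is only an illustration of the general approach, not the equalizer implementation or its file formats; the `Probe` type, the coordinate convention, and the function name are assumptions made for the example.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Probe(NamedTuple):
    probe_id: str
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive


def probes_without_variants(
    probes: Iterable[Probe],
    variants: Iterable[Tuple[str, int]],
) -> List[Probe]:
    """Keep only probes whose target interval contains no known variant position."""
    # Index variant positions by chromosome for quick lookup.
    by_chrom: Dict[str, Set[int]] = {}
    for chrom, pos in variants:
        by_chrom.setdefault(chrom, set()).add(pos)

    kept = []
    for probe in probes:
        positions = by_chrom.get(probe.chrom, set())
        if not any(probe.start <= pos <= probe.end for pos in positions):
            kept.append(probe)
    return kept


# Hypothetical data: one SNP falls inside probe "A", none inside "B" or "C".
probes = [
    Probe("A", "chr1", 100, 124),
    Probe("B", "chr1", 200, 224),
    Probe("C", "chr2", 100, 124),
]
variants = [("chr1", 110), ("chr2", 500)]
kept_ids = [p.probe_id for p in probes_without_variants(probes, variants)]
```

The linear scan over variant positions is chosen for readability; for genome-scale variant sets a sorted list with binary search or an interval index would likely be more appropriate.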
Genome-wide ENU mutagenesis for the discovery of novel male fertility regulators.

PubMed

Jamsai, Duangporn; O'Bryan, Moira K

2010-06-01

The completion of genome sequencing projects has provided an extensive knowledge of the contents of the genomes of human, mouse, and many other organisms. Despite this, the function of most of the estimated 25,000 human genes remains largely unknown. Attention has now turned to elucidating gene function and identifying biological pathways that contribute to human diseases, including male infertility. Our understanding of the genetic regulation of male fertility has been accelerated through the use of genetically modified mouse models including knockout, knock-in, gene-trapped, and transgenic mice. Such reverse genetic approaches, however, require some foreknowledge of a gene's function and, as such, bias against the discovery of completely novel genes and biological pathways. To facilitate high throughput gene discovery, genome-wide mouse mutagenesis via the use of a potent chemical mutagen, N-ethyl-N-nitrosourea (ENU), has been developed over the past decade. This forward genetic, or phenotype-driven, approach relies upon observing a phenotype first, then subsequently defining the underlying genetic defect. Mutations are randomly introduced into the mouse genome via ENU exposure. Through a controlled breeding scheme, mutations causing a phenotype of interest (e.g., male infertility) are then identified by linkage analysis and candidate gene sequencing. This approach allows for the possibility of revealing comprehensive phenotype-genotype relationships for a range of genes and pathways; i.e., in addition to null alleles, mice containing partial loss-of-function or gain-of-function mutations can be recovered. Such point mutations are likely to be more reflective of those that occur within the human population. Many research groups have successfully used this approach to generate infertile mouse lines and some novel male fertility genes have been revealed. In this review, we focus on the utility of ENU mutagenesis for the discovery of novel male fertility regulators.
Characterization and mapping of the mouse NDP (Norrie disease) locus (Ndp).

PubMed

Battinelli, E M; Boyd, Y; Craig, I W; Breakefield, X O; Chen, Z Y

1996-02-01

Norrie disease is a severe X-linked recessive neurological disorder characterized by congenital blindness with progressive loss of hearing. Over half of Norrie patients also manifest different degrees of mental retardation. The gene for Norrie disease (NDP) has recently been cloned and characterized. With the human NDP cDNA, mouse genomic phage libraries were screened for the homolog of the gene. Comparison between mouse and human genomic DNA blots hybridized with the NDP cDNA, as well as analysis of phage clones, shows that the mouse NDP gene is 29 kb in size (28 kb for the human gene). The organization in the two species is very similar. Both have three exons with similar-sized introns and identical exon-intron boundaries between exon 2 and 3. The mouse open reading frame is 393 bp and, like the human coding sequence, is encoded in exons 2 and 3. The absence of six nucleotides in the second mouse exon results in the encoded protein being two amino acids smaller than its human counterpart. The overall homology between the human and mouse NDP protein is 95% and is particularly high (99%) in exon 3, consistent with the apparent functional importance of this region. Analysis of transcription initiation sites suggests the presence of multiple start sites associated with expression of the mouse NDP gene. Pedigree analysis of an interspecific mouse backcross localizes the mouse NDP gene close to Maoa in the conserved segment, which runs from CYBB to PFC in both human and mouse.
Coordinates and intervals in graph-based reference genomes.

PubMed

Rand, Knut D; Grytten, Ivar; Nederbragt, Alexander J; Storvik, Geir O; Glad, Ingrid K; Sandve, Geir K

2017-05-18

It has been proposed that future reference genomes should be graph structures in order to better represent the sequence diversity present in a species. However, there is currently no standard method to represent genomic intervals, such as the positions of genes or transcription factor binding sites, on graph-based reference genomes. We formalize offset-based coordinate systems on graph-based reference genomes and introduce methods for representing intervals on these reference structures. We show the advantage of our methods by representing genes on a graph-based representation of the newest assembly of the human genome (GRCh38) and its alternative loci for regions that are highly variable. More complex reference genomes, containing alternative loci, require methods to represent genomic data on these structures. Our proposed notation for genomic intervals makes it possible to fully utilize the alternative loci of the GRCh38 assembly and potential future graph-based reference genomes. We have made a Python package for representing such intervals on offset-based coordinate systems, available at https://github.com/uio-cels/offsetbasedgraph . An interactive web-tool using this Python package to visualize genes on a graph created from GRCh38 is available at https://github.com/uio-cels/genomicgraphcoords .
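An offset-based coordinate on a graph reference, as described in the abstract above, can be sketched as a (block, offset) pair, with an interval recorded as a start position, an end position, and the path of blocks between them. The classes below are a minimal illustration under those assumptions; they are hypothetical and do not reproduce the API of the authors' Python package.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Position:
    block: str   # identifier of a sequence block (graph node); names are hypothetical
    offset: int  # 0-based offset within that block


@dataclass
class Interval:
    start: Position
    end: Position       # end offset is treated as exclusive (assumed convention)
    blocks: List[str]   # ordered path of blocks from start.block to end.block

    def length(self, block_lengths: Dict[str, int]) -> int:
        """Number of bases covered along the interval's path through the graph."""
        if len(self.blocks) == 1:
            return self.end.offset - self.start.offset
        first = block_lengths[self.blocks[0]] - self.start.offset
        middle = sum(block_lengths[b] for b in self.blocks[1:-1])
        return first + middle + self.end.offset


# Hypothetical graph: a main path with one alternative locus ("alt_1") bypassing "main_2".
block_lengths = {"main_1": 100, "alt_1": 40, "main_2": 30, "main_3": 50}

# A gene that starts on the main path, runs through the alternative locus, and ends on main_3.
gene = Interval(
    start=Position("main_1", 90),
    end=Position("main_3", 5),
    blocks=["main_1", "alt_1", "main_3"],
)
gene_length = gene.length(block_lengths)  # 10 + 40 + 5 = 55
```

Storing the path explicitly is what makes the interval unambiguous: the same start and end positions routed through "main_2" instead of "alt_1" would describe a different, 45-base interval.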
High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.

PubMed

Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Carbonell, Silvia; Pérez-Lluch, Sílvia; Abad, Amaya; Davis, Carrie; Gingeras, Thomas R; Frankish, Adam; Harrow, Jennifer; Guigo, Roderic; Johnson, Rory

2017-12-01

Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.
The genome sequence of ectromelia virus Naval and Cornell isolates from outbreaks in North America.

PubMed

Mavian, Carla; López-Bueno, Alberto; Bryant, Neil A; Seeger, Kathy; Quail, Michael A; Harris, David; Barrell, Bart; Alcami, Antonio

2014-08-01

Ectromelia virus (ECTV) is the causative agent of mousepox, a disease of laboratory mouse colonies and an excellent model for human smallpox. We report the genome sequence of two isolates from outbreaks in laboratory mouse colonies in the USA in 1995 and 1999: ECTV-Naval and ECTV-Cornell, respectively. The genome of ECTV-Naval and ECTV-Cornell was sequenced by the 454-Roche technology. The ECTV-Naval genome was also sequenced by the Sanger and Illumina technologies in order to evaluate these technologies for poxvirus genome sequencing. Genomic comparisons revealed that ECTV-Naval and ECTV-Cornell correspond to the same virus isolated from independent outbreaks. Both ECTV-Naval and ECTV-Cornell are extremely virulent in susceptible BALB/c mice, similar to ECTV-Moscow. This is consistent with the ECTV-Naval genome sharing 98.2% DNA sequence identity with that of ECTV-Moscow, and indicates that the genetic differences with ECTV-Moscow do not affect the virulence of ECTV-Naval in the mousepox model of footpad infection. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
The universality of nucleosome organization: from yeast to human

NASA Astrophysics Data System (ADS)

Chereji, Razvan

The basic units of DNA packaging are called nucleosomes. Their locations on the chromosomes play an essential role in gene regulation. We study nucleosome positioning in yeast, fly, mouse, and human, and build biophysical models in order to explain the genome-wide nucleosome organization. We show that DNA sequence alone is not able to generate the phased arrays of nucleosomes observed in vivo near the transcription start sites. We discuss simple models which can account for the formation of nucleosome depleted regions and nucleosome phasing at the gene promoters. We show that the same principles apply to different organisms. References: [1] RV Chereji, D Tolkunov, G Locke, AV Morozov - Phys. Rev. E 83, 050903 (2011) [2] RV Chereji, AV Morozov - J. Stat. Phys. 144, 379 (2011) [3] RV Chereji, AV Morozov - Proc. Natl. Acad. Sci. U.S.A. 111, 5236 (2014) [4] RV Chereji, T-W Kan, et al. - Nucleic Acids Res. (2015) doi: 10.1093/nar/gkv978 [5] RV Chereji, AV Morozov - Brief. Funct. Genomics 14, 50 (2015) [6] HA Cole, J Ocampo, JR Iben, RV Chereji, DJ Clark - Nucleic Acids Res. 42, 12512 (2014) [7] D Ganguli, RV Chereji, J Iben, HA Cole, DJ Clark - Genome Res. 24, 1637 (2014)
Transcript copy number estimation using a mouse whole-genome oligonucleotide microarray

PubMed Central

Carter, Mark G; Sharov, Alexei A; VanBuren, Vincent; Dudekula, Dawood B; Carmack, Condie E; Nelson, Charlie; Ko, Minoru SH

2005-01-01

The ability to quantitatively measure the expression of all genes in a given tissue or cell with a single assay is an exciting promise of gene-expression profiling technology. An in situ-synthesized 60-mer oligonucleotide microarray designed to detect transcripts from all mouse genes was validated, as well as a set of exogenous RNA controls derived from the yeast genome (made freely available without restriction), which allow quantitative estimation of absolute endogenous transcript abundance. PMID:15998450
Two Low Coverage Bird Genomes and a Comparison of Reference-Guided versus De Novo Genome Assemblies

PubMed Central

Card, Daren C.; Schield, Drew R.; Reyes-Velasco, Jacobo; Fujita, Matthew K.; Andrew, Audra L.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Tomback, Diana F.; Ruggiero, Robert P.; Castoe, Todd A.

2014-01-01

As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies. PMID:25192061
Locations of the ets subfamily members net, elk1, and sap1 (ELK3, ELK1, and ELK4) on three homologous regions of the mouse and human genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Giovane, A.; Sobieszczuk, P.; Mignon, C.

1995-10-10

Net, Elk1, and Sap1 are related members of the Ets oncoprotein family. We show by in situ hybridization on banded chromosomes with specific cDNA probes that their map positions on mouse and human chromosomes (respectively) are net, 10C-D1 and 12q22-q23 (now called ELK3), sap1, 1E3-G and 1q32 (ELK4), and elk1, XA1-A3 and Xp11.2-p11.1 (ELK1), as well as a second locus 14q32 (ELK2) unique to the human genome. The results for the mouse net, sap1, and elk1 and human ELK3 genes are new. The human elk1 mapping confirms a previous study. The human ELK4 localization agrees with data published during the preparation of the manuscript. Human ELK3 colocalizes with sap2, and we confirm that they are identical. These results firmly establish for the first time that Net, Elk1, and Sap1 are distinct gene products with different chromosomal localizations in both the mouse and the human genomes. Net, Elk1, and Sap1 are conserved and map to homologous regions of the mouse and human chromosomes. 19 refs., 1 fig., 1 tab.
Technical approaches for mouse models of human disease.

PubMed

Justice, Monica J; Siracusa, Linda D; Stewart, A Francis

2011-05-01

The mouse is the leading organism for disease research. A rich resource of genetic variation occurs naturally in inbred and special strains owing to spontaneous mutations. However, one can also obtain desired gene mutations by using the following processes: targeted mutations that eliminate function in the whole organism or in a specific tissue; forward genetic screens using chemicals or transposons; or the introduction of exogenous transgenes as DNAs, bacterial artificial chromosomes (BACs) or reporter constructs. The mouse is the only mammal that provides such a rich resource of genetic diversity coupled with the potential for extensive genome manipulation, and is therefore a powerful application for modeling human disease. This poster review outlines the major genome manipulations available in the mouse that are used to understand human disease: natural variation, reverse genetics, forward genetics, transgenics and transposons. Each of these applications will be essential for understanding the diversity that is being discovered within the human population.
Identification of genetic elements in metabolism by high-throughput mouse phenotyping.

PubMed

Rozman, Jan; Rathkolb, Birgit; Oestereicher, Manuela A; Schütt, Christine; Ravindranath, Aakash Chavan; Leuchtenberger, Stefanie; Sharma, Sapna; Kistler, Martin; Willershäuser, Monja; Brommage, Robert; Meehan, Terrence F; Mason, Jeremy; Haselimashhadi, Hamed; Hough, Tertius; Mallon, Ann-Marie; Wells, Sara; Santos, Luis; Lelliott, Christopher J; White, Jacqueline K; Sorg, Tania; Champy, Marie-France; Bower, Lynette R; Reynolds, Corey L; Flenniken, Ann M; Murray, Stephen A; Nutter, Lauryl M J; Svenson, Karen L; West, David; Tocchini-Valentini, Glauco P; Beaudet, Arthur L; Bosch, Fatima; Braun, Robert B; Dobbie, Michael S; Gao, Xiang; Herault, Yann; Moshiri, Ala; Moore, Bret A; Kent Lloyd, K C; McKerlie, Colin; Masuya, Hiroshi; Tanaka, Nobuhiko; Flicek, Paul; Parkinson, Helen E; Sedlacek, Radislav; Seong, Je Kyung; Wang, Chi-Kuang Leo; Moore, Mark; Brown, Steve D; Tschöp, Matthias H; Wurst, Wolfgang; Klingenspor, Martin; Wolf, Eckhard; Beckers, Johannes; Machicao, Fausto; Peter, Andreas; Staiger, Harald; Häring, Hans-Ulrich; Grallert, Harald; Campillos, Monica; Maier, Holger; Fuchs, Helmut; Gailus-Durner, Valerie; Werner, Thomas; Hrabe de Angelis, Martin

2018-01-18

Metabolic diseases are a worldwide problem but the underlying genetic factors and their relevance to metabolic disease remain incompletely understood. Genome-wide research is needed to characterize so-far unannotated mammalian metabolic genes. Here, we generate and analyze metabolic phenotypic data of 2016 knockout mouse strains under the aegis of the International Mouse Phenotyping Consortium (IMPC) and find 974 gene knockouts with strong metabolic phenotypes. 429 of those had no previous link to metabolism and 51 genes remain functionally completely unannotated. We compared human orthologues of these uncharacterized genes in five GWAS consortia and indeed 23 candidate genes are associated with metabolic disease. We further identify common regulatory elements in promoters of candidate genes. As each regulatory element is composed of several transcription factor binding sites, our data reveal an extensive metabolic phenotype-associated network of co-regulated genes. Our systematic mouse phenotype analysis thus paves the way for full functional annotation of the genome.
A novel class of small RNAs bind to MILI protein in mouse testes.

PubMed

Aravin, Alexei; Gaidatzis, Dimos; Pfeffer, Sébastien; Lagos-Quintana, Mariana; Landgraf, Pablo; Iovino, Nicola; Morris, Patricia; Brownstein, Michael J; Kuramochi-Miyagawa, Satomi; Nakano, Toru; Chien, Minchen; Russo, James J; Ju, Jingyue; Sheridan, Robert; Sander, Chris; Zavolan, Mihaela; Tuschl, Thomas

2006-07-13

Small RNAs bound to Argonaute proteins recognize partially or fully complementary nucleic acid targets in diverse gene-silencing processes. A subgroup of the Argonaute proteins--known as the 'Piwi family'--is required for germ- and stem-cell development in invertebrates, and two Piwi members--MILI and MIWI--are essential for spermatogenesis in mouse. Here we describe a new class of small RNAs that bind to MILI in mouse male germ cells, where they accumulate at the onset of meiosis. The sequences of the over 1,000 identified unique molecules share a strong preference for a 5' uridine, but otherwise cannot be readily classified into sequence families. Genomic mapping of these small RNAs reveals a limited number of clusters, suggesting that these RNAs are processed from long primary transcripts. The small RNAs are 26-31 nucleotides (nt) in length--clearly distinct from the 21-23 nt of microRNAs (miRNAs) or short interfering RNAs (siRNAs)--and we refer to them as 'Piwi-interacting RNAs' or piRNAs. Orthologous human chromosomal regions also give rise to small RNAs with the characteristics of piRNAs, but the cloned sequences are distinct. The identification of this new class of small RNAs provides an important starting point to determine the molecular function of Piwi proteins in mammalian spermatogenesis.
Mouse Models for Drug Discovery. Can New Tools and Technology Improve Translational Power?

PubMed

Zuberi, Aamir; Lutz, Cathleen

2016-12-01

The use of mouse models in biomedical research and preclinical drug evaluation is on the rise. The advent of new molecular genome-altering technologies such as CRISPR/Cas9 allows for genetic mutations to be introduced into the germ line of a mouse faster and less expensively than previous methods. In addition, the rapid progress in the development and use of somatic transgenesis using viral vectors, as well as manipulations of gene expression with siRNAs and antisense oligonucleotides, allow for even greater exploration into genomics and systems biology. These technological advances come at a time when cost reductions in genome sequencing have led to the identification of pathogenic mutations in patient populations, providing unprecedented opportunities in the use of mice to model human disease. The ease of genetic engineering in mice also offers a potential paradigm shift in resource sharing and the speed by which models are made available in the public domain. Predictively, the knowledge alone that a model can be quickly remade will provide relief to resources encumbered by licensing and Material Transfer Agreements. For decades, mouse strains have provided an exquisite experimental tool to study the pathophysiology of the disease and assess therapeutic options in a genetically defined system. However, a major limitation of the mouse has been the limited genetic diversity associated with common laboratory mice. This has been overcome with the recent development of the Collaborative Cross and Diversity Outbred mice. These strains provide new tools capable of replicating genetic diversity to that approaching the diversity found in human populations. The Collaborative Cross and Diversity Outbred strains thus provide a means to observe and characterize toxicity or efficacy of new therapeutic drugs for a given population. 
Traditional and contemporary mouse genome editing tools, together with the genetic diversity added by new modeling systems, are synergistic and serve to make the mouse a better model for biomedical research, enhancing the potential for preclinical drug discovery and personalized medicine. © The Author 2016. Published by Oxford University Press.
A CRISPR Path to Engineering New Genetic Mouse Models for Cardiovascular Research.

PubMed

Miano, Joseph M; Zhu, Qiuyu Martin; Lowenstein, Charles J

2016-06-01

Previous efforts to target the mouse genome for the addition, subtraction, or substitution of biologically informative sequences required complex vector design and a series of arduous steps only a handful of laboratories could master. The facile and inexpensive clustered regularly interspaced short palindromic repeats (CRISPR) method has now superseded traditional means of genome modification such that virtually any laboratory can quickly assemble reagents for developing new mouse models for cardiovascular research. Here, we briefly review the history of CRISPR in prokaryotes, highlighting major discoveries leading to its formulation for genome modification in the animal kingdom. Core components of CRISPR technology are reviewed and updated. Practical pointers for 2-component and 3-component CRISPR editing are summarized with many applications in mice including frameshift mutations, deletion of enhancers and noncoding genes, nucleotide substitution of protein-coding and gene regulatory sequences, incorporation of loxP sites for conditional gene inactivation, and epitope tag integration. Genotyping strategies are presented and topics of genetic mosaicism and inadvertent targeting discussed. Finally, clinical applications and ethical considerations are addressed as the biomedical community eagerly embraces this astonishing innovation in genome editing to tackle previously intractable questions. © 2016 American Heart Association, Inc.
A CRISPR Path to Engineering New Genetic Mouse Models for Cardiovascular Research

PubMed Central

Miano, Joseph M.; Zhu, Qiuyu Martin; Lowenstein, Charles J.

2016-01-01

Previous efforts to target the mouse genome for the addition, subtraction, or substitution of biologically informative sequences required complex vector design and a series of arduous steps only a handful of labs could master. The facile and inexpensive clustered regularly interspaced short palindromic repeats (CRISPR) method has now superseded traditional means of genome modification such that virtually any lab can quickly assemble reagents for developing new mouse models for cardiovascular research. Here we briefly review the history of CRISPR in prokaryotes, highlighting major discoveries leading to its formulation for genome modification in the animal kingdom. Core components of CRISPR technology are reviewed and updated. Practical pointers for two-component and three-component CRISPR editing are summarized with a number of applications in mice including frameshift mutations, deletion of enhancers and non-coding genes, nucleotide substitution of protein-coding and gene regulatory sequences, incorporation of loxP sites for conditional gene inactivation, and epitope tag integration. Genotyping strategies are presented and topics of genetic mosaicism and inadvertent targeting discussed. Finally, clinical applications and ethical considerations are addressed as the biomedical community eagerly embraces this astonishing innovation in genome editing to tackle previously intractable questions. PMID:27102963

CGDSNPdb: a database resource for error-checked and imputed mouse SNPs.

PubMed

Hutchins, Lucie N; Ding, Yueming; Szatkiewicz, Jin P; Von Smith, Randy; Yang, Hyuna; de Villena, Fernando Pardo-Manuel; Churchill, Gary A; Graber, Joel H

2010-07-06

The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the 'imputed genotype resource' in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600,000 SNPs and over 900,000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login. Database URL: http://cgd.jax.org/cgdsnpdb/
Principles of regulatory information conservation between mouse and human

DOE PAGES

Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; ...

2014-11-19

To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human–mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Lastly, single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
Maternal Supply of Cas9 to Zygotes Facilitates the Efficient Generation of Site-Specific Mutant Mouse Models

PubMed Central

Cebrian-Serrano, Alberto; Zha, Shijun; Hanssen, Lars; Biggs, Daniel; Preece, Christopher

2017-01-01

Genome manipulation in the mouse via microinjection of CRISPR/Cas9 site-specific nucleases has allowed the production time for genetically modified mouse models to be significantly reduced. Successful genome manipulation in the mouse has already been reported using Cas9 supplied by microinjection of a DNA construct, in vitro transcribed mRNA and recombinant protein. Recently the use of transgenic strains of mice overexpressing Cas9 has been shown to facilitate site-specific mutagenesis via maternal supply to zygotes and this route may provide an alternative to exogenous supply. We have investigated the feasibility of supplying Cas9 genetically in more detail and for this purpose we report the generation of transgenic mice which overexpress Cas9 ubiquitously, via a CAG-Cas9 transgene targeted to the Gt(ROSA26)Sor locus. We show that zygotes prepared from female mice harbouring this transgene are sufficiently loaded with maternally contributed Cas9 for efficient production of embryos and mice harbouring indel, genomic deletion and knock-in alleles by microinjection of guide RNAs and templates alone. We compare the mutagenesis rates and efficacy of mutagenesis using this genetic supply with exogenous Cas9 supply by either mRNA or protein microinjection. In general, we report increased generation rates of knock-in alleles and show that the levels of mutagenesis at certain genome target sites are significantly higher and more consistent when Cas9 is supplied genetically relative to exogenous supply. PMID:28081254
The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics.

PubMed

Gopalakrishnan, Shyam; Samaniego Castruita, Jose A; Sinding, Mikkel-Holger S; Kuderna, Lukas F K; Räikkönen, Jannikke; Petersen, Bent; Sicheritz-Ponten, Thomas; Larson, Greger; Orlando, Ludovic; Marques-Bonet, Tomas; Hansen, Anders J; Dalén, Love; Gilbert, M Thomas P

2017-06-29

An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data - that of a boxer dog (Canis lupus familiaris). We generated the first de novo wolf genome (Canis lupus lupus) as an additional choice of reference, and explored what implications may arise when previously published dog and wolf resequencing data are remapped to this reference. Reassuringly, we find that regardless of the reference genome choice, most evolutionary genomic analyses yield qualitatively similar results, including those exploring the structure between the wolves and dogs using admixture and principal component analysis. However, we do observe differences in the genomic coverage of re-mapped samples, the number of variants discovered, and heterozygosity estimates of the samples. In conclusion, the choice of reference is dictated by the aims of the study being undertaken; if the study focuses on the differences between the different dog breeds or the fine structure among dogs, then using the boxer reference genome is appropriate, but if the aim of the study is to look at the variation within wolves and their relationships to dogs, then there are clear benefits to using the de novo assembled wolf reference genome.
Genomic resources for wild populations of the house mouse, Mus musculus and its close relative Mus spretus

PubMed Central

Harr, Bettina; Karakoc, Emre; Neme, Rafik; Teschke, Meike; Pfeifle, Christine; Pezer, Željka; Babiker, Hiba; Linnenbrink, Miriam; Montero, Inka; Scavetta, Rick; Abai, Mohammad Reza; Molins, Marta Puente; Schlegel, Mathias; Ulrich, Rainer G.; Altmüller, Janine; Franitza, Marek; Büntge, Anna; Künzel, Sven; Tautz, Diethard

2016-01-01

Wild populations of the house mouse (Mus musculus) represent the raw genetic material for the classical inbred strains in biomedical research and are a major model system for evolutionary biology. We provide whole genome sequencing data of individuals representing natural populations of M. m. domesticus (24 individuals from 3 populations), M. m. helgolandicus (3 individuals), M. m. musculus (22 individuals from 3 populations) and M. spretus (8 individuals from one population). We use a single pipeline to map and call variants for these individuals and also include 10 additional individuals of M. m. castaneus for which genomic data are publicly available. In addition, RNAseq data were obtained from 10 tissues of up to eight adult individuals from each of the three M. m. domesticus populations for which genomic data were collected. Data and analyses are presented via tracks viewable in the UCSC or IGV genome browsers. We also provide information on available outbred stocks and instructions on how to keep them in the laboratory. PMID:27622383
Passenger mutations and aberrant gene expression in congenic tissue plasminogen activator-deficient mouse strains.

PubMed

Szabo, R; Samson, A L; Lawrence, D A; Medcalf, R L; Bugge, T H

2016-08-01

Essentials: C57BL/6J-tissue plasminogen activator (tPA)-deficient mice are widely used to study tPA function. Congenic C57BL/6J-tPA-deficient mice harbor large 129-derived chromosomal segments. The 129-derived chromosomal segments contain gene mutations that may confound data interpretation. Passenger mutation-free isogenic tPA-deficient mice were generated for study of tPA function. Background: The ability to generate defined null mutations in mice revolutionized the analysis of gene function in mammals. However, gene-deficient mice generated by using 129-derived embryonic stem cells may carry large segments of 129 DNA, even when extensively backcrossed to reference strains, such as C57BL/6J, and this may confound interpretation of experiments performed in these mice. Tissue plasminogen activator (tPA), encoded by the PLAT gene, is a fibrinolytic serine protease that is widely expressed in the brain. A number of neurological abnormalities have been reported in tPA-deficient mice. Objectives: To study genetic contamination of tPA-deficient mice. Materials and methods: Whole genome expression array analysis, RNAseq expression profiling, low- and high-density single nucleotide polymorphism (SNP) analysis, bioinformatics and genome editing were used to analyze gene expression in tPA-deficient mouse brains. Results and conclusions: Genes differentially expressed in the brain of Plat(-/-) mice from two independent colonies highly backcrossed onto the C57BL/6J strain clustered near Plat on chromosome 8. SNP analysis attributed this anomaly to about 20 Mbp of DNA flanking Plat being of 129 origin in both strains. Bioinformatic analysis of these 129-derived chromosomal segments identified a significant number of mutations in genes co-segregating with the targeted Plat allele, including several potential null mutations. 
Using zinc finger nuclease technology, we generated novel 'passenger mutation'-free isogenic C57BL/6J-Plat(-/-) and FVB/NJ-Plat(-/-) mouse strains by introducing an 11 bp deletion into the exon encoding the signal peptide. These novel mouse strains will be a useful community resource for further exploration of tPA function in physiological and pathological processes. © 2016 International Society on Thrombosis and Haemostasis.
Structure of novel rat major histocompatibility complex class II genes RT1.Ha and Hb

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arimura, Yutaka; Tang, Wei Ran; Koda, Toshiaki

1995-03-01

We have cloned the novel rat MHC class II genes, RT1.Ha and Hb, which are homologous to human HLA-DPA and DPB. RT1.Hb is a pseudogene, whereas RT1.Ha is apparently intact and may have transcriptional potential. In addition, with an RT1.Ha probe, we detected a single Southern hybridization band in the genome of the mouse. This finding may afford an opportunity to analyze the HLA-DPA homologue in the mouse genome. 18 refs., 4 figs., 1 tab.
Efficient mouse genome engineering by CRISPR-EZ technology.

PubMed

Modzelewski, Andrew J; Chen, Sean; Willis, Brandon J; Lloyd, K C Kent; Wood, Joshua A; He, Lin

2018-06-01

CRISPR/Cas9 technology has transformed mouse genome editing with unprecedented precision, efficiency, and ease; however, the current practice of microinjecting CRISPR reagents into pronuclear-stage embryos remains rate-limiting. We thus developed CRISPR ribonucleoprotein (RNP) electroporation of zygotes (CRISPR-EZ), an electroporation-based technology that outperforms pronuclear and cytoplasmic microinjection in efficiency, simplicity, cost, and throughput. In C57BL/6J and C57BL/6N mouse strains, CRISPR-EZ achieves 100% delivery of Cas9/single-guide RNA (sgRNA) RNPs, facilitating indel mutations (insertions or deletions), exon deletions, point mutations, and small insertions. In a side-by-side comparison in the high-throughput KnockOut Mouse Project (KOMP) pipeline, CRISPR-EZ consistently outperformed microinjection. Here, we provide an optimized protocol covering sgRNA synthesis, embryo collection, RNP electroporation, mouse generation, and genotyping strategies. Using CRISPR-EZ, a graduate-level researcher with basic embryo-manipulation skills can obtain genetically modified mice in 6 weeks. Altogether, CRISPR-EZ is a simple, economic, efficient, and high-throughput technology that is potentially applicable to other mammalian species.
Translating human genetics into mouse: the impact of ultra-rapid in vivo genome editing.

PubMed

Aida, Tomomi; Imahashi, Risa; Tanaka, Kohichi

2014-01-01

Gene-targeted mutant animals, such as knockout or knockin mice, have dramatically improved our understanding of the functions of genes in vivo and the genetic diversity that characterizes health and disease. However, the generation of targeted mice relies on gene targeting in embryonic stem (ES) cells, which is a time-consuming, laborious, and expensive process. The recent groundbreaking development of several genome editing technologies has enabled the targeted alteration of almost any sequence in any cell or organism. These technologies have now been applied to mouse zygotes (in vivo genome editing), thereby providing new avenues for simple, convenient, and ultra-rapid production of knockout or knockin mice without the need for ES cells. Here, we review recent achievements in the production of gene-targeted mice by in vivo genome editing. © 2013 The Authors Development, Growth & Differentiation © 2013 Japanese Society of Developmental Biologists.
Divergent genome evolution caused by regional variation in DNA gain and loss between human and mouse

PubMed Central

Kortschak, R. Daniel

2018-01-01

The forces driving the accumulation and removal of non-coding DNA and ultimately the evolution of genome size in complex organisms are intimately linked to genome structure and organisation. Our analysis provides a novel method for capturing the regional variation of lineage-specific DNA gain and loss events in their respective genomic contexts. To further understand this connection we used comparative genomics to identify genome-wide individual DNA gain and loss events in the human and mouse genomes. Focusing on the distribution of DNA gains and losses, relationships to important structural features and potential impact on biological processes, we found that in autosomes, DNA gains and losses both followed separate lineage-specific accumulation patterns. However, in both species chromosome X was particularly enriched for DNA gain, consistent with its high L1 retrotransposon content required for X inactivation. We found that DNA loss was associated with gene-rich open chromatin regions and DNA gain events with gene-poor closed chromatin regions. Additionally, we found that DNA loss events tended to be smaller than DNA gain events suggesting that they were able to accumulate in gene-rich open chromatin regions due to their reduced capacity to interrupt gene regulatory architecture. GO term enrichment showed that mouse loss hotspots were strongly enriched for terms related to developmental processes. However, these genes were also located in regions with a high density of conserved elements, suggesting that despite high levels of DNA loss, gene regulatory architecture remained conserved. This is consistent with a model in which DNA gain and loss results in turnover or “churning” in regulatory element dense regions of open chromatin, where interruption of regulatory elements is selected against. PMID:29677183
DNA methylation dynamics in mouse preimplantation embryos revealed by mass spectrometry.

PubMed

Okamoto, Yoshinori; Yoshida, Naoko; Suzuki, Toru; Shimozawa, Nobuhiro; Asami, Maki; Matsuda, Tomonari; Kojima, Nakao; Perry, Anthony C F; Takada, Tatsuyuki

2016-01-11

Following fertilization in mammals, paternal genomic 5-methyl-2'-deoxycytidine (5 mC) content is thought to decrease via oxidation to 5-hydroxymethyl-2'-deoxycytidine (5 hmC). This reciprocal model of demethylation and hydroxymethylation is inferred from indirect, non-quantitative methods. We here report direct quantification of genomic 5 mC and 5 hmC in mouse embryos by small scale liquid chromatographic tandem mass spectrometry (SMM). Profiles of absolute 5 mC levels in embryos produced by in vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI) were almost identical. By 10 h after fertilization, 5 mC levels had declined by ~40%, consistent with active genomic DNA demethylation. Levels of 5 mC in androgenotes (containing only a paternal genome) and parthenogenotes (containing only a maternal genome) underwent active 5 mC loss in the first 6 h, showing that both parental genomes can undergo demethylation independently. We found no evidence for net loss of 5 mC 10-48 h after fertilization, implying that any passive 'demethylation' following DNA replication was balanced by active 5 mC maintenance methylation. However, levels of 5 mC declined during development after 48 h, to 1% (measured as a fraction of G-residues) in blastocysts (~96 h). 5 hmC levels were consistently low (<0.2% of G-residues) throughout development in normal diploid embryos. This work directly quantifies the dynamics of global genomic DNA modification in mouse preimplantation embryos, suggesting that SMM will be applicable to other biomedical situations with limiting sample sizes.
Selective Amplification of the Genome Surrounding Key Placental Genes in Trophoblast Giant Cells.

PubMed

Hannibal, Roberta L; Baker, Julie C

2016-01-25

While most cells maintain a diploid state, polyploid cells exist in many organisms and are particularly prevalent within the mammalian placenta [1], where they can generate more than 900 copies of the genome [2]. Polyploidy is thought to be an efficient method of increasing the content of the genome by avoiding the costly and slow process of cytokinesis [1, 3, 4]. Polyploidy can also affect gene regulation by amplifying a subset of genomic regions required for specific cellular function [1, 3, 4]. This mechanism is found in the fruit fly Drosophila melanogaster, where polyploid ovarian follicle cells amplify genomic regions containing chorion genes, which facilitate secretion of eggshell proteins [5]. Here, we report that genomic amplification also occurs in mammals at selective regions of the genome in parietal trophoblast giant cells (p-TGCs) of the mouse placenta. Using whole-genome sequencing (WGS) and digital droplet PCR (ddPCR) of mouse p-TGCs, we identified five amplified regions, each containing a gene family known to be involved in mammalian placentation: the prolactins (two clusters), serpins, cathepsins, and the natural killer (NK)/C-type lectin (CLEC) complex [6-12]. We report here the first description of amplification at selective genomic regions in mammals and present evidence that this is an important mode of genome regulation in placental TGCs. Copyright © 2016 Elsevier Ltd. All rights reserved.
Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

PubMed Central

Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

2003-01-01

Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
Impact of the choice of reference genome on the ability of the core genome SNV methodology to distinguish strains of Salmonella enterica serovar Heidelberg.

PubMed

Usongo, Valentine; Berry, Chrystal; Yousfi, Khadidja; Doualla-Bell, Florence; Labbé, Genevieve; Johnson, Roger; Fournier, Eric; Nadon, Celine; Goodridge, Lawrence; Bekal, Sadjia

2018-01-01

Salmonella enterica serovar Heidelberg (S. Heidelberg) is one of the top serovars causing human salmonellosis. The core genome single nucleotide variant pipeline (cgSNV) is one of several whole genome based sequence typing methods used for the laboratory investigation of foodborne pathogens. SNV detection using this method requires a reference genome. The purpose of this study was to investigate the impact of the choice of the reference genome on the cgSNV-informed phylogenetic clustering and inferred isolate relationships. We found that using a draft or closed genome of S. Heidelberg as reference did not impact the ability of the cgSNV methodology to differentiate among 145 S. Heidelberg isolates involved in foodborne outbreaks. We also found that using a distantly related genome such as S. Dublin as choice of reference led to a loss in resolution since some sporadic isolates were found to cluster together with outbreak isolates. In addition, the genetic distances between outbreak isolates as well as between outbreak and sporadic isolates were overall reduced when S. Dublin was used as the reference genome as opposed to S. Heidelberg.
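The effect described above, where a distant reference shrinks genetic distances and merges isolates, can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the cgSNV pipeline itself: the isolate names, the base calls, and the use of "N" for positions that could not be called against the reference are all invented for illustration.

```python
from itertools import combinations


def core_positions(calls):
    """Indices with an unambiguous base call (A/C/G/T) in every isolate."""
    seqs = list(calls.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all isolates must cover the same reference positions")
    return [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]


def snv_distances(calls):
    """Pairwise count of differing bases, restricted to the shared core positions."""
    core = core_positions(calls)
    return {
        (a, b): sum(1 for i in core if calls[a][i] != calls[b][i])
        for a, b in combinations(sorted(calls), 2)
    }


# Hypothetical calls against a closely related reference: every position is callable.
close_ref = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "TCGTACGA"}

# Same isolates against a hypothetical distant reference: position 0 is
# uncallable in iso3 ("N"), so it drops out of the core for all isolates.
distant_ref = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "NCGTACGA"}

d_close = snv_distances(close_ref)
d_distant = snv_distances(distant_ref)
```

In this toy data, iso2 and iso3 differ at one core position with the close reference but at none with the distant one, which mirrors the loss of resolution reported in the abstract.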
CAR: contig assembly of prokaryotic draft genomes using rearrangements.

PubMed

Lu, Chin Lung; Chen, Kun-Tze; Huang, Shih-Yuan; Chiu, Hsien-Tai

2014-11-28

Next generation sequencing technology has allowed efficient production of draft genomes for many organisms of interest. However, most draft genomes are just collections of independent contigs, whose relative positions and orientations along the genome being sequenced are unknown. Although several tools have been developed to order and orient the contigs of draft genomes, more accurate tools are still needed. In this study, we present a novel reference-based contig assembly (or scaffolding) tool, named as CAR, that can efficiently and more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome of a related organism. Given a set of contigs in multi-FASTA format and a reference genome in FASTA format, CAR can output a list of scaffolds, each of which is a set of ordered and oriented contigs. For validation, we have tested CAR on a real dataset composed of several prokaryotic genomes and also compared its performance with several other reference-based contig assembly tools. Consequently, our experimental results have shown that CAR indeed performs better than all these other reference-based contig assembly tools in terms of sensitivity, precision and genome coverage. CAR serves as an efficient tool that can more accurately order and orient the contigs of a prokaryotic draft genome based on a reference genome. The web server of CAR is freely available at http://genome.cs.nthu.edu.tw/CAR/ and its stand-alone program can also be downloaded from the same website.
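The general idea of reference-based ordering and orientation of contigs can be sketched as follows. This is not CAR's algorithm, which the abstract says is based on rearrangements. It is a deliberately simple stand-in that sorts contigs by an assumed alignment coordinate and reverse-complements those placed on the minus strand. The `Placement` record, its fields, and the example contigs are hypothetical. A real tool would derive placements by aligning contigs to the reference.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string (A/C/G/T only; other symbols pass through)."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class Placement:
    """Hypothetical record: where a contig aligns on the reference genome."""
    name: str
    seq: str
    ref_start: int  # assumed leftmost aligned reference coordinate
    strand: str     # "+" or "-" relative to the reference


def scaffold(placements):
    """Order contigs by reference coordinate and orient them to the plus strand."""
    ordered = sorted(placements, key=lambda p: p.ref_start)
    return [
        (p.name, p.seq if p.strand == "+" else reverse_complement(p.seq))
        for p in ordered
    ]


# Invented example: two contigs given out of order, one on the minus strand.
example = [
    Placement("contig2", "TTGCA", ref_start=500, strand="-"),
    Placement("contig1", "ACGT", ref_start=10, strand="+"),
]
ordered_contigs = scaffold(example)
```

The result lists contig1 before contig2, with contig2 reverse-complemented so that both read in the reference orientation.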
Metatranscriptomic analysis of diverse microbial communities reveals core metabolic pathways and microbiome-specific functionality.

PubMed

Jiang, Yue; Xiong, Xuejian; Danska, Jayne; Parkinson, John

2016-01-12

Metatranscriptomics is emerging as a powerful technology for the functional characterization of complex microbial communities (microbiomes). Use of unbiased RNA-sequencing can reveal both the taxonomic composition and active biochemical functions of a complex microbial community. However, the lack of established reference genomes, computational tools and pipelines make analysis and interpretation of these datasets challenging. Systematic studies that compare data across microbiomes are needed to demonstrate the ability of such pipelines to deliver biologically meaningful insights on microbiome function. Here, we apply a standardized analytical pipeline to perform a comparative analysis of metatranscriptomic data from diverse microbial communities derived from mouse large intestine, cow rumen, kimchi culture, deep-sea thermal vent and permafrost. Sequence similarity searches allowed annotation of 19 to 76% of putative messenger RNA (mRNA) reads, with the highest frequency in the kimchi dataset due to its relatively low complexity and availability of closely related reference genomes. Metatranscriptomic datasets exhibited distinct taxonomic and functional signatures. From a metabolic perspective, we identified a common core of enzymes involved in amino acid, energy and nucleotide metabolism and also identified microbiome-specific pathways such as phosphonate metabolism (deep sea) and glycan degradation pathways (cow rumen). Integrating taxonomic and functional annotations within a novel visualization framework revealed the contribution of different taxa to metabolic pathways, allowing the identification of taxa that contribute unique functions. The application of a single, standard pipeline confirms that the rich taxonomic and functional diversity observed across microbiomes is not simply an artefact of different analysis pipelines but instead reflects distinct environmental influences. 
At the same time, our findings show how microbiome complexity and availability of reference genomes can impact comprehensive annotation of metatranscriptomes. Consequently, beyond the application of standardized pipelines, additional caution must be taken when interpreting their output and performing downstream, microbiome-specific, analyses. The pipeline used in these analyses along with a tutorial has been made freely available for download from our project website: http://www.compsysbio.org/microbiome .
Expression of hepatitis B virus 1.3-fold genome plasmid in an SV40 T-antigen-immortalized mouse hepatic cell line

PubMed Central

Song, Xiu-Guang; Bian, Peng-Fei; Yu, Shu-Li; Zhao, Xiu-Hua; Xu, Wei; Bu, Xue-Hui; Li, Xia; Ma, Li-Xian

2013-01-01

AIM: To investigate the expression of the hepatitis B virus (HBV) 1.3-fold genome plasmid (pHBV1.3) in an immortalized mouse hepatic cell line induced by SV40 T-antigen (SV40T) expression. METHODS: Mouse hepatic cells were isolated from mouse liver tissue fragments from 3-5 d old Kunming mice by the direct collagenase digestion method and cultured in vitro. The pRSV-T plasmid was transfected into mouse hepatic cells to establish an SV40LT-immortalized mouse hepatic cell line. The SV40LT-immortalized mouse hepatic cells were identified and transfected with the pHBV1.3 plasmid. The levels of hepatitis B surface antigen (HBsAg) and hepatitis B e antigen (HBeAg) in the supernatant were determined by an electrochemiluminescence immunoassay at 24, 48, 72 and 96 h after transfection. The expressions of HBsAg and hepatitis B c antigen (HBcAg) in the cells were investigated by indirect immunofluorescence analysis. The presence of HBV DNA replication intermediates in the transfected cells and viral particles in the supernatant of the transfected cell cultures was monitored using the Southern hybridization assay and transmission electron microscopy, respectively. RESULTS: The pRSV-T plasmid was used to immortalize mouse hepatocytes and an SV40LT-immortalized mouse hepatic cell line was successfully established. SV40LT-immortalized mouse hepatic cells have the same morphology and growth characteristics as primary mouse hepatic cells, can be subcultured, and produce albumin and cytokeratin-18 in vitro. Immortalized mouse hepatic cells did not show the characteristics of tumor cells, as alpha-fetoprotein levels were comparable (0.58 ± 0.37 vs 0.61 ± 0.31, P = 0.37). SV40LT-immortalized mouse hepatic cells were then transfected with the pHBV1.3 plasmid, and it was found that the HBV genome replicated in SV40LT-immortalized mouse hepatic cells. 
The levels of HBsAg and HBeAg continuously increased in the supernatant after the transfection of pHBV1.3, and began to decrease 72 h after transfection. The expressions of HBsAg and HBcAg were observed in the pHBV1.3-transfected cells. HBV DNA replication intermediates were also observed at 72 h after transfection, including relaxed circular DNA, double-stranded DNA and single-stranded DNA. Furthermore, a few 42 nm Dane particles, as well as many 22 nm subviral particles with a spherical or filamentous shape, were detected in the supernatant. CONCLUSION: SV40T expression can immortalize mouse hepatic cells, and the pHBV1.3-transfected SV40T-immortalized mouse hepatic cell line can be a new in vitro cell model. PMID:24307795
Expression of hepatitis B virus 1.3-fold genome plasmid in an SV40 T-antigen-immortalized mouse hepatic cell line.

PubMed

Song, Xiu-Guang; Bian, Peng-Fei; Yu, Shu-Li; Zhao, Xiu-Hua; Xu, Wei; Bu, Xue-Hui; Li, Xia; Ma, Li-Xian

2013-11-28

To investigate the expression of the hepatitis B virus (HBV) 1.3-fold genome plasmid (pHBV1.3) in an immortalized mouse hepatic cell line induced by SV40 T-antigen (SV40T) expression. Mouse hepatic cells were isolated from mouse liver tissue fragments from 3-5 d old Kunming mice by the direct collagenase digestion method and cultured in vitro. The pRSV-T plasmid was transfected into mouse hepatic cells to establish an SV40LT-immortalized mouse hepatic cell line. The SV40LT-immortalized mouse hepatic cells were identified and transfected with the pHBV1.3 plasmid. The levels of hepatitis B surface antigen (HBsAg) and hepatitis B e antigen (HBeAg) in the supernatant were determined by an electrochemiluminescence immunoassay at 24, 48, 72 and 96 h after transfection. The expression of HBsAg and hepatitis B c antigen (HBcAg) in the cells was investigated by indirect immunofluorescence analysis. The presence of HBV DNA replication intermediates in the transfected cells and of viral particles in the supernatant of the transfected cell cultures was monitored using the Southern hybridization assay and transmission electron microscopy, respectively. The pRSV-T plasmid was used to immortalize mouse hepatocytes and an SV40LT-immortalized mouse hepatic cell line was successfully established. SV40LT-immortalized mouse hepatic cells have the same morphology and growth characteristics as primary mouse hepatic cells, can be subcultured, and produce albumin and cytokeratin-18 in vitro. Immortalized mouse hepatic cells did not show the characteristics of tumor cells, as alpha-fetoprotein levels were comparable (0.58 ± 0.37 vs 0.61 ± 0.31, P = 0.37). SV40LT-immortalized mouse hepatic cells were then transfected with the pHBV1.3 plasmid, and it was found that the HBV genome replicated in SV40LT-immortalized mouse hepatic cells. The levels of HBsAg and HBeAg in the supernatant increased steadily after the transfection of pHBV1.3 and began to decrease 72 h after transfection.
Expression of HBsAg and HBcAg was observed in the pHBV1.3-transfected cells. HBV DNA replication intermediates were also observed at 72 h after transfection, including relaxed circular DNA, double-stranded DNA and single-stranded DNA. Furthermore, a few 42 nm Dane particles, as well as many 22 nm subviral particles with a spherical or filamentous shape, were detected in the supernatant. SV40T expression can immortalize mouse hepatic cells, and the pHBV1.3-transfected SV40T-immortalized mouse hepatic cell line can be a new in vitro cell model.
A mouse diversity panel approach reveals the potential for clinical kidney injury due to DB289 not predicted by classical rodent models.

PubMed

Harrill, Alison H; Desmet, Kristina D; Wolf, Kristina K; Bridges, Arlene S; Eaddy, J Scott; Kurtz, C Lisa; Hall, J Ed; Paine, Mary F; Tidwell, Richard R; Watkins, Paul B

2012-12-01

DB289 is the first oral drug shown in clinical trials to have efficacy in treating African trypanosomiasis (African sleeping sickness). Mild liver toxicity was noted but was not treatment limiting. However, development of DB289 was terminated when several treated subjects developed severe kidney injury, a liability not predicted from preclinical testing. We tested the hypothesis that the kidney safety liability of DB289 would be detected in a mouse diversity panel (MDP) comprised of 34 genetically diverse inbred mouse strains. MDP mice received 10 days of oral treatment with DB289 or vehicle and classical renal biomarkers blood urea nitrogen (BUN) and serum creatinine (sCr), as well as urine biomarkers of kidney injury were measured. While BUN and sCr remained within reference ranges, marked elevations were observed for kidney injury molecule-1 (KIM-1) in the urine of sensitive mouse strains. KIM-1 elevations were not always coincident with elevations in alanine aminotransferase (ALT), suggesting that renal injury was not linked to hepatic injury. Genome-wide association analyses of KIM-1 elevations indicated that genes participating in cholesterol and lipid biosynthesis and transport, oxidative stress, and cytokine release may play a role in DB289 renal injury. Taken together, the data resulting from this study highlight the utility of using an MDP to predict clinically relevant toxicities, to identify relevant toxicity biomarkers that may translate into the clinic, and to identify potential mechanisms underlying toxicities. In addition, the sensitive mouse strains identified in this study may be useful in screening next-in-class compounds for renal injury.
Comparative analysis of the 5′ genomic and promoter regions between the mouse (Hdh) and human Huntington disease (HD) gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kalchman, M.; Lin, B.; Nasir, J.

1994-09-01

The mouse homologue of the Huntington disease gene (Hdh) has recently been cloned and mapped to a region of synteny with the human, on mouse chromosome 5. The two genes share a high degree of both coding (90% amino acid) and nucleotide (86.2%) identity. We have subsequently performed a detailed comparison of the genomic organization of the 5′ region of the two genes, encompassing the promoter region and first five exons of both the human and mouse genes. The comparative sequence analysis of the promoter region between HD and Hdh reveals two highly conserved regions. One region (-56 to -118; +1 is the ATG start codon) shared 84% nucleotide identity and another region (-130 to -206) had 81% nucleotide identity. Nine putative Sp1 sites appear in the human promoter region, contrasted with only 3 in a similar region in the mouse. Furthermore, 17 and 20 base pair direct repeats present in the HD 5′ region are absent in the similar Hdh region. Although both the mouse and human intron/exon boundaries conform to the GT/AG rule, the intron sizes between HD and Hdh are markedly different. The first four introns in Hdh are 15, 7, 5 and 0.5 kb, compared with 10, 15, 7 and 0.5 kb, respectively, in HD. Comparison between the mouse and human intronic sequences immediately adjacent to the first five exons (excluding exon 1) reveals only about 46 to 50% identity within the first 60 bp of intronic sequence. Furthermore, we have identified novel polymorphic di-, tri- and tetra-nucleotide repeats in Hdh introns of various mouse strains that are not present in the human. For example, polymorphic CT repeats are present in introns 2 and 4 of Hdh and a novel mouse 56 AAG trinucleotide repeat (interrupted by an AAGG) is also located within intron 2. This information concerning the promoter and genomic organization of both HD and Hdh is critical for designing appropriate gene targeting vectors for studying the normal function of the HD and Hdh genes in model systems.

Extensive Mobilome-Driven Genome Diversification in Mouse Gut-Associated Bacteroides vulgatus mpk

PubMed Central

Lange, Anna; Beier, Sina; Steimle, Alex; Autenrieth, Ingo B.; Huson, Daniel H.; Frick, Julia-Stefanie

2016-01-01

Like many other Bacteroides species, Bacteroides vulgatus strain mpk, a mouse fecal isolate which was shown to promote intestinal homeostasis, utilizes a variety of mobile elements for genome evolution. Based on sequences collected by Pacific Biosciences SMRT sequencing technology, we discuss the challenges of assembling and studying a bacterial genome of high plasticity. Additionally, we conducted comparative genomics comparing this commensal strain with the B. vulgatus type strain ATCC 8482 as well as multiple other Bacteroides and Parabacteroides strains to reveal the most important differences and identify the unique features of B. vulgatus mpk. The genome of B. vulgatus mpk harbors a large and diverse set of mobile element proteins compared with other sequenced Bacteroides strains. We found evidence of a number of different horizontal gene transfer events and a genome landscape that has been extensively altered by different mobilization events. A CRISPR/Cas system could be identified that provides a possible mechanism for preventing the integration of invading external DNA. We propose that the high genome plasticity and the introduced genome instabilities of B. vulgatus mpk arising from the various mobilization events might play an important role not only in its adaptation to the challenging intestinal environment in general, but also in its ability to interact with the gut microbiota. PMID:27071651
Reference-guided assembly of four diverse Arabidopsis thaliana genomes

PubMed Central

Schneeberger, Korbinian; Ossowski, Stephan; Ott, Felix; Klein, Juliane D.; Wang, Xi; Lanz, Christa; Smith, Lisa M.; Cao, Jun; Fitz, Joffrey; Warthmann, Norman; Henz, Stefan R.; Huson, Daniel H.; Weigel, Detlef

2011-01-01

We present whole-genome assemblies of four divergent Arabidopsis thaliana strains that complement the 125-Mb reference genome sequence released a decade ago. Using a newly developed reference-guided approach, we assembled large contigs from 9 to 42 Gb of Illumina short-read data from the Landsberg erecta (Ler-1), C24, Bur-0, and Kro-0 strains, which have been sequenced as part of the 1,001 Genomes Project for this species. Using alignments against the reference sequence, we first reduced the complexity of the de novo assembly and later integrated reads without similarity to the reference sequence. As an example, half of the noncentromeric C24 genome was covered by scaffolds that are longer than 260 kb, with a maximum of 2.2 Mb. Moreover, over 96% of the reference genome was covered by the reference-guided assembly, compared with only 87% with a complete de novo assembly. Comparisons with 2 Mb of dideoxy sequence reveal that the per-base error rate of the reference-guided assemblies was below 1 in 10,000. Our assemblies provide a detailed, genomewide picture of large-scale differences between A. thaliana individuals, most of which are difficult to access with alignment-consensus methods only. We demonstrate their practical relevance in studying the expression differences of polymorphic genes and show how the analysis of sRNA sequencing data can lead to erroneous conclusions if aligned against the reference genome alone. Genome assemblies, raw reads, and further information are accessible through http://1001genomes.org/projects/assemblies.html. PMID:21646520
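The statement above that half of the noncentromeric C24 genome was covered by scaffolds longer than 260 kb is an N50-style contiguity measure. The following is a minimal Python sketch of how such a figure can be computed from a list of scaffold lengths. The function name `n50`, the optional `genome_size` parameter and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths, genome_size=None):
    """Return the length L such that scaffolds of length >= L together cover
    at least half of genome_size (if given) or half of the total assembled
    length (if genome_size is None)."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    # Half of the reference size, or half of the assembly itself.
    target = (genome_size if genome_size is not None else sum(lengths)) / 2
    covered = 0
    # Walk from the longest scaffold down until half the target is covered.
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return None  # the assembly covers less than half of genome_size


# Hypothetical toy scaffold lengths (arbitrary units), not data from the study.
toy_scaffolds = [10, 8, 5, 3, 2]
print(n50(toy_scaffolds))
print(n50(toy_scaffolds, genome_size=40))
```

Passing `genome_size` measures coverage against a fixed genome length, which is closer to the way the abstract phrases its figure; omitting it gives the conventional assembly N50.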
Sharing reference data and including cows in the reference population improve genomic predictions in Danish Jersey

USDA-ARS's Scientific Manuscript database

Small reference populations limit the accuracy of genomic prediction in numerically small breeds, such as the Danish Jersey. The objective of this study was to investigate two approaches to improve genomic prediction by increasing the size of the reference population for Danish Jerseys. The first ap...
GAAP: Genome-organization-framework-Assisted Assembly Pipeline for prokaryotic genomes.

PubMed

Yuan, Lina; Yu, Yang; Zhu, Yanmin; Li, Yulai; Li, Changqing; Li, Rujiao; Ma, Qin; Siu, Gilman Kit-Hang; Yu, Jun; Jiang, Taijiao; Xiao, Jingfa; Kang, Yu

2017-01-25

Next-generation sequencing (NGS) technologies have greatly promoted the genomic study of prokaryotes. However, highly fragmented assemblies due to short reads from NGS are still a limiting factor in gaining insights into the genome biology. Reference-assisted tools are promising in genome assembly, but tend to result in false assembly when the assigned reference has extensive rearrangements. Herein, we present GAAP, a genome assembly pipeline for scaffolding based on the core-gene-defined Genome Organizational Framework (cGOF) described in our previous study. Instead of assigning references, we use the multiple-reference-derived cGOFs as indexes to assist in ordering and orienting the scaffolds and to build a skeleton structure, and then use read pairs to extend scaffolds (called local scaffolding) and to distinguish between true and chimeric adjacencies in the scaffolds. In our performance tests using both empirical and simulated data of 15 genomes in six species with diverse genome size, complexity, and all three categories of cGOFs, GAAP outcompetes or achieves comparable results when compared to three other reference-assisted programs, AlignGraph, Ragout and MeDuSa. GAAP uses both cGOF and paired-end reads to create assemblies at genomic scale, and performs better than the currently available reference-assisted assembly tools as it recovers more assemblies and makes fewer false locations, especially for species with extensively rearranged genomes. Our method is a promising solution for reconstruction of genome sequence from short reads of NGS.
SPAR: small RNA-seq portal for analysis of sequencing experiments.

PubMed

Kuksa, Pavel P; Amlie-Wolf, Alexandre; Katanic, Živadin; Valladares, Otto; Wang, Li-San; Leung, Yuk Yee

2018-05-04

The introduction of new high-throughput small RNA sequencing protocols that generate large-scale genomics datasets along with increasing evidence of the significant regulatory roles of small non-coding RNAs (sncRNAs) have highlighted the urgent need for tools to analyze and interpret large amounts of small RNA sequencing data. However, it remains challenging to systematically and comprehensively discover and characterize sncRNA genes and specifically-processed sncRNA products from these datasets. To fill this gap, we present Small RNA-seq Portal for Analysis of sequencing expeRiments (SPAR), a user-friendly web server for interactive processing, analysis, annotation and visualization of small RNA sequencing data. SPAR supports sequencing data generated from various experimental protocols, including smRNA-seq, short total RNA sequencing, microRNA-seq, and single-cell small RNA-seq. Additionally, SPAR includes publicly available reference sncRNA datasets from our DASHR database and from ENCODE across 185 human tissues and cell types to produce highly informative small RNA annotations across all major small RNA types and other features such as co-localization with various genomic features, precursor transcript cleavage patterns, and conservation. SPAR allows the user to compare the input experiment against reference ENCODE/DASHR datasets. SPAR currently supports analyses of human (hg19, hg38) and mouse (mm10) sequencing data. SPAR is freely available at https://www.lisanwanglab.org/SPAR.
Functional genomic screening reveals asparagine dependence as a metabolic vulnerability in sarcoma

PubMed Central

Hettmer, Simone; Schinzel, Anna C; Tchessalova, Daria; Schneider, Michaela; Parker, Christina L; Bronson, Roderick T; Richards, Nigel GJ; Hahn, William C; Wagers, Amy J

2015-01-01

Current therapies for sarcomas are often inadequate. This study sought to identify actionable gene targets by selective targeting of the molecular networks that support sarcoma cell proliferation. Silencing of asparagine synthetase (ASNS), an amidotransferase that converts aspartate into asparagine, produced the strongest inhibitory effect on sarcoma growth in a functional genomic screen of mouse sarcomas generated by oncogenic Kras and disruption of Cdkn2a. ASNS silencing in mouse and human sarcoma cell lines reduced the percentage of S phase cells and impeded new polypeptide synthesis. These effects of ASNS silencing were reversed by exogenous supplementation with asparagine. Also, asparagine depletion via the ASNS inhibitor amino sulfoximine 5 (AS5) or asparaginase inhibited mouse and human sarcoma growth in vitro, and genetic silencing of ASNS in mouse sarcoma cells combined with depletion of plasma asparagine inhibited tumor growth in vivo. Asparagine reliance of sarcoma cells may represent a metabolic vulnerability with potential anti-sarcoma therapeutic value. DOI: http://dx.doi.org/10.7554/eLife.09436.001 PMID:26499495
Existence of host-related DNA sequences in the schistosome genome.

PubMed

Iwamura, Y; Irie, Y; Kominami, R; Nara, T; Yasuraoka, K

1991-06-01

DNA sequences homologous to the mouse intracisternal A particle and endogenous type C retrovirus were detected in the DNAs of Schistosoma japonicum adults and S. mansoni eggs. Furthermore, other kinds of repetitive sequences in the host genome such as mouse type 1 Alu sequence (B1), mouse type 2 Alu sequence (B2) and mo-2 sequence, a mouse mini-satellite, were also detected in the DNAs from adults and eggs of S. japonicum and eggs of S. mansoni. Almost all of the sequences described above were absent in the DNAs of S. mansoni adults. The DNA fingerprints of schistosomes, using the mo-2 sequence, were indistinguishable from each other and resembled those of their murine hosts. Moreover, the mo-2 sequence was hypermethylated in the DNAs of schistosomes and its amount varied among them. These facts indicate that host-related sequences are actually present in schistosomes and that the mo-2 repetitive sequence is probably extrachromosomal.
Augmenting Chinese hamster genome assembly by identifying regions of high confidence.

PubMed

Vishwanathan, Nandita; Bandyopadhyay, Arpan A; Fu, Hsu-Yuan; Sharma, Mohit; Johnson, Kathryn C; Mudge, Joann; Ramaraj, Thiruvarangan; Onsongo, Getiria; Silverstein, Kevin A T; Jacob, Nitya M; Le, Huong; Karypis, George; Hu, Wei-Shou

2016-09-01

Chinese hamster ovary (CHO) cell lines are the dominant industrial workhorses for therapeutic recombinant protein production. The availability of the genome sequence of Chinese hamster and CHO cells will spur further genome and RNA sequencing of producing cell lines. However, mammalian genomes assembled using shotgun sequencing data still contain regions of uncertain quality due to assembly errors. Identifying high confidence regions in the assembled genome will facilitate its use for cell engineering and genome engineering. We assembled two independent drafts of the Chinese hamster genome by de novo assembly from shotgun sequencing reads and by re-scaffolding and gap-filling the draft genome from NCBI for improved scaffold lengths and gap fractions. We then used the two independent assemblies to identify high confidence regions using two different approaches. First, the two independent assemblies were compared at the sequence level to identify their consensus regions as "high confidence regions", which account for at least 78% of the assembled genome. Further, a genome-wide comparison of the Chinese hamster scaffolds with mouse chromosomes revealed scaffolds with large blocks of collinearity, which were also compiled as high-quality scaffolds. Genome-scale collinearity was complemented with EST-based synteny, which also revealed conserved gene order compared to mouse. As cell line sequencing becomes more commonly practiced, the approaches reported here are useful for assessing the quality of assembly and potentially facilitate the engineering of cell lines. Copyright © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
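The first approach above, taking the regions on which two independent assemblies agree as "high confidence regions", can be pictured as an interval intersection. The sketch below is only an illustration under simplifying assumptions: each assembly's agreeing segments are assumed to be already expressed as sorted, non-overlapping half-open intervals on one shared coordinate system. The function names and the toy coordinates are hypothetical, and the authors' actual sequence-level comparison is not described in the abstract.

```python
def intersect_intervals(a, b):
    """Intersect two sorted lists of non-overlapping half-open intervals
    (start, end) and return the list of shared intervals."""
    shared = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:  # the two current intervals overlap
            shared.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def covered_fraction(intervals, genome_length):
    """Fraction of genome_length covered by the given intervals."""
    return sum(end - start for start, end in intervals) / genome_length


# Hypothetical toy coordinates, not data from the study.
assembly_a = [(0, 100), (150, 300)]
assembly_b = [(50, 200), (250, 400)]
consensus = intersect_intervals(assembly_a, assembly_b)
print(consensus)
print(covered_fraction(consensus, 400))
```

A two-pointer walk is used because both inputs are sorted, so each interval is visited once.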
Recruiting Human Microbiome Shotgun Data to Site-Specific Reference Genomes

PubMed Central

Xie, Gary; Lo, Chien-Chi; Scholz, Matthew; Chain, Patrick S. G.

2014-01-01

The human body consists of innumerable multifaceted environments that predispose colonization by a number of distinct microbial communities, which play fundamental roles in human health and disease. In addition to community surveys and shotgun metagenomes that seek to explore the composition and diversity of these microbiomes, there are significant efforts to sequence reference microbial genomes from many body sites of healthy adults. To illustrate the utility of reference genomes when studying more complex metagenomes, we present a reference-based analysis of sequence reads generated from 55 shotgun metagenomes, selected from 5 major body sites, including 16 sub-sites. Interestingly, between 13% and 92% (62.3% average) of these shotgun reads were aligned to a then-complete list of 2780 reference genomes, including 1583 references for the human microbiome. However, no reference genome was universally found in all body sites. For any given metagenome, the body site-specific reference genomes, derived from the same body site as the sample, accounted for an average of 58.8% of the mapped reads. While different body sites did differ in abundant genera, proximal or symmetrical body sites were found to be most similar to one another. The extent of variation observed, both between individuals sampled within the same microenvironment, or at the same site within the same individual over time, calls into question comparative studies across individuals even if sampled at the same body site. This study illustrates the high utility of reference genomes and the need for further site-specific reference microbial genome sequencing, even within the already well-sampled human microbiome. PMID:24454771
CSAR-web: a web server of contig scaffolding using algebraic rearrangements.

PubMed

Chen, Kun-Tze; Lu, Chin Lung

2018-05-04

CSAR-web is a web-based tool that allows the users to efficiently and accurately scaffold (i.e. order and orient) the contigs of a target draft genome based on a complete or incomplete reference genome from a related organism. It takes as input a target genome in multi-FASTA format and a reference genome in FASTA or multi-FASTA format, depending on whether the reference genome is complete or incomplete, respectively. In addition, it requires the users to choose either 'NUCmer on nucleotides' or 'PROmer on translated amino acids' for CSAR-web to identify conserved genomic markers (i.e. matched sequence regions) between the target and reference genomes, which are used by the rearrangement-based scaffolding algorithm in CSAR-web to order and orient the contigs of the target genome based on the reference genome. In the output page, CSAR-web displays its scaffolding result in a graphical mode (i.e. scalable dotplot) allowing the users to visually validate the correctness of scaffolded contigs and in a tabular mode allowing the users to view the details of scaffolds. CSAR-web is available online at http://genome.cs.nthu.edu.tw/CSAR-web.
Genomes of the Mouse Collaborative Cross.

PubMed

Srivastava, Anuj; Morgan, Andrew P; Najarian, Maya L; Sarsani, Vishal Kumar; Sigmon, J Sebastian; Shorter, John R; Kashfeen, Anwica; McMullan, Rachel C; Williams, Lucy H; Giusti-Rodríguez, Paola; Ferris, Martin T; Sullivan, Patrick; Hock, Pablo; Miller, Darla R; Bell, Timothy A; McMillan, Leonard; Churchill, Gary A; de Villena, Fernando Pardo-Manuel

2017-06-01

The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources. Copyright © 2017 Srivastava et al.
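To give a feel for the reported rate of 2.4 ± 0.4 new SNP mutations per gigabase per generation, the sketch below converts it into an expected count per strain. The rate and its uncertainty come from the abstract. The genome size of 2.7 Gb and the choice of 20 generations are assumptions made only for illustration, and the calculation ignores whether the rate refers to a haploid or a diploid genome.

```python
RATE_PER_GB = 2.4  # new SNPs per gigabase per generation (from the abstract)
RATE_ERROR = 0.4   # reported uncertainty on that rate (from the abstract)
GENOME_GB = 2.7    # ASSUMPTION: approximate mouse genome size in gigabases


def expected_new_snps(generations, rate=RATE_PER_GB, genome_gb=GENOME_GB):
    """Expected number of new SNPs accumulated over the given generations,
    assuming a constant rate and a fixed genome size."""
    return rate * genome_gb * generations


# Hypothetical example: 20 generations, with the reported error range.
for label, rate in [("low", RATE_PER_GB - RATE_ERROR),
                    ("central", RATE_PER_GB),
                    ("high", RATE_PER_GB + RATE_ERROR)]:
    print(label, round(expected_new_snps(20, rate=rate), 1))
```

Under these assumptions the central estimate is on the order of a few new SNPs per strain per generation.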
Genomes of the Mouse Collaborative Cross

PubMed Central

Srivastava, Anuj; Morgan, Andrew P.; Najarian, Maya L.; Sarsani, Vishal Kumar; Sigmon, J. Sebastian; Shorter, John R.; Kashfeen, Anwica; McMullan, Rachel C.; Williams, Lucy H.; Giusti-Rodríguez, Paola; Ferris, Martin T.; Sullivan, Patrick; Hock, Pablo; Miller, Darla R.; Bell, Timothy A.; McMillan, Leonard; Churchill, Gary A.; de Villena, Fernando Pardo-Manuel

2017-01-01

The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources. PMID:28592495
Cftr gene targeting in mouse embryonic stem cells mediated by Small Fragment Homologous Replacement (SFHR).

PubMed

Sangiuolo, Federica; Scaldaferri, Maria Lucia; Filareto, Antonio; Spitalieri, Paola; Guerra, Lorenzo; Favia, Maria; Caroppo, Rosa; Mango, Ruggiero; Bruscia, Emanuela; Gruenert, Dieter C; Casavola, Valeria; De Felici, Massimo; Novelli, Giuseppe

2008-01-01

Different gene targeting approaches have been developed to modify endogenous genomic DNA in both human and mouse cells. Briefly, the process involves the targeting of a specific mutation in situ leading to the gene correction and the restoration of a normal gene function. Most of these protocols with therapeutic potential are oligonucleotide based, and rely on endogenous enzymatic pathways. One gene targeting approach, "Small Fragment Homologous Replacement (SFHR)", has been found to be effective in modifying genomic DNA. This approach uses small DNA fragments (SDF) to target specific genomic loci and induce sequence and subsequent phenotypic alterations. This study shows that SFHR can stably introduce a 3-bp deletion (deltaF508, the most frequent cystic fibrosis (CF) mutation) into the Cftr (CF Transmembrane Conductance Regulator) locus in the mouse embryonic stem (ES) cell genome. After transfection of deltaF508-SDF into murine ES cells, SFHR-mediated modification was evaluated at the molecular levels on DNA and mRNA obtained from transfected ES cells. About 12% of transcript corresponding to deleted allele was detected, while 60% of the electroporated cells completely lost any measurable CFTR-dependent chloride efflux. The data indicate that the SFHR technique can be used to effectively target and modify genomic sequences in ES cells. Once the SFHR-modified ES cells differentiate into different cell lineages they can be useful for elucidating tissue-specific gene function and for the development of transplantation-based cellular and therapeutic protocols.
Inhibition of colorectal cancer genomic copy number alterations and chromosomal fragile site tumor suppressor FHIT and WWOX deletions by DNA mismatch repair

PubMed Central

Gelincik, Ozkan; Blecua, Pedro; Edelmann, Winfried; Kucherlapati, Raju; Zhou, Kathy; Jasin, Maria; Gümüş, Zeynep H.; Lipkin, Steven M.

2017-01-01

Homologous recombination (HR) enables precise DNA repair after DNA double strand breaks (DSBs) using identical sequence templates, whereas homeologous recombination (HeR) uses only partially homologous sequences. Homeologous recombination introduces mutations through gene conversion and genomic deletions through single-strand annealing (SSA). DNA mismatch repair (MMR) inhibits HeR, but the roles of mammalian MMR MutL homologues (MLH1, PMS2 and MLH3) proteins in HeR suppression are poorly characterized. Here, we demonstrate that mouse embryonic fibroblasts (MEFs) carrying Mlh1, Pms2, and Mlh3 mutations have higher HeR rates, by using 7,863 uniquely mapping paired direct repeat sequences (DRs) in the mouse genome as endogenous gene conversion and SSA reporters. Additionally, when DSBs are induced by gamma-radiation, Mlh1, Pms2 and Mlh3 mutant MEFs have higher DR copy number alterations (CNAs), including DR CNA hotspots previously identified in mouse MMR-deficient colorectal cancer (dMMR CRC). Analysis of The Cancer Genome Atlas CRC data revealed that dMMR CRCs have higher genome-wide DR HeR rates than MMR proficient CRCs, and that dMMR CRCs have deletion hotspots in tumor suppressors FHIT/WWOX at chromosomal fragile sites FRA3B and FRA16D (which have elevated DSB rates) flanked by paired homologous DRs and inverted repeats (IR). Overall, these data provide novel insights into the MMR-dependent HeR inhibition mechanism and its role in tumor suppression. PMID:29069730
Exome sequencing and arrayCGH detection of gene sequence and copy number variation between ILS and ISS mouse strains.

PubMed

Dumas, Laura; Dickens, C Michael; Anderson, Nathan; Davis, Jonathan; Bennett, Beth; Radcliffe, Richard A; Sikela, James M

2014-06-01

It has been well documented that genetic factors can influence predisposition to develop alcoholism. While the underlying genomic changes may be of several types, two of the most common and disease associated are copy number variations (CNVs) and sequence alterations of protein coding regions. The goal of this study was to identify CNVs and single-nucleotide polymorphisms that occur in gene coding regions that may play a role in influencing the risk of an individual developing alcoholism. Toward this end, two mouse strains were used that have been selectively bred based on their differential sensitivity to alcohol: the Inbred long sleep (ILS) and Inbred short sleep (ISS) mouse strains. Differences in initial response to alcohol have been linked to risk for alcoholism, and the ILS/ISS strains are used to investigate the genetics of initial sensitivity to alcohol. Array comparative genomic hybridization (arrayCGH) and exome sequencing were conducted to identify CNVs and gene coding sequence differences, respectively, between ILS and ISS mice. Mouse arrayCGH was performed using catalog Agilent 1 × 244 k mouse arrays. Subsequently, exome sequencing was carried out using an Illumina HiSeq 2000 instrument. ArrayCGH detected 74 CNVs that were strain-specific (38 ILS/36 ISS), including several ISS-specific deletions that contained genes implicated in brain function and neurotransmitter release. Among several interesting coding variations detected by exome sequencing was the gain of a premature stop codon in the alpha-amylase 2B (AMY2B) gene specifically in the ILS strain. In total, exome sequencing detected 2,597 and 1,768 strain-specific exonic gene variants in the ILS and ISS mice, respectively. This study represents the most comprehensive and detailed genomic comparison of ILS and ISS mouse strains to date. 
The two complementary genome-wide approaches identified strain-specific CNVs and gene coding sequence variations that should provide strong candidates to contribute to the alcohol-related phenotypic differences associated with these strains.
Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias

PubMed Central

2012-01-01

Background High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information - as much as 50% in a recent genome-wide association study in dogs. Results We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO) probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno) that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains. 
Conclusion The problems of ascertainment bias and missing information due to genotyping errors are widely recognized as limiting factors in genetic studies. We have conducted the first formal analysis of the effect of novel variants on genotyping arrays, and we have shown that these variants account for a large portion of miscalled and uncalled genotypes. Genetic studies will benefit from substantial improvements in the accuracy of their results by incorporating VINOs in their analyses. PMID:22260749
The role of retrotransposons in gene family expansions: insights from the mouse Abp gene family.

PubMed

Janoušek, Václav; Karn, Robert C; Laukaitis, Christina M

2013-05-29

Retrotransposons have been suggested to provide a substrate for non-allelic homologous recombination (NAHR) and thereby promote gene family expansion. Their precise role, however, is controversial. Here we ask whether retrotransposons contributed to the recent expansions of the Androgen-binding protein (Abp) gene families that occurred independently in the mouse and rat genomes. Using dot plot analysis, we found that the most recent duplication in the Abp region of the mouse genome is flanked by L1Md_T elements. Analysis of the sequence of these elements revealed breakpoints that are the relicts of the recombination that caused the duplication, confirming that the duplication arose as a result of NAHR using L1 elements as substrates. L1 and ERVII retrotransposons are considerably denser in the Abp regions than in one Mb flanking regions, while other repeat types are depleted in the Abp regions compared to flanking regions. L1 retrotransposons preferentially accumulated in the Abp gene regions after lineage separation and roughly followed the pattern of Abp gene expansion. By contrast, the proportion of shared vs. lineage-specific ERVII repeats in the Abp region resembles the rest of the genome. We confirmed the role of L1 repeats in Abp gene duplication with the identification of recombinant L1Md_T elements at the edges of the most recent mouse Abp gene duplication. High densities of L1 and ERVII repeats were found in the Abp gene region with abrupt transitions at the region boundaries, suggesting that their higher densities are tightly associated with Abp gene duplication. We observed that the major accumulation of L1 elements occurred after the split of the mouse and rat lineages and that there is a striking overlap between the timing of L1 accumulation and expansion of the Abp gene family in the mouse genome. 
Establishing a link between the accumulation of L1 elements and the expansion of the Abp gene family and identification of an NAHR-related breakpoint in the most recent duplication are the main contributions of our study.
The role of retrotransposons in gene family expansions: insights from the mouse Abp gene family

PubMed Central

2013-01-01

Background Retrotransposons have been suggested to provide a substrate for non-allelic homologous recombination (NAHR) and thereby promote gene family expansion. Their precise role, however, is controversial. Here we ask whether retrotransposons contributed to the recent expansions of the Androgen-binding protein (Abp) gene families that occurred independently in the mouse and rat genomes. Results Using dot plot analysis, we found that the most recent duplication in the Abp region of the mouse genome is flanked by L1Md_T elements. Analysis of the sequence of these elements revealed breakpoints that are the relicts of the recombination that caused the duplication, confirming that the duplication arose as a result of NAHR using L1 elements as substrates. L1 and ERVII retrotransposons are considerably denser in the Abp regions than in one Mb flanking regions, while other repeat types are depleted in the Abp regions compared to flanking regions. L1 retrotransposons preferentially accumulated in the Abp gene regions after lineage separation and roughly followed the pattern of Abp gene expansion. By contrast, the proportion of shared vs. lineage-specific ERVII repeats in the Abp region resembles the rest of the genome. Conclusions We confirmed the role of L1 repeats in Abp gene duplication with the identification of recombinant L1Md_T elements at the edges of the most recent mouse Abp gene duplication. High densities of L1 and ERVII repeats were found in the Abp gene region with abrupt transitions at the region boundaries, suggesting that their higher densities are tightly associated with Abp gene duplication. We observed that the major accumulation of L1 elements occurred after the split of the mouse and rat lineages and that there is a striking overlap between the timing of L1 accumulation and expansion of the Abp gene family in the mouse genome. 
Establishing a link between the accumulation of L1 elements and the expansion of the Abp gene family and identification of an NAHR-related breakpoint in the most recent duplication are the main contributions of our study. PMID:23718880
3-Hydroxy-3-methylglutaryl CoA lyase (HL): Mouse and human HL gene (HMGCL) cloning and detection of large gene deletions in two unrelated HL-deficient patients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, S.P.; Robert, M.F.; Mitchell, G.A.

1996-04-01

3-hydroxy-3-methylglutaryl CoA lyase (HL, EC 4.1.3.4) catalyzes the cleavage of 3-hydroxy-3-methylglutaryl CoA to acetoacetic acid and acetyl CoA, the final reaction of both ketogenesis and leucine catabolism. Autosomal-recessive HL deficiency in humans results in episodes of hypoketotic hypoglycemia and coma. Using a mouse HL cDNA as a probe, we isolated a clone containing the full-length mouse HL gene that spans about 15 kb of mouse chromosome 4 and contains nine exons. The promoter region of the mouse HL gene contains elements characteristic of a housekeeping gene: a CpG island containing multiple Sp1 binding sites surrounds exon 1, and neither a TATA nor a CAAT box is present. We identified multiple transcription start sites in the mouse HL gene, 35 to 9 bases upstream of the translation start codon. We also isolated two human HL genomic clones that include HL exons 2 to 9 within 18 kb. The mouse and human HL genes (HGMW-approved symbol HMGCL) are highly homologous, with identical locations of intron-exon junctions. By genomic Southern blot analysis and exonic PCR, we found 2 of 33 HL-deficient probands to be homozygous for large deletions in the HL gene. 26 refs., 4 figs., 2 tabs.
The Mammalian Cell Cycle Regulates Parvovirus Nuclear Capsid Assembly

PubMed Central

Riolobos, Laura; Domínguez, Carlos; Kann, Michael; Almendral, José M.

2015-01-01

It is unknown whether the mammalian cell cycle could impact the assembly of viruses maturing in the nucleus. We addressed this question using MVM, a reference member of the icosahedral ssDNA nuclear parvoviruses, which requires cell proliferation to infect by mechanisms partly understood. Constitutively expressed MVM capsid subunits (VPs) accumulated in the cytoplasm of mouse and human fibroblasts synchronized at G0, G1, and G1/S transition. Upon arrest release, VPs translocated to the nucleus as cells entered S phase, at efficiencies relying on cell origin and arrest method, and immediately assembled into capsids. In synchronously infected cells, the consecutive virus life cycle steps (gene expression, proteins nuclear translocation, capsid assembly, genome replication and encapsidation) proceeded tightly coupled to cell cycle progression from G0/G1 through S into G2 phase. However, a DNA synthesis stress caused by thymidine irreversibly disrupted virus life cycle, as VPs became increasingly retained in the cytoplasm hours post-stress, forming empty capsids in mouse fibroblasts, thereby impairing encapsidation of the nuclear viral DNA replicative intermediates. Synchronously infected cells subjected to density-arrest signals while traversing early S phase also blocked VPs transport, resulting in a similar misplaced cytoplasmic capsid assembly in mouse fibroblasts. In contrast, thymidine and density arrest signals deregulating virus assembly neither perturbed nuclear translocation of the NS1 protein nor viral genome replication occurring under S/G2 cycle arrest. An underlying mechanism of cell cycle control was identified in the nuclear translocation of phosphorylated VPs trimeric assembly intermediates, which accessed a non-conserved route distinct from the importin α2/β1 and transportin pathways. 
The exquisite cell cycle-dependence of parvovirus nuclear capsid assembly conforms a novel paradigm of time and functional coupling between cellular and virus life cycles. This junction may determine the characteristic parvovirus tropism for proliferative and cancer cells, and its disturbance could critically contribute to persistence in host tissues. PMID:26067441

The Douglas-fir genome sequence reveals specialization of the photosynthetic apparatus in Pinaceae

Treesearch

David B. Neale; Patrick E. McGuire; Nicholas C. Wheeler; Kristian A. Stevens; Marc W. Crepeau; Charis Cardeno; Aleksey V. Zimin; Daniela Puiu; Geo M. Pertea; U. Uzay Sezen; Claudio Casola; Tomasz E. Koralewski; Robin Paul; Daniel Gonzalez-Ibeas; Sumaira Zaman; Richard Cronn; Mark Yandell; Carson Holt; Charles H. Langley; James A. Yorke; Steven L. Salzberg; Jill L. Wegrzyn

2017-01-01

A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceeds that of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50...
Mouse scrapie responsive gene 1 (Scrg1): genomic organization, physical linkage to sap30, genetic mapping on chromosome 8, and expression in neuronal primary cell cultures.

PubMed

Dron, M; Tartare, X; Guillo, F; Haik, S; Barbin, G; Maury, C; Tovey, M; Dandoy-Dron, F

2000-11-15

We have previously reported a transcript of a novel mouse gene (Scrg1) with increased expression in transmissible spongiform encephalopathies and the cloning of the human mRNA analogue. In this paper, we present the genomic organization of the mouse and human SCRG1 loci, which exhibit a high degree of conservation. The genes are composed of three exons; the two downstream exons contain the protein coding region. The mouse gene is expressed in brain tissue essentially as a 0.7-kb message but also as a minor 2.6-kb mRNA. We have sequenced 20 kb of DNA at the mouse Scrg1 locus and found that the longer transcript is the prolongation of the 0.7-kb mRNA to a polyadenylation site located about 2 kb further downstream. Sequencing revealed that the mouse Scrg1 gene is physically linked to Sap30, a gene that encodes a protein of the histone deacetylase complex, and genetic linkage mapping assigned the localization of Scrg1 to chromosome 8 between Ant1 and Hmg2. Northern blot analysis showed that Scrg1 is under strict developmental control in mouse embryo and is expressed by cells of neuronal origin in vitro. Comparison of the rat, mouse, and human SCRG1 proteins identified a box of 35 identical contiguous amino acids and a characteristic cysteine distribution pattern defining a new protein signature. Copyright 2000 Academic Press.
Genome-wide expression profiling of five mouse models identifies similarities and differences with human psoriasis.

PubMed

Swindell, William R; Johnston, Andrew; Carbajal, Steve; Han, Gangwen; Wohn, Christian; Lu, Jun; Xing, Xianying; Nair, Rajan P; Voorhees, John J; Elder, James T; Wang, Xiao-Jing; Sano, Shigetoshi; Prens, Errol P; DiGiovanni, John; Pittelkow, Mark R; Ward, Nicole L; Gudjonsson, Johann E

2011-04-04

Development of a suitable mouse model would facilitate the investigation of pathomechanisms underlying human psoriasis and would also assist in development of therapeutic treatments. However, while many psoriasis mouse models have been proposed, no single model recapitulates all features of the human disease, and standardized validation criteria for psoriasis mouse models have not been widely applied. In this study, whole-genome transcriptional profiling is used to compare gene expression patterns manifested by human psoriatic skin lesions with those that occur in five psoriasis mouse models (K5-Tie2, imiquimod, K14-AREG, K5-Stat3C and K5-TGFbeta1). While the cutaneous gene expression profiles associated with each mouse phenotype exhibited statistically significant similarity to the expression profile of psoriasis in humans, each model displayed distinctive sets of similarities and differences in comparison to human psoriasis. For all five models, correspondence to the human disease was strong with respect to genes involved in epidermal development and keratinization. Immune and inflammation-associated gene expression, in contrast, was more variable between models as compared to the human disease. These findings support the value of all five models as research tools, each with identifiable areas of convergence to and divergence from the human disease. Additionally, the approach used in this paper provides an objective and quantitative method for evaluation of proposed mouse models of psoriasis, which can be strategically applied in future studies to score strengths of mouse phenotypes relative to specific aspects of human psoriasis.
In silico identification and comparative analysis of differentially expressed genes in human and mouse tissues

PubMed Central

Pao, Sheng-Ying; Lin, Win-Li; Hwang, Ming-Jing

2006-01-01

Background Screening for differentially expressed genes on the genomic scale and comparative analysis of the expression profiles of orthologous genes between species to study gene function and regulation are becoming increasingly feasible. Expressed sequence tags (ESTs) are an excellent source of data for such studies using bioinformatic approaches because of the rich libraries and tremendous amount of data now available in the public domain. However, any large-scale EST-based bioinformatics analysis must deal with the heterogeneous, and often ambiguous, tissue and organ terms used to describe EST libraries. Results To deal with the issue of tissue source, in this work, we carefully screened and organized more than 8 million human and mouse ESTs into 157 human and 108 mouse tissue/organ categories, to which we applied an established statistical test using different thresholds of the p value to identify genes differentially expressed in different tissues. Further analysis of the tissue distribution and level of expression of human and mouse orthologous genes showed that tissue-specific orthologs tended to have more similar expression patterns than those lacking significant tissue specificity. On the other hand, a number of orthologs were found to have significant disparity in their expression profiles, hinting at novel functions, divergent regulation, or new ortholog relationships. Conclusion Comprehensive statistics on the tissue-specific expression of human and mouse genes were obtained in this very large-scale, EST-based analysis. These statistical results have been organized into a database, freely accessible at our website, for easy searching of human and mouse tissue-specific genes and for investigating gene expression profiles in the context of comparative genomics. 
Comparative analysis showed that, although highly tissue-specific genes tend to exhibit similar expression profiles in human and mouse, there are significant exceptions, indicating that orthologous genes, while sharing basic genomic properties, could result in distinct phenotypes. PMID:16626500
Targeted disruption of the murine Facc gene: Towards the establishment of a mouse model for Fanconi anemia

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, M.; Auerbach, W.; Buchwald, M.

1994-09-01

Fanconi anemia (FA) is an autosomal recessive disease characterized by bone marrow failure, congenital malformations and predisposition to malignancies. The gene responsible for the defect in FA group C has been cloned and designated the Fanconi Anemia Complementation Group C gene (FACC). A murine cDNA for this gene (Facc) was also cloned. Here we report our progress in the establishment of a mouse model for FA. The mouse Facc cDNA was used as a probe to screen a genomic library of mouse strain 129. More than twenty positive clones were isolated. Three of them were mapped and found to be overlapping clones, encompassing the genomic region from exon 8 to the end of the 3′ UTR of the mouse cDNA. A targeting vector was constructed using the most 5′ mouse genomic sequence available. The end result of the homologous recombination is that exon 8 is deleted and the neo gene is inserted. The last exon, exon 14, is essential for the complementing function of the FACC gene product; the disruption in the middle of the murine Facc gene should render this locus biologically inactive. This targeting vector was linearized and electroporated into R1 embryonic stem (ES) cells which were derived from the 129 mouse. Of 102 clones screened, 19 positive cell lines were identified. Four targeted cell lines have been used to produce chimeric mice. 129-derived ES cells were aggregated ex vivo into the morulas derived from CD1 mice and then implanted into foster mothers. 22 chimeras have been obtained. Moderately and strongly chimeric mice have been bred to test for germline transmission. Progeny with the expected coat color derived from 2 chimeras are currently being examined to confirm transmission of the targeted allele.
GETPrime 2.0: gene- and transcript-specific qPCR primers for 13 species including polymorphisms

PubMed Central

David, Fabrice P.A.; Rougemont, Jacques; Deplancke, Bart

2017-01-01

GETPrime (http://bbcftools.epfl.ch/getprime) is a database with a web frontend providing gene- and transcript-specific, pre-computed qPCR primer pairs. The primers have been optimized for genome-wide specificity and for allowing the selective amplification of one or several splice variants of most known genes. To ease selection, primers have also been ranked according to defined criteria such as genome-wide specificity (with BLAST), amplicon size, and isoform coverage. Here, we report a major upgrade (2.0) of the database: eight new species (yeast, chicken, macaque, chimpanzee, rat, platypus, pufferfish, and Anolis carolinensis) now complement the five already included in the previous version (human, mouse, zebrafish, fly, and worm). Furthermore, the genomic reference has been updated to Ensembl v81 (while keeping earlier versions for backward compatibility) as a result of re-designing the back-end database and automating the import of relevant sections of the Ensembl database in species-independent fashion. This also allowed us to map known polymorphisms to the primers (on average three per primer for human), with the aim of reducing experimental error when targeting specific strains or individuals. Another consequence is that the inclusion of future Ensembl releases and other species has now become a relatively straightforward task. PMID:28053161
Generation of a mouse model for studying the role of upregulated RTEL1 activity in tumorigenesis.

PubMed

Wu, Xiaoli; Sandhu, Sumit; Nabi, Zinnatun; Ding, Hao

2012-10-01

Regulator of telomere length 1 (RTEL1) is a DNA helicase protein that has been demonstrated to be required for the maintenance of telomere length and genomic stability. It has also been found to be essential for DNA homologous recombination during DNA repair. The human RTEL1 genomic locus (20q13.3) is frequently amplified in multiple types of human cancers, including hepatocellular carcinoma and gastrointestinal tract tumors, indicating that upregulated RTEL1 activity could be important for tumorigenesis. In this study, we have developed a conditional transgenic mouse model that overexpresses mouse Rtel1 in a Cre-excision manner. By crossing with a ubiquitous Cre mouse line, we further demonstrated that these established Rtel1 conditional transgenic mice allow efficient, high-level expression of a functional Rtel1 that is able to rescue the embryonic defects of the Rtel1 null mouse allele. Furthermore, we demonstrated that more than 70% of transgenic mice that widely overexpress Rtel1 developed liver tumors that recapitulate many malignant features of human hepatocellular carcinoma (HCC). Our work not only generated a valuable mouse model for determining the role of RTEL1 in the development of cancers, but also provided the first genetic evidence to support that amplification of RTEL1, as observed in several types of human cancers, is tumorigenic.
Proteomic analysis of propiconazole responses in mouse liver: comparison of genomic and proteomic profiles

EPA Science Inventory

We have performed for the first time a comprehensive profiling of changes in protein expression of soluble proteins in livers from mice treated with the mouse liver tumorigen, propiconazole, to uncover the pathways and networks altered by this fungicide. Utilizing two-dimensional...
The Status, Quality, and Expansion of the NIH Full-Length cDNA Project: The Mammalian Gene Collection (MGC)

PubMed Central

2004-01-01

The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334
Quantum dot-fluorescence in situ hybridisation for Ectromelia virus detection based on biotin-streptavidin interactions.

PubMed

Wang, Ting; Zheng, Zhenhua; Zhang, Xian-En; Wang, Hanzhong

2016-09-01

Ectromelia virus (ECTV) is a pathogen that can lead to a lethal, acute toxic disease known as mousepox in mice. Prevention and control of ECTV infection requires the establishment of a rapid and sensitive diagnostic system for detecting the virus. In the present study, we developed a method of quantum-dot-fluorescence based in situ hybridisation for detecting ECTV genome DNA. Using biotin-dUTP to replace dTTP, biotin was incorporated into a DNA probe during polymerase chain reaction. High sensitivity and specificity of ECTV DNA detection were displayed by fluorescent quantum dots based on biotin-streptavidin interactions. ECTV DNA was then detected by streptavidin-conjugated quantum dots that bound the biotin-labelled probe. Results indicated that the established method can visualise ECTV genomic DNA in both infected cells and mouse tissues. To our knowledge, this is the first study reporting quantum-dot-fluorescence based in situ hybridisation for the detection of viral nucleic acids, providing a reference for the identification and detection of other viruses. Copyright © 2016. Published by Elsevier B.V.
Regulatory Features for Odorant Receptor Genes in the Mouse Genome.

PubMed

Degl'Innocenti, Andrea; D'Errico, Anna

2017-01-01

The odorant receptor genes, seven transmembrane receptor genes constituting the vastest mammalian gene multifamily, are expressed monogenically and monoallelically in each sensory neuron in the olfactory epithelium. This characteristic, often referred to as the one neuron-one receptor rule, is driven by mostly uncharacterized molecular dynamics, generally named odorant receptor gene choice. Much attention has been paid by the scientific community to the identification of sequences regulating the expression of odorant receptor genes within their loci, where related genes are usually arranged in genomic clusters. A number of studies identified transcription factor binding sites on odorant receptor promoter sequences. Similar binding sites were also found on a number of enhancers that regulate in cis their transcription, but have been proposed to form interchromosomal networks. Odorant receptor gene choice seems to occur via the local removal of strongly repressive epigenetic markings, put in place during the maturation of the sensory neuron on each odorant receptor locus. Here we review the fast-changing state of the art for the study of regulatory features for odorant receptor genes.
Overlapping DNA Methylation Dynamics in Mouse Intestinal Cell Differentiation and Early Stages of Malignant Progression

PubMed Central

Forn, Marta; Díez-Villanueva, Anna; Merlos-Suárez, Anna; Muñoz, Mar; Lois, Sergi; Carriò, Elvira; Jordà, Mireia; Bigas, Anna; Batlle, Eduard; Peinado, Miguel A.

2015-01-01

Mouse models of intestinal crypt cell differentiation and tumorigenesis have been used to characterize the molecular mechanisms underlying both processes. DNA methylation is a key epigenetic mark and plays an important role in cell identity and differentiation programs and cancer. To get insights into the dynamics of cell differentiation and malignant transformation we have compared the DNA methylation profiles along the mouse small intestine crypt and early stages of tumorigenesis. Genome-scale analysis of DNA methylation together with microarray gene expression have been applied to compare intestinal crypt stem cells (EphB2high), differentiated cells (EphB2negative), ApcMin/+ adenomas and the corresponding non-tumor adjacent tissue, together with small and large intestine samples and the colon cancer cell line CT26. Compared with late stages, small intestine crypt differentiation and early stages of tumorigenesis display few and relatively small changes in DNA methylation. Hypermethylated loci are largely shared by the two processes and affect the proximities of promoter and enhancer regions, with enrichment in genes associated with the intestinal stem cell signature and the PRC2 complex. The hypermethylation is progressive, with minute levels in differentiated cells, as compared with intestinal stem cells, and reaching full methylation in advanced stages. Hypomethylation shows different signatures in differentiation and cancer and is already present in the non-tumor tissue adjacent to the adenomas in ApcMin/+ mice, but at lower levels than advanced cancers. This study provides a reference framework to decipher the mechanisms driving mouse intestinal tumorigenesis and also the human counterpart. PMID:25933092
Establishment of a patient-derived orthotopic osteosarcoma mouse model.

PubMed

Blattmann, Claudia; Thiemann, Markus; Stenzinger, Albrecht; Roth, Eva K; Dittmar, Anne; Witt, Hendrik; Lehner, Burkhard; Renker, Eva; Jugold, Manfred; Eichwald, Viktoria; Weichert, Wilko; Huber, Peter E; Kulozik, Andreas E

2015-04-30

Osteosarcoma (OS) is the most common pediatric primary malignant bone tumor. As the prognosis for patients following standard treatment did not improve for almost three decades, functional preclinical models that closely reflect important clinical cancer characteristics are urgently needed to develop and evaluate new treatment strategies. The objective of this study was to establish an orthotopic xenotransplanted mouse model using patient-derived tumor tissue. Fresh tumor tissue from an adolescent female patient with osteosarcoma after relapse was surgically xenografted into the right tibia of 6 immunodeficient BALB/c Nu/Nu mice as well as cultured in medium. Tumor growth was serially assessed by palpation and with magnetic resonance imaging (MRI). In parallel, a primary cell line of the same tumor was established. Histology and high-resolution array-based comparative genomic hybridization (aCGH) were used to investigate both phenotypic and genotypic characteristics of different passages of human xenografts and the cell line compared to the tissue of origin. A primary OS cell line and a primary patient-derived orthotopic xenotransplanted mouse model were established. MRI analyses and histopathology demonstrated an identical architecture in the primary tumor and in the xenografts. Array-CGH analyses of the cell line and all xenografts showed highly comparable patterns of genomic progression. So far, three further primary patient-derived orthotopic xenotransplanted mouse models could be established. We report the first orthotopic OS mouse model generated by transplantation of tumor fragments directly harvested from the patient. This model represents the morphologic and genomic identity of the primary tumor and provides a preclinical platform to evaluate new treatment strategies in OS.
Livestock in biomedical research: history, current status and future prospective.

PubMed

Polejaeva, Irina A; Rutigliano, Heloisa M; Wells, Kevin D

2016-01-01

Livestock models have contributed significantly to biomedical and surgical advances. Their contribution is particularly prominent in the areas of physiology and assisted reproductive technologies, including understanding developmental processes and disorders, from ancient to modern times. Over the past 25 years, biomedical research that traditionally embraced a diverse species approach shifted to a small number of model species (e.g. mice and rats). The initial reasons for focusing the main efforts on the mouse were the availability of murine embryonic stem cells (ESCs) and genome sequence data. This powerful combination allowed for precise manipulation of the mouse genome (knockouts, knockins, transcriptional switches etc.) leading to ground-breaking discoveries on gene functions and regulation, and their role in health and disease. Despite the enormous contribution to biomedical research, mouse models have some major limitations. Their substantial differences compared with humans in body and organ size, lifespan and inbreeding result in pronounced metabolic, physiological and behavioural differences. Comparative studies of strategically chosen domestic species can complement mouse research and yield more rigorous findings. Because genome sequence and gene manipulation tools are now available for farm animals (cattle, pigs, sheep and goats), a larger number of livestock genetically engineered (GE) models will be accessible for biomedical research. This paper discusses the use of cattle, goats, sheep and pigs in biomedical research, provides an overview of transgenic technology in farm animals and highlights some of the beneficial characteristics of large animal models of human disease compared with the mouse. In addition, the status and origin of current regulation of GE biomedical models are reviewed.
CoGI: Towards Compressing Genomes as an Image.

PubMed

Xie, Xiaojing; Zhou, Shuigeng; Guan, Jihong

2015-01-01

Genomic science is now facing an explosive increase of data thanks to the fast development of sequencing technology. This situation poses serious challenges to genomic data storage and transfer. It is desirable to compress data to reduce storage and transfer cost, and thus to boost data distribution and utilization efficiency. Up to now, a number of algorithms and tools have been developed for compressing genomic sequences. Unlike the existing algorithms, most of which treat genomes as one-dimensional text strings and compress them based on dictionaries or probability models, this paper proposes a novel approach called CoGI (the abbreviation of Compressing Genomes as an Image) for genome compression, which transforms the genomic sequences to a two-dimensional binary image (or bitmap), then applies a rectangular partition coding algorithm to compress the binary image. CoGI can be used as either a reference-based compressor or a reference-free compressor. For the former, we develop two entropy-based algorithms to select a proper reference genome. Performance evaluation is conducted on various genomes. Experimental results show that the reference-based CoGI significantly outperforms two state-of-the-art reference-based genome compressors, GReEn and RLZ-opt, in both compression ratio and compression efficiency. It also achieves comparable compression ratio but two orders of magnitude higher compression efficiency in comparison with XM, a state-of-the-art reference-free genome compressor. Furthermore, our approach performs much better than Gzip, a general-purpose and widely used compressor, in both compression speed and compression ratio. So, CoGI can serve as an effective and practical genome compressor. The source code and other related documents of CoGI are available at: http://admis.fudan.edu.cn/projects/cogi.htm.
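The first step described above, turning a sequence into a two-dimensional binary image, can be illustrated with a minimal sketch. The abstract does not give CoGI's actual encoding, so the layout here (one bitmap per nucleotide, filled row by row at a fixed width) and the function names `sequence_to_bitmaps` and `bitmaps_to_sequence` are assumptions made for illustration; the rectangular partition coding step is not shown.

```python
from typing import Dict, List

BASES = "ACGT"


def sequence_to_bitmaps(seq: str, width: int) -> Dict[str, List[List[int]]]:
    """Map a DNA string to one binary image per nucleotide.

    Hypothetical layout, not CoGI's published encoding: position i of the
    sequence lands at row i // width, column i % width, and the pixel is set
    in the bitmap of the base found there.
    """
    rows = (len(seq) + width - 1) // width  # round up to whole rows
    bitmaps = {b: [[0] * width for _ in range(rows)] for b in BASES}
    for i, ch in enumerate(seq.upper()):
        if ch in bitmaps:  # anything other than A/C/G/T sets no pixel
            bitmaps[ch][i // width][i % width] = 1
    return bitmaps


def bitmaps_to_sequence(bitmaps: Dict[str, List[List[int]]],
                        length: int, width: int) -> str:
    """Invert sequence_to_bitmaps; positions with no pixel set become 'N'."""
    out = []
    for i in range(length):
        r, c = divmod(i, width)
        out.append(next((b for b in BASES if bitmaps[b][r][c]), "N"))
    return "".join(out)


if __name__ == "__main__":
    images = sequence_to_bitmaps("ACGTTGCA", width=4)
    for base in BASES:
        print(base, images[base])
    print(bitmaps_to_sequence(images, length=8, width=4))
```

Long runs of identical pixels in such bitmaps are what a rectangle-based coder could exploit, which is presumably why a two-dimensional layout is attractive; the choice of image width is left open here.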
Genomic Locus Modulating IOP in the BXD RI Mouse Strains

PubMed Central

King, Rebecca; Li, Ying; Wang, Jiaxing; Struebing, Felix L.; Geisert, Eldon E.

2018-01-01

Intraocular pressure (IOP) is the primary risk factor for developing glaucoma, yet little is known about the contribution of genomic background to IOP regulation. The present study leverages an array of systems genetics tools to study genomic factors modulating normal IOP in the mouse. The BXD recombinant inbred (RI) strain set was used to identify genomic loci modulating IOP. We measured the IOP in a total of 506 eyes from 38 different strains. Strain averages were subjected to conventional quantitative trait analysis by means of composite interval mapping. Candidate genes were defined, and immunohistochemistry and quantitative PCR (qPCR) were used for validation. Of the 38 BXD strains examined, the mean IOP ranged from a low of 13.2 mmHg to a high of 17.1 mmHg. The means for each strain were used to calculate a genome-wide interval map. One significant quantitative trait locus (QTL) was found on Chr. 8 (96 to 103 Mb). Within this 7 Mb region only 4 annotated genes were found: Gm15679, Cdh8, Cdh11 and Gm8730. Only two genes (Cdh8 and Cdh11) were candidates for modulating IOP based on the presence of non-synonymous SNPs. Further examination using SIFT (Sorting Intolerant From Tolerant) analysis revealed that the SNPs in Cdh8 (Cadherin 8) were predicted to not change protein function, while the SNPs in Cdh11 (Cadherin 11) would not be tolerated, affecting protein function. Furthermore, immunohistochemistry demonstrated that CDH11 is expressed in the trabecular meshwork of the mouse. We have examined the genomic regulation of IOP in the BXD RI strain set and found one significant QTL on Chr. 8. Within this QTL, there is one good candidate gene, Cdh11. PMID:29496776
Epigenomic Reprogramming of Adult Cardiomyocyte-Derived Cardiac Progenitor Cells

PubMed Central

Zhang, Yiqiang; Zhong, Jiang F; Qiu, Hongyu; Robb MacLellan, W.; Marbán, Eduardo; Wang, Charles

2015-01-01

It has been believed that mammalian adult cardiomyocytes (ACMs) are terminally-differentiated and are unable to proliferate. Recently, using a bi-transgenic ACM fate mapping mouse model and an in vitro culture system, we demonstrated that adult mouse cardiomyocytes were able to dedifferentiate into cardiac progenitor-like cells (CPCs). However, little is known about the molecular basis of their intrinsic cellular plasticity. Here we integrate single-cell transcriptome and whole-genome DNA methylation analyses to unravel the molecular mechanisms underlying the dedifferentiation and cell cycle reentry of mouse ACMs. Compared to parental cardiomyocytes, dedifferentiated mouse cardiomyocyte-derived CPCs (mCPCs) display epigenomic reprogramming with many differentially-methylated regions, both hypermethylated and hypomethylated, across the entire genome. Correlated well with the methylome, our transcriptomic data showed that the genes encoding cardiac structure and function proteins are remarkably down-regulated in mCPCs, while those for cell cycle, proliferation, and stemness are significantly up-regulated. In addition, implantation of mCPCs into infarcted mouse myocardium improves cardiac function with augmented left ventricular ejection fraction. Our study demonstrates that the cellular plasticity of mammalian cardiomyocytes is the result of a well-orchestrated epigenomic reprogramming and a subsequent global transcriptomic alteration. PMID:26657817
Effect of reference genome selection on the performance of computational methods for genome-wide protein-protein interaction prediction.

PubMed

Muley, Vijaykumar Yogesh; Ranjan, Akash

2012-01-01

Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in literature. However, very few studies provide insight into the effect of reference genome selection on detection of meaningful protein interactions. We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods. Higher performance for predicting protein-protein interactions was achievable even with 100-150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain a good performance, it is better to sample few genomes of related genera of prokaryotes from the large number of available genomes. 
Moreover, such a sampling allows for selecting 50-100 genomes for comparable accuracy of predictions when computational resources are limited.
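As a rough illustration of why reference genome selection matters for phylogenetic profiling, the sketch below scores protein pairs by the similarity of their presence/absence profiles and then recomputes the score on a subset of reference genomes. The Jaccard score, the toy profiles and the helper names (`jaccard`, `restrict`, `rank_pairs`) are assumptions chosen for brevity; the study's own scoring variants are not specified in the abstract.

```python
from typing import Dict, List, Sequence, Tuple

Profile = List[int]  # 1 if a homolog is present in a reference genome, else 0


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Shared presences divided by presences in either profile."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def restrict(profiles: Dict[str, Profile],
             genome_idx: Sequence[int]) -> Dict[str, Profile]:
    """Keep only the columns for the chosen reference genomes."""
    return {name: [p[i] for i in genome_idx] for name, p in profiles.items()}


def rank_pairs(profiles: Dict[str, Profile]) -> List[Tuple[str, str, float]]:
    """Score every protein pair and sort from most to least similar."""
    names = sorted(profiles)
    pairs = [(p, q, jaccard(profiles[p], profiles[q]))
             for i, p in enumerate(names) for q in names[i + 1:]]
    return sorted(pairs, key=lambda t: t[2], reverse=True)


if __name__ == "__main__":
    # Toy profiles over six hypothetical reference genomes.
    profiles = {
        "protA": [1, 1, 0, 1, 0, 1],
        "protB": [1, 1, 0, 1, 0, 0],
        "protC": [0, 0, 1, 0, 1, 0],
    }
    print(rank_pairs(profiles))                       # all six genomes
    print(rank_pairs(restrict(profiles, [0, 1, 3])))  # a three-genome subset
```

In this toy case the protA/protB score changes from 0.75 to 1.0 when only three genomes are kept, which is the kind of sensitivity to the reference set that the abstract examines at scale.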
Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution.

PubMed

Denas, Olgert; Sandstrom, Richard; Cheng, Yong; Beal, Kathryn; Herrero, Javier; Hardison, Ross C; Taylor, James

2015-02-14

Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood. We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence on the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity on the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new function. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events allowing little or no change in the TFos - target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA. 
A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed on the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.
Functionally Charged Polystyrene Particles Activate Immortalized Mouse Microglia (BV2): Cellular and Genomic Response

EPA Science Inventory

The effect of particle surface charge on the biological activation of immortalized mouse microglia (BV2) was examined. Same size (~850-950 nm) spherical polystyrene microparticles (SPM) with net negative (carboxyl, COOH-) or positive (dimethyl amino, CH₃)₂

Proteomic Analysis of Propiconazole Responses in Mouse Liver-Comparison of Genomic and Proteomic Profiles

EPA Science Inventory

We have performed for the first time a comprehensive profiling of changes in protein expression of soluble proteins in livers from mice treated with the mouse liver tumorigen, propiconazole, to uncover the pathways and networks altered by this commonly used fungicide. Utilizing t...
SNPchiMp: a database to disentangle the SNPchip jungle in bovine livestock.

PubMed

Nicolazzi, Ezequiel Luis; Picciolini, Matteo; Strozzi, Francesco; Schnabel, Robert David; Lawley, Cindy; Pirani, Ali; Brew, Fiona; Stella, Alessandra

2014-02-11

Currently, six commercial whole-genome SNP chips are available for cattle genotyping, produced by two different genotyping platforms. Technical issues need to be addressed to combine data that originates from the different platforms, or different versions of the same array generated by the manufacturer. For example: i) genome coordinates for SNPs may refer to different genome assemblies; ii) reference genome sequences are updated over time changing the positions, or even removing sequences which contain SNPs; iii) not all commercial SNP ID's are searchable within public databases; iv) SNPs can be coded using different formats and referencing different strands (e.g. A/B or A/C/T/G alleles, referencing forward/reverse, top/bottom or plus/minus strand); v) Due to new information being discovered, higher density chips do not necessarily include all the SNPs present in the lower density chips; and, vi) SNP IDs may not be consistent across chips and platforms. Most researchers and breed associations manage SNP data in real-time and thus require tools to standardise data in a user-friendly manner. Here we present SNPchiMp, a MySQL database linked to an open access web-based interface. Features of this interface include, but are not limited to, the following functions: 1) referencing the SNP mapping information to the latest genome assembly, 2) extraction of information contained in dbSNP for SNPs present in all commercially available bovine chips, and 3) identification of SNPs in common between two or more bovine chips (e.g. for SNP imputation from lower to higher density). In addition, SNPchiMp can retrieve this information on subsets of SNPs, accessing such data either via physical position on a supported assembly, or by a list of SNP IDs, rs or ss identifiers. This tool combines many different sources of information, that otherwise are time consuming to obtain and difficult to integrate. 
The SNPchiMp not only provides the information in a user-friendly format, but also enables researchers to perform a large number of operations with a few clicks of the mouse. This significantly reduces the time needed to execute the large number of operations required to manage SNP data.
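One of the listed functions, finding SNPs shared between two chips, together with the coordinate problem raised in point i), can be sketched in a few lines. This is not SNPchiMp's implementation or schema: the dictionaries, the made-up rs identifiers and the function name `common_snps` are hypothetical, and the sketch only separates shared SNPs whose coordinates agree from those whose coordinates conflict (for example because the chips refer to different assemblies).

```python
from typing import Dict, List, Tuple

# Hypothetical chip manifest: SNP identifier -> (chromosome, position)
Manifest = Dict[str, Tuple[str, int]]


def common_snps(chip_a: Manifest, chip_b: Manifest) -> Tuple[List[str], List[str]]:
    """Return (consistent, conflicting) identifiers present on both chips.

    'consistent' SNPs have identical coordinates on both chips; 'conflicting'
    SNPs share an identifier but disagree on chromosome or position.
    """
    shared = sorted(set(chip_a) & set(chip_b))
    consistent = [s for s in shared if chip_a[s] == chip_b[s]]
    conflicting = [s for s in shared if chip_a[s] != chip_b[s]]
    return consistent, conflicting


if __name__ == "__main__":
    # Made-up identifiers and coordinates, for illustration only.
    low_density = {"rs0001": ("1", 1000), "rs0002": ("1", 2500), "rs0003": ("2", 400)}
    high_density = {"rs0001": ("1", 1000), "rs0002": ("1", 2610), "rs0004": ("3", 90)}
    ok, clash = common_snps(low_density, high_density)
    print("shared, same coordinates:", ok)
    print("shared, coordinates differ:", clash)
```

Conflicting entries are the ones that would need remapping to a single assembly before the two data sets are merged or used for imputation.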
Reference-quality genome sequence of Aegilops tauschii, the source of wheat D genome, shows that recombination shapes genome structure and evolution

USDA-ARS's Scientific Manuscript database

Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat and an important genetic resource for wheat. A reference-quality sequence for the Ae. tauschii genome was produced with a combination of ordered-clone sequencing, whole-genome shotgun sequencing, and BioNano optical geno...
Extensive Mobilome-Driven Genome Diversification in Mouse Gut-Associated Bacteroides vulgatus mpk.

PubMed

Lange, Anna; Beier, Sina; Steimle, Alex; Autenrieth, Ingo B; Huson, Daniel H; Frick, Julia-Stefanie

2016-04-25

Like many other Bacteroides species, Bacteroides vulgatus strain mpk, a mouse fecal isolate which was shown to promote intestinal homeostasis, utilizes a variety of mobile elements for genome evolution. Based on sequences collected by Pacific Biosciences SMRT sequencing technology, we discuss the challenges of assembling and studying a bacterial genome of high plasticity. Additionally, we conducted comparative genomics comparing this commensal strain with the B. vulgatus type strain ATCC 8482 as well as multiple other Bacteroides and Parabacteroides strains to reveal the most important differences and identify the unique features of B. vulgatus mpk. The genome of B. vulgatus mpk harbors a large and diverse set of mobile element proteins compared with other sequenced Bacteroides strains. We found evidence of a number of different horizontal gene transfer events and a genome landscape that has been extensively altered by different mobilization events. A CRISPR/Cas system could be identified that provides a possible mechanism for preventing the integration of invading external DNA. We propose that the high genome plasticity and the introduced genome instabilities of B. vulgatus mpk arising from the various mobilization events might play an important role not only in its adaptation to the challenging intestinal environment in general, but also in its ability to interact with the gut microbiota. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
A genome-wide association study identifies multiple loci for variation in human ear morphology.

PubMed

Adhikari, Kaustubh; Reales, Guillermo; Smith, Andrew J P; Konka, Esra; Palmen, Jutta; Quinto-Sanchez, Mirsha; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Fuentes, Macarena; Pizarro, María; Barquera Lozano, Rodrigo; Macín Pérez, Gastón; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M; Bortolini, Maria-Cátira; Canizales-Quinteros, Samuel; Rothhammer, Francisco; Bedoya, Gabriel; Calderón, Rosario; Rosique, Javier; Cheeseman, Michael; Bhutta, Mahmood F; Humphries, Steve E; Gonzalez-José, Rolando; Headon, Denis; Balding, David; Ruiz-Linares, Andrés

2015-06-24

Here we report a genome-wide association study for non-pathological pinna morphology in over 5,000 Latin Americans. We find genome-wide significant association at seven genomic regions affecting: lobe size and attachment, folding of antihelix, helix rolling, ear protrusion and antitragus size (linear regression P values 2 × 10⁻⁸ to 3 × 10⁻¹⁴). Four traits are associated with a functional variant in the Ectodysplasin A receptor (EDAR) gene, a key regulator of embryonic skin appendage development. We confirm expression of Edar in the developing mouse ear and that Edar-deficient mice have an abnormally shaped pinna. Two traits are associated with SNPs in a region overlapping the T-Box Protein 15 (TBX15) gene, a major determinant of mouse skeletal development. Strongest association in this region is observed for SNP rs17023457 located in an evolutionarily conserved binding site for the transcription factor Cartilage paired-class homeoprotein 1 (CART1), and we confirm that rs17023457 alters in vitro binding of CART1.
Promoter-enhancer interactions identified from Hi-C data using probabilistic models and hierarchical topological domains.

PubMed

Ron, Gil; Globerson, Yuval; Moran, Dror; Kaplan, Tommy

2017-12-21

Proximity-ligation methods such as Hi-C allow us to map physical DNA-DNA interactions along the genome, and reveal its organization into topologically associating domains (TADs). As the Hi-C data accumulate, computational methods were developed for identifying domain borders in multiple cell types and organisms. Here, we present PSYCHIC, a computational approach for analyzing Hi-C data and identifying promoter-enhancer interactions. We use a unified probabilistic model to segment the genome into domains, which we then merge hierarchically and fit using a local background model, allowing us to identify over-represented DNA-DNA interactions across the genome. By analyzing the published Hi-C data sets in human and mouse, we identify hundreds of thousands of putative enhancers and their target genes, and compile an extensive genome-wide catalog of gene regulation in human and mouse. As we show, our predictions are highly enriched for ChIP-seq and DNA accessibility data, evolutionary conservation, eQTLs and other DNA-DNA interaction data.
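The idea of flagging over-represented contacts against a background can be sketched with a much simpler stand-in than the model described above. The code below is not PSYCHIC: it replaces the domain-aware probabilistic background with a plain distance-decay expectation (the mean contact count at each bin separation) and reports bin pairs whose observed/expected ratio exceeds a threshold. The function names and the `min_ratio` parameter are assumptions.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]  # symmetric Hi-C contact counts between genomic bins


def distance_expected(matrix: Matrix) -> Dict[int, float]:
    """Mean contact count for each bin separation d (the d-th diagonal)."""
    n = len(matrix)
    expected = {}
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        expected[d] = sum(diagonal) / len(diagonal)
    return expected


def enriched_pairs(matrix: Matrix,
                   min_ratio: float = 2.0) -> List[Tuple[int, int, float]]:
    """Bin pairs (i < j) whose observed/expected ratio is at least min_ratio."""
    expected = distance_expected(matrix)
    hits = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            e = expected[j - i]
            if e > 0 and matrix[i][j] / e >= min_ratio:
                hits.append((i, j, matrix[i][j] / e))
    return hits


if __name__ == "__main__":
    # Toy 5-bin map with one unusually strong long-range contact (bins 1 and 4).
    contacts = [
        [10, 4, 2, 1, 1],
        [4, 10, 4, 2, 9],
        [2, 4, 10, 4, 2],
        [1, 2, 4, 10, 4],
        [1, 9, 2, 4, 10],
    ]
    print(enriched_pairs(contacts, min_ratio=1.5))
```

A genome-wide distance-decay background ignores domain structure, so contacts inside a dense domain tend to look enriched; fitting the background locally within hierarchically merged domains, as the abstract describes, is meant to address that.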
Adaptive Evolution as a Predictor of Species-Specific Innate Immune Response.

PubMed

Webb, Andrew E; Gerek, Z Nevin; Morgan, Claire C; Walsh, Thomas A; Loscher, Christine E; Edwards, Scott V; O'Connell, Mary J

2015-07-01

It has been proposed that positive selection may be associated with protein functional change. For example, human and macaque have different outcomes to HIV infection and it has been shown that residues under positive selection in the macaque TRIM5α receptor locate to the region known to influence species-specific response to HIV. In general, however, the relationship between sequence and function has proven difficult to fully elucidate, and it is the role of large-scale studies to help bridge this gap in our understanding by revealing major patterns in the data that correlate genotype with function or phenotype. In this study, we investigate the level of species-specific positive selection in innate immune genes from human and mouse. In total, we analyzed 456 innate immune genes using codon-based models of evolution, comparing human, mouse, and 19 other vertebrate species to identify putative species-specific positive selection. Then we used population genomic data from the recently completed Neanderthal genome project, the 1000 human genomes project, and the 17 laboratory mouse genomes project to determine whether the residues that were putatively positively selected are fixed or variable in these populations. We find evidence of species-specific positive selection on both the human and the mouse branches and we show that the classes of genes under positive selection cluster by function and by interaction. Data from this study provide us with targets to test the relationship between positive selection and protein function and ultimately to test the relationship between positive selection and discordant phenotypes. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Predicting Survival within the Lung Cancer Histopathological Hierarchy Using a Multi-Scale Genomic Model of Development

PubMed Central

Liu, Hongye; Kho, Alvin T; Kohane, Isaac S; Sun, Yao

2006-01-01

Background The histopathologic heterogeneity of lung cancer remains a significant confounding factor in its diagnosis and prognosis—spurring numerous recent efforts to find a molecular classification of the disease that has clinical relevance. Methods and Findings Molecular profiles of tumors from 186 patients representing four different lung cancer subtypes (and 17 normal lung tissue samples) were compared with a mouse lung development model using principal component analysis in both temporal and genomic domains. An algorithm for the classification of lung cancers using a multi-scale developmental framework was developed. Kaplan–Meier survival analysis was conducted for lung adenocarcinoma patient subgroups identified via their developmental association. We found multi-scale genomic similarities between four human lung cancer subtypes and the developing mouse lung that are prognostically meaningful. Significant association was observed between the localization of human lung cancer cases along the principal mouse lung development trajectory and the corresponding patient survival rate at three distinct levels of classical histopathologic resolution: among different lung cancer subtypes, among patients within the adenocarcinoma subtype, and within the stage I adenocarcinoma subclass. The earlier the genomic association between a human tumor profile and the mouse lung development sequence, the poorer the patient's prognosis. Furthermore, decomposing this principal lung development trajectory identified a gene set that was significantly enriched for pyrimidine metabolism and cell-adhesion functions specific to lung development and oncogenesis. Conclusions From a multi-scale disease modeling perspective, the molecular dynamics of murine lung development provide an effective framework that is not only data driven but also informed by the biology of development for elucidating the mechanisms of human lung cancer biology and its clinical outcome. PMID:16800721
Strain screen and haplotype association mapping of wheel running in inbred mouse strains.

PubMed

Lightfoot, J Timothy; Leamy, Larry; Pomp, Daniel; Turner, Michael J; Fodor, Anthony A; Knab, Amy; Bowen, Robert S; Ferguson, David; Moore-Harrison, Trudy; Hamilton, Alicia

2010-09-01

Previous genetic association studies of physical activity, in both animal and human models, have been limited in number of subjects and genetically homozygous strains used as well as number of genomic markers available for analysis. Expansion of the available mouse physical activity strain screens and the recently published dense single-nucleotide polymorphism (SNP) map of the mouse genome (approximately 8.3 million SNPs) and associated statistical methods allowed us to construct a more generalizable map of the quantitative trait loci (QTL) associated with physical activity. Specifically, we measured wheel running activity in male and female mice (average age 9 wk) in 41 inbred strains and used activity data from 38 of these strains in a haplotype association mapping analysis to determine QTL associated with activity. As seen previously, there was a large range of activity patterns among the strains, with the highest and lowest strains differing significantly in daily distance run (27.4-fold), duration of activity (23.6-fold), and speed (2.9-fold). On a daily basis, female mice ran further (24%), longer (13%), and faster (11%). Twelve QTL were identified, with three (on Chr. 12, 18, and 19) in both male and female mice, five specific to males, and four specific to females. Eight of the 12 QTL, including the 3 general QTL found for both sexes, fell into intergenic areas. The results of this study further support the findings of a moderate to high heritability of physical activity and add general genomic areas applicable to a large number of mouse strains that can be further mined for candidate genes associated with regulation of physical activity. Additionally, results suggest that potential genetic mechanisms arising from traditional noncoding regions of the genome may be involved in regulation of physical activity.
Sequence Diversity, Intersubgroup Relationships, and Origins of the Mouse Leukemia Gammaretroviruses of Laboratory and Wild Mice.

PubMed

Bamunusinghe, Devinka; Naghashfar, Zohreh; Buckler-White, Alicia; Plishka, Ronald; Baliji, Surendranath; Liu, Qingping; Kassner, Joshua; Oler, Andrew J; Hartley, Janet; Kozak, Christine A

2016-04-01

Mouse leukemia viruses (MLVs) are found in the common inbred strains of laboratory mice and in the house mouse subspecies of Mus musculus. Receptor usage and envelope (env) sequence variation define three MLV host range subgroups in laboratory mice: ecotropic, polytropic, and xenotropic MLVs (E-, P-, and X-MLVs, respectively). These exogenous MLVs derive from endogenous retroviruses (ERVs) that were acquired by the wild mouse progenitors of laboratory mice about 1 million years ago. We analyzed the genomes of seven MLVs isolated from Eurasian and American wild mice and three previously sequenced MLVs to describe their relationships and identify their possible ERV progenitors. The phylogenetic tree based on the receptor-determining regions of env produced expected host range clusters, but these clusters are not maintained in trees generated from other virus regions. Colinear alignments of the viral genomes identified segmental homologies to ERVs of different host range subgroups. Six MLVs show close relationships to a small xenotropic ERV subgroup largely confined to the inbred mouse Y chromosome. env variations define three E-MLV subtypes, one of which carries duplications of various sizes, sequences, and locations in the proline-rich region of env. Outside the env region, all E-MLVs are related to different nonecotropic MLVs. These results document the diversity in gammaretroviruses isolated from globally distributed Mus subspecies, provide insight into their origins and relationships, and indicate that recombination has had an important role in the evolution of these mutagenic and pathogenic agents. Laboratory mice carry mouse leukemia viruses (MLVs) of three host range groups which were acquired from their wild mouse progenitors. We sequenced the complete genomes of seven infectious MLVs isolated from geographically separated Eurasian and American wild mice and compared them with endogenous germ line retroviruses (ERVs) acquired early in house mouse evolution.
We did this because the laboratory mouse viruses derive directly from specific ERVs or arise by recombination between different ERVs. The six distinctively different wild mouse viruses appear to be recombinants, often involving different host range subgroups, and most are related to a distinctive, largely Y-chromosome-linked MLV ERV subtype. MLVs with ecotropic host ranges show the greatest variability with extensive inter- and intrasubtype envelope differences and with homologies to other host range subgroups outside the envelope. The sequence diversity among these wild mouse isolates helps define their relationships and origins and emphasizes the importance of recombination in their evolution. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Aging and chronic alcohol consumption are determinants of p16 gene expression, genomic DNA methylation and p16 promoter methylation in the mouse colon

USDA-ARS's Scientific Manuscript database

Elder age and chronic alcohol consumption are important risk factors for the development of colon cancer. Each factor can alter genomic and gene-specific DNA methylation. This study examined the effects of aging and chronic alcohol consumption on genomic and p16-specific methylation, and p16 express...
Ageing, chronic alcohol consumption and folate are determinants of genomic DNA methylation, p16 promoter methylation and the expression of p16 in the mouse colon

USDA-ARS's Scientific Manuscript database

Elder age and chronic alcohol consumption are important risk factors for the development of colon cancer. Each factor can alter genomic and gene-specific DNA methylation. This study examined the effects of aging and chronic alcohol consumption on genomic and p16-specific methylation, and p16 express...
Lessons for livestock genomics from genome and transcriptome sequencing in cattle and other mammals.

PubMed

Taylor, Jeremy F; Whitacre, Lynsey K; Hoff, Jesse L; Tizioto, Polyana C; Kim, JaeWoo; Decker, Jared E; Schnabel, Robert D

2016-08-17

Decreasing sequencing costs and development of new protocols for characterizing global methylation, gene expression patterns and regulatory regions have stimulated the generation of large livestock datasets. Here, we discuss experiences in the analysis of whole-genome and transcriptome sequence data. We analyzed whole-genome sequence (WGS) data from 132 individuals from five canid species (Canis familiaris, C. latrans, C. dingo, C. aureus and C. lupus) and 61 breeds, three bison (Bison bison), 64 water buffalo (Bubalus bubalis) and 297 bovines from 17 breeds. By individual, data vary in extent of reference genome depth of coverage from 4.9X to 64.0X. We have also analyzed RNA-seq data for 580 samples representing 159 Bos taurus and Rattus norvegicus animals and 98 tissues. By aligning reads to a reference assembly and calling variants, we assessed effects of average depth of coverage on the actual coverage and on the number of called variants. We examined the identity of unmapped reads by assembling them and querying produced contigs against the non-redundant nucleic acids database. By imputing high-density single nucleotide polymorphism data on 4010 US registered Angus animals to WGS using Run4 of the 1000 Bull Genomes Project and assessing the accuracy of imputation, we identified misassembled reference sequence regions. We estimate that a 24X depth of coverage is required to achieve 99.5 % coverage of the reference assembly and identify 95 % of the variants within an individual's genome. Genomes sequenced to low average coverage (e.g., <10X) may fail to cover 10 % of the reference genome and identify <75 % of variants. About 10 % of genomic DNA or transcriptome sequence reads fail to align to the reference assembly. These reads include loci missing from the reference assembly and misassembled genes and interesting symbionts, commensal and pathogenic organisms. 
Assembly errors and a lack of annotation of functional elements significantly limit the utility of the current draft livestock reference assemblies. The Functional Annotation of Animal Genomes initiative seeks to annotate functional elements, while a 70X Pac-Bio assembly for cow is underway and may result in a significantly improved reference assembly.
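As a point of comparison for the depth-of-coverage figures in this abstract, here is a minimal sketch of the idealized Lander-Waterman (Poisson) model, in which reads are assumed to land uniformly at random on the genome. This model is an assumption brought in for illustration and is not the method used in the study; the function name is hypothetical. Under this idealization nearly every base is covered at 10X, so the shortfalls reported above for low-coverage genomes are presumably due to factors the model ignores, such as uneven sampling and mapping difficulty.

```python
import math


def expected_fraction_covered(mean_depth: float) -> float:
    """Idealized fraction of bases covered by at least one read.

    Assumes per-base depth is Poisson distributed with mean `mean_depth`,
    so the chance that a base receives zero reads is exp(-mean_depth).
    """
    if mean_depth < 0:
        raise ValueError("mean_depth must be non-negative")
    return 1.0 - math.exp(-mean_depth)


if __name__ == "__main__":
    # Depths taken from the abstract (4.9X, 10X, 24X, 64X) plus 1X for contrast.
    for depth in (1.0, 4.9, 10.0, 24.0, 64.0):
        fraction = expected_fraction_covered(depth)
        print(f"{depth:5.1f}X -> idealized fraction covered = {fraction:.6f}")
```

The gap between these idealized values and the empirical ones in the abstract is the interesting part: it suggests how much coverage is lost to effects other than random sampling.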
Development of a Method to Implement Whole-Genome Bisulfite Sequencing of cfDNA from Cancer Patients and a Mouse Tumor Model.

PubMed

Maggi, Elaine C; Gravina, Silvia; Cheng, Haiying; Piperdi, Bilal; Yuan, Ziqiang; Dong, Xiao; Libutti, Steven K; Vijg, Jan; Montagna, Cristina

2018-01-01

The goal of this study was to develop a method for whole genome cell-free DNA (cfDNA) methylation analysis in humans and mice with the ultimate goal to facilitate the identification of tumor derived DNA methylation changes in the blood. Plasma or serum from patients with pancreatic neuroendocrine tumors or lung cancer, and plasma from a murine model of pancreatic adenocarcinoma was used to develop a protocol for cfDNA isolation, library preparation and whole-genome bisulfite sequencing of ultra low quantities of cfDNA, including tumor-specific DNA. The protocol developed produced high quality libraries consistently generating a conversion rate >98% that will be applicable for the analysis of human and mouse plasma or serum to detect tumor-derived changes in DNA methylation.
In vivo binding of PRDM9 reveals interactions with noncanonical genomic sites

PubMed Central

Grey, Corinne; Clément, Julie A.J.; Buard, Jérôme; Leblanc, Benjamin; Gut, Ivo; Gut, Marta; Duret, Laurent

2017-01-01

In mouse and human meiosis, DNA double-strand breaks (DSBs) initiate homologous recombination and occur at specific sites called hotspots. The localization of these sites is determined by the sequence-specific DNA binding domain of the PRDM9 histone methyl transferase. Here, we performed an extensive analysis of PRDM9 binding in mouse spermatocytes. Unexpectedly, we identified a noncanonical recruitment of PRDM9 to sites that lack recombination activity and the PRDM9 binding consensus motif. These sites include gene promoters, where PRDM9 is recruited in a DSB-dependent manner. Another subset reveals DSB-independent interactions between PRDM9 and genomic sites, such as the binding sites for the insulator protein CTCF. We propose that these DSB-independent sites result from interactions between hotspot-bound PRDM9 and genomic sequences located on the chromosome axis. PMID:28336543
Library Resources for Bac End Sequencing. Final Technical Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pieter J. de Jong

2000-10-01

Studies directed towards the specific aims outlined for this research award are summarized. The RPCI II Human Bac Library has been expanded by the addition of 6.9-fold genomic coverage. This segment has been generated from an MboI partial digest of the same anonymous donor DNA used for the rest of the library. A new cloning vector, pTARBAC1, has been constructed and used in the construction of RPCI-II segment 5. This new cloning vector provides a new strategy in identifying targeted genomic regions and will greatly facilitate a large-scale analysis for positional cloning. A new male C57BL/6J mouse BAC library has been constructed. RPCI-23 contains 576 plates (approx 210,000 clones) and represents approximately 11-fold coverage of the mouse genome.
Cloning and sequence analysis of a cDNA encoding the alpha-subunit of mouse beta-N-acetylhexosaminidase and comparison with the human enzyme.

PubMed Central

Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L

1992-01-01

cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046
Genome-Wide Expression Profiling of Five Mouse Models Identifies Similarities and Differences with Human Psoriasis

PubMed Central

Swindell, William R.; Johnston, Andrew; Carbajal, Steve; Han, Gangwen; Wohn, Christian; Lu, Jun; Xing, Xianying; Nair, Rajan P.; Voorhees, John J.; Elder, James T.; Wang, Xiao-Jing; Sano, Shigetoshi; Prens, Errol P.; DiGiovanni, John; Pittelkow, Mark R.; Ward, Nicole L.; Gudjonsson, Johann E.

2011-01-01

Development of a suitable mouse model would facilitate the investigation of pathomechanisms underlying human psoriasis and would also assist in development of therapeutic treatments. However, while many psoriasis mouse models have been proposed, no single model recapitulates all features of the human disease, and standardized validation criteria for psoriasis mouse models have not been widely applied. In this study, whole-genome transcriptional profiling is used to compare gene expression patterns manifested by human psoriatic skin lesions with those that occur in five psoriasis mouse models (K5-Tie2, imiquimod, K14-AREG, K5-Stat3C and K5-TGFbeta1). While the cutaneous gene expression profiles associated with each mouse phenotype exhibited statistically significant similarity to the expression profile of psoriasis in humans, each model displayed distinctive sets of similarities and differences in comparison to human psoriasis. For all five models, correspondence to the human disease was strong with respect to genes involved in epidermal development and keratinization. Immune and inflammation-associated gene expression, in contrast, was more variable between models as compared to the human disease. These findings support the value of all five models as research tools, each with identifiable areas of convergence to and divergence from the human disease. Additionally, the approach used in this paper provides an objective and quantitative method for evaluation of proposed mouse models of psoriasis, which can be strategically applied in future studies to score strengths of mouse phenotypes relative to specific aspects of human psoriasis. PMID:21483750
Genome-wide increase in histone H2A ubiquitylation in a mouse model of Huntington's disease.

PubMed

McFarland, Karen N; Das, Sudeshna; Sun, Ting Ting; Leyfer, Dmitri; Kim, Mee-Ohk; Xia, Eva; Sangrey, Gavin R; Kuhn, Alexandre; Luthi-Carter, Ruth; Clark, Timothy W; Sadri-Vakili, Ghazaleh; Cha, Jang-Ho J

2013-01-01

Huntington's disease (HD) is a neurodegenerative disorder with selective vulnerability of striatal neurons and involves extensive transcriptional dysregulation early in the disease process. Previous work in cell and mouse models has shown that histone modifications are altered in HD. Specifically, monoubiquitylated histone H2A (uH2A) is present at the promoters of downregulated genes which led to the hypothesis that uH2A plays a role in transcriptional silencing in HD. To broaden our view of uH2A function in transcription in HD, we examined genome-wide binding sites of uH2A in 12-week old striatal tissue from R6/2 transgenic HD mouse model. We used chromatin immunoprecipitation followed by genomic promoter microarray hybridization (ChIP-chip) and then interrogated how these binding sites correlate with transcribed genes. Our analysis reveals that, while uH2A levels are globally increased at the genome in the transgenic (TG) striatum, uH2A localization at a gene did not strongly correlate with the absence of its transcript. Furthermore, analysis of differential ubiquitylation in wild-type (WT) and TG striata did not reveal the expected enrichment of uH2A at genes with decreased expression in the TG striatum. This first description of genome-wide localization of uH2A in an HD model reveals that monoubiquitylation of histone H2A may not function at the level of the individual gene but may rather influence transcription through global chromatin structure.
Generation of a neuro-specific microarray reveals novel differentially expressed noncoding RNAs in mouse models for neurodegenerative diseases.

PubMed

Gstir, Ronald; Schafferer, Simon; Scheideler, Marcel; Misslinger, Matthias; Griehl, Matthias; Daschil, Nina; Humpel, Christian; Obermair, Gerald J; Schmuckermair, Claudia; Striessnig, Joerg; Flucher, Bernhard E; Hüttenhofer, Alexander

2014-12-01

We have generated a novel, neuro-specific ncRNA microarray, covering 1472 ncRNA species, to investigate their expression in different mouse models for central nervous system diseases. Thereby, we analyzed ncRNA expression in two mouse models with impaired calcium channel activity, implicated in epilepsy or Parkinson's disease, respectively, as well as in a mouse model mimicking pathophysiological aspects of Alzheimer's disease. We identified well over a hundred differentially expressed ncRNAs, which either belonged to known classes of ncRNAs, such as miRNAs or snoRNAs, or represented entirely novel ncRNA species. Several differentially expressed ncRNAs in the calcium channel mouse models were assigned as miRNAs and target genes involved in calcium signaling, thus suggesting feedback regulation of miRNAs by calcium signaling. In the Alzheimer mouse model, we identified two snoRNAs, whose expression was deregulated prior to amyloid plaque formation. Interestingly, the presence of snoRNAs could be detected in cerebrospinal fluid samples in humans, thus potentially serving as early diagnostic markers for Alzheimer's disease. In addition to known ncRNA species, we also identified 63 differentially expressed, entirely novel ncRNA candidates, located in intronic or intergenic regions of the mouse genome, genomic locations, which previously have been shown to harbor the majority of functional ncRNAs. © 2014 Gstir et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

Concordance in Genomic Changes Between Mouse Lungs and Human Airway Epithelial Cells Exposed to Diesel Exhaust Particles

EPA Science Inventory

Human and animal toxicity studies have shown that exposure to diesel exhaust particles (DEP) or their constituents affect multiple biological processes including immune and inflammatory pathways, mutagenesis and in some cases carcinogenesis. This study compared genomic changes by...
Improved maize reference genome with single-molecule technologies.

PubMed

Jiao, Yinping; Peluso, Paul; Shi, Jinghua; Liang, Tiffany; Stitzer, Michelle C; Wang, Bo; Campbell, Michael S; Stein, Joshua C; Wei, Xuehong; Chin, Chen-Shan; Guill, Katherine; Regulski, Michael; Kumari, Sunita; Olson, Andrew; Gent, Jonathan; Schneider, Kevin L; Wolfgruber, Thomas K; May, Michael R; Springer, Nathan M; Antoniou, Eric; McCombie, W Richard; Presting, Gernot G; McMullen, Michael; Ross-Ibarra, Jeffrey; Dawe, R Kelly; Hastie, Alex; Rank, David R; Ware, Doreen

2017-06-22

Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate the determination of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model species, using single-molecule real-time sequencing and high-resolution optical mapping. Relative to the previous reference genome, our assembly features a 52-fold increase in contig length and notable improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed more than 130,000 intact transposable elements, allowing us to identify transposable element lineage expansions that are unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by single-molecule real-time sequencing. In addition, comparative optical mapping of two other inbred maize lines revealed a prevalence of deletions in regions of low gene density and maize lineage-specific genes.
Reference-guided de novo assembly approach improves genome reconstruction for related species.

PubMed

Lischer, Heidi E L; Shimizu, Kentaro K

2017-11-10

The development of next-generation sequencing has made it possible to sequence whole genomes at a relatively low cost. However, de novo genome assemblies remain challenging due to short read length, missing data, repetitive regions, polymorphisms and sequencing errors. As more and more genomes are sequenced, reference-guided assembly approaches can be used to assist the assembly process. However, previous methods mostly focused on the assembly of other genotypes within the same species. We adapted and extended a reference-guided de novo assembly approach, which enables the usage of a related reference sequence to guide the genome assembly. In order to compare and evaluate de novo and our reference-guided de novo assembly approaches, we used a simulated data set of a repetitive and heterozygotic plant genome. The extended reference-guided de novo assembly approach almost always outperforms the corresponding de novo assembly program even when a reference of a different species is used. Similar improvements can be observed in high and low coverage situations. In addition, we show that a single evaluation metric, like the widely used N50 length, is not enough to properly rate assemblies, as it does not always point to the best assembly when evaluated with other criteria. Therefore, we used the summed z-scores of 36 different statistics to evaluate the assemblies. The combination of reference mapping and de novo assembly provides a powerful tool to improve genome reconstruction by integrating information of a related genome. Our extension of the reference-guided de novo assembly approach enables the application of this strategy not only within but also between related species. Finally, the evaluation of genome assemblies is often not straightforward, as the truth is not known. Thus one should always use a combination of evaluation metrics, which assess not only the continuity but also the accuracy of an assembly.
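For readers unfamiliar with the N50 length mentioned in this abstract, the following is a minimal sketch of the usual definition: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name and the example lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running sum first reaches half the total length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths (arbitrary units) for illustration only.
    example = [80, 70, 50, 40, 30, 20]
    print("N50 =", n50(example))
```

Because N50 depends only on contig lengths, an assembly that wrongly joins contigs can score higher than a correct one, which is consistent with the abstract's advice to combine continuity and accuracy metrics.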
Localization and regulation of mouse pantothenate kinase 2 [The PanK2 Genes of Mouse and Human Specify Proteins with Distinct Subcellular Locations]

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leonardi, Roberta; Zhang, Yong-Mei; Lykidis, Athanasios

2007-09-07

Coenzyme A (CoA) biosynthesis is initiated by pantothenate kinase (PanK) and CoA levels are controlled through differential expression and feedback regulation of PanK isoforms. PanK2 is a mitochondrial protein in humans, but comparative genomics revealed that acquisition of a mitochondrial targeting signal was limited to primates. Human and mouse PanK2 possessed similar biochemical properties, with inhibition by acetyl-CoA and activation by palmitoylcarnitine. Mouse PanK2 localized in the cytosol, and the expression of PanK2 was higher in human brain compared to mouse brain. Differences in expression and subcellular localization should be considered in developing a mouse model for human PanK2 deficiency.
Structure and Function of the Splice Variants of TMPRSS2-ERG, a Prevalent Genomic Alteration in Prostate Cancer

DTIC Science & Technology

2011-09-01

the ETS family of transcription factors showing diverse expression patterns in human tissues (Turner and Watson, 2008). ERG, similar to other...and adult mouse tissues. Most striking of these observations was highly selective and abundant expression of erg protein in endothelial cells of...mouse tissues. We for the first time clarified that endogenous ERG was not expressed in normal mouse prostate epithelium (Mohamed et al., 2010
Chromatin immunoprecipitation of mouse embryos.

PubMed

Voss, Anne K; Dixon, Mathew P; McLennan, Tamara; Kueh, Andrew J; Thomas, Tim

2012-01-01

During prenatal development, a large number of different cell types are formed, the vast majority of which contain identical genetic material. The basis of the great variety in cell phenotype and function is the differential expression of the approximately 25,000 genes in the mammalian genome. Transcriptional activity is regulated at many levels by proteins, including members of the basal transcriptional apparatus, DNA-binding transcription factors, and chromatin-binding proteins. Importantly, chromatin structure dictates the availability of a specific genomic locus for transcriptional activation as well as the efficiency with which transcription can occur. Chromatin immunoprecipitation (ChIP) is a method to assess if chromatin modifications or proteins are present at a specific locus. ChIP involves the cross-linking of DNA and associated proteins and immunoprecipitation using specific antibodies to DNA-associated proteins, followed by examination of the co-precipitated DNA sequences or proteins. In the last few years, ChIP has become an essential technique for scientists studying transcriptional regulation and chromatin structure. Using ChIP on mouse embryos, we can document the presence or absence of specific proteins and chromatin modifications at genomic loci in vivo during mammalian development. Here, we describe a ChIP technique adapted for mouse embryos.
Sharing reference data and including cows in the reference population improve genomic predictions in Danish Jersey.

PubMed

Su, G; Ma, P; Nielsen, U S; Aamand, G P; Wiggans, G; Guldbrandtsen, B; Lund, M S

2016-06-01

Small reference populations limit the accuracy of genomic prediction in numerically small breeds, such as Danish Jersey. The objective of this study was to investigate two approaches to improve genomic prediction by increasing the size of the reference population in Danish Jersey. The first approach was to include North American Jersey bulls in the Danish Jersey reference population. The second was to genotype cows and use them as reference animals. The validation of genomic prediction was carried out on bulls and cows, respectively. In validation on bulls, about 300 Danish bulls (depending on traits) born in 2005 and later were used as validation data, and the reference populations were: (1) about 1050 Danish bulls, (2) about 1050 Danish bulls and about 1150 US bulls. In validation on cows, about 3000 Danish cows from 87 young half-sib families were used as validation data, and the reference populations were: (1) about 1250 Danish bulls, (2) about 1250 Danish bulls and about 1150 US bulls, (3) about 1250 Danish bulls and about 4800 cows, (4) about 1250 Danish bulls, 1150 US bulls and 4800 Danish cows. A genomic best linear unbiased prediction model was used to predict breeding values. De-regressed proofs were used as response variables. In the validation on bulls for eight traits, the joint DK-US bull reference population led to higher reliability of genomic prediction than the DK bull reference population for six traits, but not for fertility and longevity. Averaged over the eight traits, the gain was 3 percentage points. In the validation on cows for six traits (fertility and longevity were not available), the gain from inclusion of US bulls in the reference population was 6.6 percentage points on average over the six traits, and the gain from inclusion of cows was 8.2 percentage points. However, the gains from cows and US bulls were not additive. The total gain of including both US bulls and Danish cows was 10.5 percentage points. 
The results indicate that sharing reference data and including cows in the reference population are efficient approaches to increase the reliability of genomic prediction. Therefore, genomic selection is promising for numerically small populations.
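The abstract's remark that the two gains do not simply add up can be checked with a short worked calculation using only the percentage-point figures it reports for the validation on cows. The variable names are hypothetical.

```python
# Reliability gains in percentage points, as reported for the validation on cows.
gain_us_bulls = 6.6   # adding US bulls to the Danish bull reference
gain_cows = 8.2       # adding genotyped cows to the Danish bull reference
gain_combined = 10.5  # adding both US bulls and Danish cows

# If the two sources of information were fully additive, the combined
# gain would equal the sum of the two separate gains.
additive_expectation = gain_us_bulls + gain_cows
shortfall = additive_expectation - gain_combined

print(f"Sum of separate gains: {additive_expectation:.1f} points")
print(f"Observed combined gain: {gain_combined:.1f} points")
print(f"Shortfall relative to additivity: {shortfall:.1f} points")
```

The combined gain falls short of the simple sum, which is what the abstract means when it says the gains do not accumulate.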
Organization and roles of nucleosomes at mouse meiotic recombination hotspots

PubMed Central

Getun, Irina V.; Wu, Zhen K.; Bois, Philippe R.J.

2012-01-01

Meiotic double strand breaks (DSBs) occur at discrete regions in the genome termed hotspots. Precisely what directs site selection of these DSBs is hotly debated, and in particular it is unclear which chromatin features and regulatory factors are necessary for a genomic region to initiate and resolve DSBs as a crossover (CO) event. In human and mouse, one layer of hotspot selection control is a recognition sequence element present at these sites that is bound by the Prdm9 zinc-finger protein. Furthermore, an overall open chromatin structure is thought to be required to allow access of the recombination machinery, and this is often dictated by the packaging of DNA around nucleosomes. We recently defined the nucleosome occupancy maps of four mouse recombination hotspots throughout meiosis. These analyses revealed no obvious dynamic changes in nucleosome occupancy, suggesting an intrinsic nature of recombinogenic sites, yet they also revealed that nucleosomes define zones of exclusion for CO resolution. Here, we discuss new evidence implicating nucleosome occupancy in recombinogenic repair and its potential roles in controlling chromatin structure at mouse meiotic hotspots. PMID:22572955
Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study.

PubMed

Cerdeira, Louise Teixeira; Carneiro, Adriana Ribeiro; Ramos, Rommel Thiago Jucá; de Almeida, Sintia Silva; D'Afonseca, Vivian; Schneider, Maria Paula Cruz; Baumbach, Jan; Tauch, Andreas; McCulloch, John Anthony; Azevedo, Vasco Ariston Carvalho; Silva, Artur

2011-08-01

Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies the amount of monetary and temporal resources for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly), or can be concatenated by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence; however, depending on the NGS platform used, the shortness of read lengths causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium with immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads and were only oriented using a reference genome by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that with the availability of a reference genome, it pays off to use the hybrid de novo strategy, rather than a classical reference assembly, because more genome sequences are preserved using the former. Copyright © 2011 Elsevier B.V. All rights reserved.
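To make the De Bruijn graph idea named in this abstract concrete, here is a minimal sketch of how such a graph can be built from short reads: every k-mer in a read contributes an edge from its first k-1 bases to its last k-1 bases. This is a textbook illustration under simplifying assumptions (error-free reads, one strand only), not the authors' pipeline; the function name and the toy reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a De Bruijn graph from reads.

    Nodes are (k-1)-mers. Each k-mer found in a read adds an edge from its
    prefix (first k-1 bases) to its suffix (last k-1 bases). Duplicate
    edges are collapsed, and successors are returned in sorted order.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    edges = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[kmer[:-1]].add(kmer[1:])
    return {node: sorted(successors) for node, successors in edges.items()}


if __name__ == "__main__":
    # Two hypothetical overlapping reads sharing the k-mer "GGC".
    toy_reads = ["ATGGC", "GGCGT"]
    for node, successors in de_bruijn_graph(toy_reads, k=3).items():
        print(node, "->", ", ".join(successors))
```

In this toy case the overlap between the two reads yields a single unbranched path through the graph, which an assembler would report as one contig; repeats and sequencing errors create branches, which is where short reads become problematic.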
The projection of a test genome onto a reference population and applications to humans and archaic hominins.

PubMed

Yang, Melinda A; Harris, Kelley; Slatkin, Montgomery

2014-12-01

We introduce a method for comparing a test genome with numerous genomes from a reference population. Sites in the test genome are given a weight, w, that depends on the allele frequency, x, in the reference population. The projection of the test genome onto the reference population is the average weight for each x, [Formula: see text]. The weight is assigned in such a way that, if the test genome is a random sample from the reference population, then [Formula: see text]. Using analytic theory, numerical analysis, and simulations, we show how the projection depends on the time of population splitting, the history of admixture, and changes in past population size. The projection is sensitive to small amounts of past admixture, the direction of admixture, and admixture from a population not sampled (a ghost population). We compute the projections of several human and two archaic genomes onto three reference populations from the 1000 Genomes Project (Europeans, Han Chinese, and Yoruba) and discuss the consistency of our analysis with previously published results for European and Yoruba demographic history. Including higher amounts of admixture between Europeans and Yoruba soon after their separation and low amounts of admixture more recently can resolve discrepancies between the projections and demographic inferences from some previous studies. Copyright © 2014 by the Genetics Society of America.
Disrupting the male germ line to find infertility and contraception targets.

PubMed

Archambeault, Denise R; Matzuk, Martin M

2014-05-01

Genetically-manipulated mouse models have become indispensable for broadening our understanding of genes and pathways related to male germ cell development. Until suitable in vitro systems for studying spermatogenesis are perfected, in vivo models will remain the gold standard for inquiry into testicular function. Here, we discuss exciting advances that are allowing researchers faster, easier, and more customizable access to their mouse models of interest. Specifically, the trans-NIH Knockout Mouse Project (KOMP) is working to generate knockout mouse models of every gene in the mouse genome. The related Knockout Mouse Phenotyping Program (KOMP2) is performing systematic phenotypic analysis of this genome-wide collection of knockout mice, including fertility screening. Together, these programs will not only uncover new genes involved in male germ cell development but also provide the research community with the mouse models necessary for further investigations. In addition to KOMP/KOMP2, another promising development in the field of mouse models is the advent of CRISPR (clustered regularly interspaced short palindromic repeat)-Cas technology. Utilizing 20 nucleotide guide sequences, CRISPR/Cas has the potential to introduce sequence-specific insertions, deletions, and point mutations to produce null, conditional, activated, or reporter-tagged alleles. CRISPR/Cas can also successfully target multiple genes in a single experimental step, forgoing the multiple generations of breeding traditionally required to produce mouse models with deletions, insertions, or mutations in multiple genes. In addition, CRISPR/Cas can be used to create mouse models carrying variants identical to those identified in infertile human patients, providing the opportunity to explore the effects of such mutations in an in vivo system. Both the KOMP/KOMP2 projects and the CRISPR/Cas system provide powerful, accessible genetic approaches to the study of male germ cell development in the mouse. 
A more complete understanding of male germ cell biology is critical for the identification of novel targets for potential non-hormonal contraceptive intervention. Copyright © 2014. Published by Elsevier Masson SAS.
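The 20 nucleotide guide sequences mentioned in this abstract can be illustrated with a minimal sketch that scans a DNA string for candidate target sites. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG motif (the PAM) immediately 3' of the 20-nucleotide target; that choice of enzyme is an assumption, since the abstract does not name one. The sketch searches the given strand only and ignores off-target scoring; the function name and test sequence are hypothetical.

```python
import re

# A 20-nt target followed by an NGG PAM. The lookahead lets overlapping
# candidates be reported, since each match consumes no characters.
_GUIDE_PATTERN = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")


def find_candidate_guides(sequence):
    """Return (start, guide) pairs for 20-nt sites followed by NGG.

    `start` is the 0-based position of the guide on the given strand.
    Only the supplied strand is scanned; the reverse complement is not.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in _GUIDE_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence: a 20-nt stretch, then the PAM "AGG", then filler.
    demo = "ACGT" * 5 + "AGG" + "TT"
    for start, guide in find_candidate_guides(demo):
        print(start, guide)
```

A practical design tool would also scan the opposite strand and rank candidates by predicted specificity, which this sketch deliberately leaves out.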
Genome engineering via homologous recombination in mouse embryonic stem (ES) cells: an amazingly versatile tool for the study of mammalian biology.

PubMed

Babinet, C; Cohen-Tannoudji, M

2001-09-01

The ability to introduce genetic modifications in the germ line of complex organisms has been a long-standing goal of those who study developmental biology. In this regard, the mouse, a favorite model for the study of mammals, is unique: not only has it been possible since the late seventies to add genes to the mouse genome, as in several other complex organisms, but it is also possible to perform gene replacement and modification. This has been made possible via two technological breakthroughs: 1) the isolation and culture of embryonic stem (ES) cells, which have the unique ability to colonize all the tissues of a host embryo including its germ line; 2) the development of methods allowing homologous recombination between an incoming DNA and its cognate chromosomal sequence (gene "targeting"). As a result, it has become possible to create mice bearing null mutations in any cloned gene (knock-out mice). Such a possibility has revolutionized the genetic approach to almost all aspects of the biology of the mouse. In recent years, the scope of gene targeting has been widened even more, due to the refinement of the knock-out technology: other types of genetic modifications may now be created, including subtle mutations (point mutations, micro deletions or insertions, etc.) and chromosomal rearrangements such as large deletions, duplications and translocations. Finally, methods have been devised which permit the creation of conditional mutations, allowing the study of gene function throughout the life of an animal, when gene inactivation entails embryonic lethality. In this paper, we present an overview of the methods and scenarios used for the programmed modification of the mouse genome, and we underline their enormous interest for the study of mammalian biology.
Genomic profiles of low-grade murine gliomas evolve during progression to glioblastoma. | Office of Cancer Genomics

Cancer.gov

Background: Gliomas are diverse neoplasms with multiple molecular subtypes. How tumor-initiating mutations relate to molecular subtypes as these tumors evolve during malignant progression remains unclear. Methods: We used genetically engineered mouse models, histopathology, genetic lineage tracing, expression profiling, and copy number analyses to examine how genomic tumor diversity evolves during the course of malignant progression from low- to high-grade disease.
The detailed 3D multi-loop aggregate/rosette chromatin architecture and functional dynamic organization of the human and mouse genomes.

PubMed

Knoch, Tobias A; Wachsmuth, Malte; Kepper, Nick; Lesnussa, Michael; Abuseiris, Anis; Ali Imam, A M; Kolovos, Petros; Zuin, Jessica; Kockx, Christel E M; Brouwer, Rutger W W; van de Werken, Harmen J G; van IJcken, Wilfred F J; Wendt, Kerstin S; Grosveld, Frank G

2016-01-01

The dynamic three-dimensional chromatin architecture of genomes and its co-evolutionary connection to its function (the storage, expression, and replication of genetic information) is still one of the central issues in biology. Here, we describe the much debated 3D architecture of the human and mouse genomes from the nucleosomal to the megabase pair level by a novel approach combining selective high-throughput high-resolution chromosomal interaction capture (T2C), polymer simulations, and scaling analysis of the 3D architecture and the DNA sequence. The genome is compacted into a chromatin quasi-fibre with ~5 ± 1 nucleosomes/11 nm, folded into stable ~30-100 kbp loops forming stable loop aggregates/rosettes connected by similar sized linkers. Minor but significant variations in the architecture are seen between cell types and functional states. The architecture and the DNA sequence show very similar fine-structured multi-scaling behaviour confirming their co-evolution and the above. This architecture, its dynamics, and accessibility, balance stability and flexibility ensuring genome integrity and variation enabling gene expression/regulation by self-organization of (in)active units already in proximity. Our results agree with the heuristics of the field and allow "architectural sequencing" at a genome mechanics level to understand the inseparable systems genomic properties.
Whole Genome Sequencing of Greater Amberjack (Seriola dumerili) for SNP Identification on Aligned Scaffolds and Genome Structural Variation Analysis Using Parallel Resequencing

PubMed Central

Aokic, Jun-ya; Kawase, Junya; Hamada, Kazuhisa; Fujimoto, Hiroshi; Yamamoto, Ikki; Usuki, Hironori

2018-01-01

Greater amberjack (Seriola dumerili) is distributed in tropical and temperate waters worldwide and is an important aquaculture fish. We carried out de novo sequencing of the greater amberjack genome to construct a reference genome sequence to identify single nucleotide polymorphisms (SNPs) for breeding amberjack by marker-assisted or gene-assisted selection as well as to identify functional genes for biological traits. We obtained 200 times coverage and constructed a high-quality genome assembly using next generation sequencing technology. The assembled sequences were aligned onto a yellowtail (Seriola quinqueradiata) radiation hybrid (RH) physical map by sequence homology. A total of 215 of the longest amberjack sequences, with a total length of 622.8 Mbp (92% of the total length of the genome scaffolds), were lined up on the yellowtail RH map. We resequenced the whole genomes of 20 greater amberjacks and mapped the resulting sequences onto the reference genome sequence. About 186,000 nonredundant SNPs were successfully ordered on the reference genome. Further, we found differences in the genome structural variations between two greater amberjack populations using BreakDancer. We also analyzed the greater amberjack transcriptome and mapped the annotated sequences onto the reference genome sequence. PMID:29785397
Inclusion of Population-specific Reference Panel from India to the 1000 Genomes Phase 3 Panel Improves Imputation Accuracy.

PubMed

Ahmad, Meraj; Sinha, Anubhav; Ghosh, Sreya; Kumar, Vikrant; Davila, Sonia; Yajnik, Chittaranjan S; Chandak, Giriraj R

2017-07-27

Imputation is a computational method based on the principle of haplotype sharing allowing enrichment of genome-wide association study datasets. It depends on the haplotype structure of the population and density of the genotype data. The 1000 Genomes Project led to the generation of imputation reference panels which have been used globally. However, recent studies have shown that population-specific panels provide better enrichment of genome-wide variants. We compared the imputation accuracy using 1000 Genomes phase 3 reference panel and a panel generated from genome-wide data on 407 individuals from Western India (WIP). The concordance of imputed variants was cross-checked with next-generation re-sequencing data on a subset of genomic regions. Further, using the genome-wide data from 1880 individuals, we demonstrate that WIP works better than the 1000 Genomes phase 3 panel and when merged with it, significantly improves the imputation accuracy throughout the minor allele frequency range. We also show that imputation using only the South Asian component of the 1000 Genomes phase 3 panel works as well as the merged panel, making it a computationally less intensive job. Thus, our study stresses that imputation accuracy using 1000 Genomes phase 3 panel can be further improved by including population-specific reference panels from South Asia.
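The concordance check mentioned in this abstract can be illustrated with a minimal sketch. It assumes one simple definition of concordance (the fraction of variants, genotyped by both methods, at which the imputed genotype equals the sequenced genotype) and genotypes coded as alternate-allele counts 0, 1 or 2. The function name, variant identifiers and genotype values are hypothetical and are not taken from the study, which may have used a different metric.

```python
def genotype_concordance(imputed, sequenced):
    """Fraction of shared variants whose imputed genotype matches sequencing.

    Both arguments map a variant ID to an alternate-allele count (0, 1 or 2).
    Only variants present in both call sets are compared.
    """
    shared = imputed.keys() & sequenced.keys()
    if not shared:
        raise ValueError("no variants in common")
    matches = sum(1 for variant in shared if imputed[variant] == sequenced[variant])
    return matches / len(shared)


# Hypothetical toy call sets (not real data).
imputed_calls = {"var1": 0, "var2": 1, "var3": 2, "var4": 1}
sequenced_calls = {"var1": 0, "var2": 1, "var3": 1, "var4": 1, "var5": 2}

print(genotype_concordance(imputed_calls, sequenced_calls))  # 3 of 4 shared variants agree
```

In practice concordance is often reported per minor allele frequency bin, since rare variants tend to be imputed less accurately; the same function could be applied to each bin separately.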
Identification and functional analysis of long non-coding RNAs in human and mouse early embryos based on single-cell transcriptome data

PubMed Central

Qiu, Jia-jun; Ren, Zhao-rui; Yan, Jing-bin

2016-01-01

Epigenetic regulation has an important role in fertilization and proper embryonic development, and several human diseases are associated with epigenetic modification disorders, such as Rett syndrome, Beckwith-Wiedemann syndrome and Angelman syndrome. However, the dynamics and functions of long non-coding RNAs (lncRNAs), one type of epigenetic regulator, in human pre-implantation development have not yet been demonstrated. In this study, a comprehensive analysis of human and mouse early-stage embryonic lncRNAs was performed based on public single-cell RNA sequencing data. Expression profile analysis revealed that lncRNAs are expressed in a developmental stage–specific manner during human early-stage embryonic development, whereas a more temporal-specific expression pattern was identified in mouse embryos. Weighted gene co-expression network analysis suggested that lncRNAs involved in human early-stage embryonic development are associated with several important functions and processes, such as oocyte maturation, zygotic genome activation and mitochondrial functions. We also found that the network of lncRNAs involved in zygotic genome activation was highly preserved between human and mouse embryos, whereas in other stages no strong correlation between human and mouse embryos was observed. This study provides insight into the molecular mechanism underlying lncRNA involvement in human pre-implantation embryonic development. PMID:27542205
Genomes as geography: using GIS technology to build interactive genome feature maps

PubMed Central

Dolan, Mary E; Holden, Constance C; Beard, M Kate; Bult, Carol J

2006-01-01

Background Many commonly used genome browsers display sequence annotations and related attributes as horizontal data tracks that can be toggled on and off according to user preferences. Most genome browsers use only simple keyword searches and limit the display of detailed annotations to one chromosomal region of the genome at a time. We have employed concepts, methodologies, and tools that were developed for the display of geographic data to develop a Genome Spatial Information System (GenoSIS) for displaying genomes spatially, and interacting with genome annotations and related attribute data. In contrast to the paradigm of horizontally stacked data tracks used by most genome browsers, GenoSIS uses the concept of registered spatial layers composed of spatial objects for integrated display of diverse data. In addition to basic keyword searches, GenoSIS supports complex queries, including spatial queries, and dynamically generates genome maps. Our adaptation of the geographic information system (GIS) model in a genome context supports spatial representation of genome features at multiple scales with a versatile and expressive query capability beyond that supported by existing genome browsers. Results We implemented an interactive genome sequence feature map for the mouse genome in GenoSIS, an application that uses ArcGIS, a commercially available GIS software system. The genome features and their attributes are represented as spatial objects and data layers that can be toggled on and off according to user preferences or displayed selectively in response to user queries. GenoSIS supports the generation of custom genome maps in response to complex queries about genome features based on both their attributes and locations. Our example application of GenoSIS to the mouse genome demonstrates the powerful visualization and query capability of mature GIS technology applied in a novel domain. 
Conclusion Mapping tools developed specifically for geographic data can be exploited to display, explore and interact with genome data. The approach we describe here is organism independent and is equally useful for linear and circular chromosomes. One of the unique capabilities of GenoSIS compared to existing genome browsers is the capacity to generate genome feature maps dynamically in response to complex attribute and spatial queries. PMID:16984652
Comparison of TCDD-elicited genome-wide hepatic gene expression in Sprague–Dawley rats and C57BL/6 mice

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nault, Rance; Kim, Suntae; Zacharewski, Timothy R., E-mail: tzachare@msu.edu

2013-03-01

Although the structure and function of the AhR are conserved, emerging evidence suggests that downstream effects are species-specific. In this study, rat hepatic gene expression data from the DrugMatrix database (National Toxicology Program) were compared to mouse hepatic whole-genome gene expression data following treatment with 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). For the DrugMatrix study, male Sprague–Dawley rats were gavaged daily with 20 μg/kg TCDD for 1, 3 and 5 days, while female C57BL/6 ovariectomized mice were examined 1, 3 and 7 days after a single oral gavage of 30 μg/kg TCDD. A total of 649 rat and 1386 mouse genes (|fold change| ≥ 1.5, P1(t) ≥ 0.99) were differentially expressed following treatment. HomoloGene identified 11,708 orthologs represented across the rat Affymetrix 230 2.0 GeneChip (12,310 total orthologs), and the mouse 4 × 44K v.1 Agilent oligonucleotide array (17,578 total orthologs). Comparative analysis found 563 and 922 orthologs differentially expressed in response to TCDD in the rat and mouse, respectively, with 70 responses associated with immune function and lipid metabolism in common to both. Moreover, QRTPCR analysis of Ceacam1 showed divergent expression (induced in rat; repressed in mouse) functionally consistent with TCDD-elicited hepatic steatosis in the mouse but not the rat. Functional analysis identified orthologs involved in nucleotide binding and acetyltransferase activity in rat, while mouse-specific responses were associated with steroid, phospholipid, fatty acid, and carbohydrate metabolism. These results provide further evidence that TCDD elicits species-specific regulation of distinct gene networks, and outline considerations for future comparisons of publicly available microarray datasets. Highlights: ► We performed a whole-genome comparison of TCDD-regulated genes in mice and rats. ► Previous species comparisons were extended using data from the DrugMatrix database. ► Less than 15% of TCDD-regulated orthologs were common to mice and rats. ► Considerations for the comparison of publicly available datasets are described.
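The kind of cross-species comparison described in this record (filter each species by |fold change| ≥ 1.5 and P1(t) ≥ 0.99, then intersect the ortholog sets) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the record layout, the function name, the signed fold-change convention (negative values for repression) and all gene identifiers and values are hypothetical.

```python
def differentially_expressed(records, min_abs_fc=1.5, min_p1t=0.99):
    """Return {ortholog_id: fold_change} for records passing both cutoffs.

    Each record is (ortholog_id, signed_fold_change, p1t); the signed
    fold-change convention is an assumption made for this sketch.
    """
    return {
        ortholog: fold_change
        for ortholog, fold_change, p1t in records
        if abs(fold_change) >= min_abs_fc and p1t >= min_p1t
    }


# Hypothetical toy data (not values from the study).
rat = [("geneA", 12.0, 1.00), ("geneB", 1.8, 0.995), ("geneC", 1.1, 0.50)]
mouse = [("geneA", 20.0, 1.00), ("geneB", -1.7, 0.999), ("geneD", 1.6, 0.90)]

rat_de = differentially_expressed(rat)
mouse_de = differentially_expressed(mouse)

# Orthologs responding in both species, and those changing in opposite directions.
shared = rat_de.keys() & mouse_de.keys()
divergent = {g for g in shared if rat_de[g] * mouse_de[g] < 0}

print(sorted(shared), sorted(divergent))
```

Separating shared orthologs from divergent ones mirrors the distinction the abstract draws between responses common to both species and genes that are induced in one species but repressed in the other.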
Genes Altered by Intracisternal A Particles in Mouse Mammary Tumorigenesis

DTIC Science & Technology

1997-07-01

mouse Mus musculus as well as most other rodents (1). They are defective retroviruses which contain 3’ and 5’ long terminal repeat (LTR) sequences and... musculus (C57BL/6J) X Mus spretus backcross was obtained for The Jackson Laboratory (Bar Harbor, Maine) and used for localization of the pl7b(kokopelli...understand the nature of the potential mutation found in the tumors I decided to localize pl7b within the mouse genome. I screened a Mus musculus musculus X

New Approaches and Technologies to Sequence de novo Plant reference Genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

ScienceCinema

Schmutz, Jeremy

2018-02-01

Jeremy Schmutz of the HudsonAlpha Institute for Biotechnology on New approaches and technologies to sequence de novo plant reference genomes at the 8th Annual Genomics of Energy Environment Meeting on March 27, 2013 in Walnut Creek, CA.
Swine transcriptome characterization by combined Iso-Seq and RNA-seq for annotating the emerging long read-based reference genome

USDA-ARS's Scientific Manuscript database

PacBio long-read sequencing technology is increasingly popular in genome sequence assembly and transcriptome cataloguing. Recently, a new-generation pig reference genome was assembled based on long reads from this technology. To finely annotate this genome assembly, transcriptomes of nine tissues fr...
New Approaches and Technologies to Sequence de novo Plant reference Genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schmutz, Jeremy

2013-03-01

Jeremy Schmutz of the HudsonAlpha Institute for Biotechnology on New approaches and technologies to sequence de novo plant reference genomes at the 8th Annual Genomics of Energy Environment Meeting on March 27, 2013 in Walnut Creek, CA.
Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome

PubMed Central

Kim, Woonsu; Park, Hyesun; Seo, Seongwon

2016-01-01

The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline of an organism for amalgamating genomic annotations from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxification. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle. PMID:26992093
The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae

PubMed Central

Neale, David B.; McGuire, Patrick E.; Wheeler, Nicholas C.; Stevens, Kristian A.; Crepeau, Marc W.; Cardeno, Charis; Zimin, Aleksey V.; Puiu, Daniela; Pertea, Geo M.; Sezen, U. Uzay; Casola, Claudio; Koralewski, Tomasz E.; Paul, Robin; Gonzalez-Ibeas, Daniel; Zaman, Sumaira; Cronn, Richard; Yandell, Mark; Holt, Carson; Langley, Charles H.; Yorke, James A.; Salzberg, Steven L.; Wegrzyn, Jill L.

2017-01-01

A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceeds that of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms. PMID:28751502
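The contig and scaffold N50 values quoted in this abstract are summary statistics of assembly contiguity. As a minimal sketch, assuming the common definition (the length L such that sequences of length at least L together cover at least half of the total assembled bases), N50 can be computed as below. The function name and the toy lengths are illustrative and are not taken from the Douglas-fir assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (common definition)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # hypothetical lengths in bp
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly; it is usually read alongside completeness and error metrics.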
Targeting the histone methyltransferase G9a activates imprinted genes and improves survival of a mouse model of Prader–Willi syndrome

PubMed Central

Kim, Yuna; Lee, Hyeong-Min; Xiong, Yan; Sciaky, Noah; Hulbert, Samuel W; Cao, Xinyu; Everitt, Jeffrey I; Jin, Jian; Roth, Bryan L; Jiang, Yong-hui

2017-01-01

Prader–Willi syndrome (PWS) is an imprinting disorder caused by a deficiency of paternally expressed gene(s) in the 15q11–q13 chromosomal region. The regulation of imprinted gene expression in this region is coordinated by an imprinting center (PWS-IC). In individuals with PWS, genes responsible for PWS on the maternal chromosome are present, but repressed epigenetically, which provides an opportunity for the use of epigenetic therapy to restore expression from the maternal copies of PWS-associated genes. Through a high-content screen (HCS) of >9,000 small molecules, we discovered that UNC0638 and UNC0642—two selective inhibitors of euchromatic histone lysine N-methyltransferase-2 (EHMT2, also known as G9a)—activated the maternal (m) copy of candidate genes underlying PWS, including the SnoRNA cluster SNORD116, in cells from humans with PWS and also from a mouse model of PWS carrying a paternal (p) deletion from small nuclear ribonucleoprotein N (Snrpn (S)) to ubiquitin protein ligase E3A (Ube3a (U)) (mouse model referred to hereafter as m+/pΔS−U). Both UNC0642 and UNC0638 caused a selective reduction of the dimethylation of histone H3 lysine 9 (H3K9me2) at PWS-IC, without changing DNA methylation, when analyzed by bisulfite genomic sequencing. This indicates that histone modification is essential for the imprinting of candidate genes underlying PWS. UNC0642 displayed therapeutic effects in the PWS mouse model by improving the survival and the growth of m+/pΔS−U newborn pups. This study provides the first proof of principle for an epigenetics-based therapy for PWS. PMID:28024084
Fetal and Placental DNA Stimulation of TLR9: A Mechanism Possibly Contributing to the Pro-inflammatory Events During Parturition.

PubMed

Goldfarb, Ilona Telefus; Adeli, Sharareh; Berk, Tucker; Phillippe, Mark

2018-05-01

While there is evidence for a relationship between cell-free fetal DNA (cffDNA) and parturition, questions remain regarding whether cffDNA could trigger a pro-inflammatory response on the pathway to parturition. We hypothesized that placental and/or fetal DNA stimulates toll-like receptor 9 (TLR9) leading to secretion of pro-inflammatory cytokines by macrophage cells. Four in vitro DNA stimulation studies were performed using RAW 264.7 mouse peritoneal macrophage cells incubated in media containing the following DNA particles: an oligodeoxynucleotide (ODN2395), intact genomic DNA (from mouse placentas, fetuses and adult liver), mouse DNA complexed with DOTAP (a cationic liposome forming compound), and telomere-depleted mouse DNA. Interleukin 6 (IL6) secretion was measured in the media by enzyme-linked immunosorbent assay; and the cell pellet was homogenized for protein content (picograms IL6/mg protein). Robust IL6 secretion was observed in response to ODN2395 (a CpG-rich TLR9 agonist), mouse DNA-DOTAP complexes, and telomere-depleted mouse DNA in concentrations of 5 to 15 μg/mL. In contrast, ODN A151 (containing telomere sequence motifs), intact genomic mouse DNA, and restriction enzyme-digested DNA had no effect on IL6 secretion. The IL6 response was significantly inhibited by chloroquine (10 μg/mL), thereby confirming the important role for TLR9 in the response by macrophage cells. DNA derived from mouse placentas and fetuses, and depleted of telomeric sequences, stimulates a robust pro-inflammatory response by macrophage cells, thereby supporting the hypothesis that cffDNA is able to stimulate an innate immune response that could trigger the onset of parturition. These findings are of clinical importance, as we search for effective treatment/prevention of preterm parturition.
GETPrime 2.0: gene- and transcript-specific qPCR primers for 13 species including polymorphisms.

PubMed

David, Fabrice P A; Rougemont, Jacques; Deplancke, Bart

2017-01-04

GETPrime (http://bbcftools.epfl.ch/getprime) is a database with a web frontend providing gene- and transcript-specific, pre-computed qPCR primer pairs. The primers have been optimized for genome-wide specificity and for allowing the selective amplification of one or several splice variants of most known genes. To ease selection, primers have also been ranked according to defined criteria such as genome-wide specificity (with BLAST), amplicon size, and isoform coverage. Here, we report a major upgrade (2.0) of the database: eight new species (yeast, chicken, macaque, chimpanzee, rat, platypus, pufferfish, and Anolis carolinensis) now complement the five already included in the previous version (human, mouse, zebrafish, fly, and worm). Furthermore, the genomic reference has been updated to Ensembl v81 (while keeping earlier versions for backward compatibility) as a result of re-designing the back-end database and automating the import of relevant sections of the Ensembl database in species-independent fashion. This also allowed us to map known polymorphisms to the primers (on average three per primer for human), with the aim of reducing experimental error when targeting specific strains or individuals. Another consequence is that the inclusion of future Ensembl releases and other species has now become a relatively straightforward task. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Hybrid Sterility Locus on Chromosome X Controls Meiotic Recombination Rate in Mouse

PubMed Central

Balcova, Maria; Faltusova, Barbora; Gergelits, Vaclav; Bhattacharyya, Tanmoy; Mihola, Ondrej; Trachtulec, Zdenek; Knopf, Corinna; Fotopulosova, Vladana; Chvatalova, Irena; Gregorova, Sona; Forejt, Jiri

2016-01-01

Meiotic recombination safeguards proper segregation of homologous chromosomes into gametes, affects genetic variation within species, and contributes to meiotic chromosome recognition, pairing and synapsis. The Prdm9 gene has a dual role, it controls meiotic recombination by determining the genomic position of crossover hotspots and, in infertile hybrids of house mouse subspecies Mus m. musculus (Mmm) and Mus m. domesticus (Mmd), it further functions as the major hybrid sterility gene. In the latter role Prdm9 interacts with the hybrid sterility X 2 (Hstx2) genomic locus on Chromosome X (Chr X) by a still unknown mechanism. Here we investigated the meiotic recombination rate at the genome-wide level and its possible relation to hybrid sterility. Using immunofluorescence microscopy we quantified the foci of MLH1 DNA mismatch repair protein, the cytological counterparts of reciprocal crossovers, in a panel of inter-subspecific chromosome substitution strains. Two autosomes, Chr 7 and Chr 11, significantly modified the meiotic recombination rate, yet the strongest modifier, designated meiotic recombination 1, Meir1, emerged in the 4.7 Mb Hstx2 genomic locus on Chr X. The male-limited transgressive effect of Meir1 on recombination rate parallels the male-limited transgressive role of Hstx2 in hybrid male sterility. Thus, both genetic factors, the Prdm9 gene and the Hstx2/Meir1 genomic locus, indicate a link between meiotic recombination and hybrid sterility. A strong female-specific modifier of meiotic recombination rate with the effect opposite to Meir1 was localized on Chr X, distally to Meir1. Mapping Meir1 to a narrow candidate interval on Chr X is an important first step towards positional cloning of the respective gene(s) responsible for variation in the global recombination rate between closely related mouse subspecies. PMID:27104744
Genomic Methylation Inhibits Expression of Hepatitis B Virus Envelope Protein in Transgenic Mice: A Non-Infectious Mouse Model to Study Silencing of HBV Surface Antigen Genes.

PubMed

Graumann, Franziska; Churin, Yuri; Tschuschner, Annette; Reifenberg, Kurt; Glebe, Dieter; Roderfeld, Martin; Roeb, Elke

2015-01-01

The Hepatitis B virus genome persists in the nucleus of virus infected hepatocytes where it serves as template for viral mRNA synthesis. Epigenetic modifications, including methylation of the CpG islands, contribute to the regulation of viral gene expression. The present study investigates the effects of spontaneous age dependent loss of hepatitis B surface protein (HBs) expression due to HBV-genome specific methylation as well as its proximate positive effects in HBs transgenic mice. Liver and serum of HBs transgenic mice aged 5-33 weeks were analyzed by Western blot, immunohistochemistry, serum analysis, PCR, and qRT-PCR. From the third month of age hepatic loss of HBs was observed in 20% of transgenic mice. The size of the HBs-free area and the relative number of animals with these effects increased with age, reaching about 55% of animals aged 33 weeks. Loss of HBs-expression was strongly correlated with amelioration of serum parameters ALT and AST. In addition, lower HBs-expression was accompanied by decreased ER-stress. The loss of surface protein expression started at the transcriptional level and appeared to be regulated epigenetically by DNA methylation. The amount of HBs-expression correlated negatively with methylation of HBV DNA in the mouse genome. Our data suggest that methylation of specific CpG sites controls gene expression even in HBs-transgenic mice with truncated HBV genome. More important, the loss of HBs expression and intracellular aggregation ameliorated cell stress and liver integrity. Thus, targeted modulation of HBs expression may offer new therapeutic approaches. Furthermore, HBs-transgenic mice depict a non-infectious mouse model to study one possible mechanism of HBs gene silencing by hypermethylation.
Hybrid Sterility Locus on Chromosome X Controls Meiotic Recombination Rate in Mouse.

PubMed

Balcova, Maria; Faltusova, Barbora; Gergelits, Vaclav; Bhattacharyya, Tanmoy; Mihola, Ondrej; Trachtulec, Zdenek; Knopf, Corinna; Fotopulosova, Vladana; Chvatalova, Irena; Gregorova, Sona; Forejt, Jiri

2016-04-01

Meiotic recombination safeguards proper segregation of homologous chromosomes into gametes, affects genetic variation within species, and contributes to meiotic chromosome recognition, pairing and synapsis. The Prdm9 gene has a dual role, it controls meiotic recombination by determining the genomic position of crossover hotspots and, in infertile hybrids of house mouse subspecies Mus m. musculus (Mmm) and Mus m. domesticus (Mmd), it further functions as the major hybrid sterility gene. In the latter role Prdm9 interacts with the hybrid sterility X 2 (Hstx2) genomic locus on Chromosome X (Chr X) by a still unknown mechanism. Here we investigated the meiotic recombination rate at the genome-wide level and its possible relation to hybrid sterility. Using immunofluorescence microscopy we quantified the foci of MLH1 DNA mismatch repair protein, the cytological counterparts of reciprocal crossovers, in a panel of inter-subspecific chromosome substitution strains. Two autosomes, Chr 7 and Chr 11, significantly modified the meiotic recombination rate, yet the strongest modifier, designated meiotic recombination 1, Meir1, emerged in the 4.7 Mb Hstx2 genomic locus on Chr X. The male-limited transgressive effect of Meir1 on recombination rate parallels the male-limited transgressive role of Hstx2 in hybrid male sterility. Thus, both genetic factors, the Prdm9 gene and the Hstx2/Meir1 genomic locus, indicate a link between meiotic recombination and hybrid sterility. A strong female-specific modifier of meiotic recombination rate with the effect opposite to Meir1 was localized on Chr X, distally to Meir1. Mapping Meir1 to a narrow candidate interval on Chr X is an important first step towards positional cloning of the respective gene(s) responsible for variation in the global recombination rate between closely related mouse subspecies.
Gene expression based mouse brain parcellation using Markov random field regularized non-negative matrix factorization

NASA Astrophysics Data System (ADS)

Pathak, Sayan D.; Haynor, David R.; Thompson, Carol L.; Lein, Ed; Hawrylycz, Michael

2009-02-01

Understanding the geography of genetic expression in the mouse brain has opened previously unexplored avenues in neuroinformatics. The Allen Brain Atlas (www.brain-map.org) (ABA) provides genome-wide colorimetric in situ hybridization (ISH) gene expression images at high spatial resolution, all mapped to a common three-dimensional 200 μm³ spatial framework defined by the Allen Reference Atlas (ARA) and is a unique data set for studying expression-based structural and functional organization of the brain. The goal of this study was to facilitate an unbiased data-driven structural partitioning of the major structures in the mouse brain. We have developed an algorithm that uses nonnegative matrix factorization (NMF) to perform parts-based analysis of ISH gene expression images. The standard NMF approach and its variants are limited in their ability to flexibly integrate prior knowledge, in the context of spatial data. In this paper, we introduce spatial connectivity as an additional regularization in NMF decomposition via the use of Markov Random Fields (mNMF). The mNMF algorithm alternates neighborhood updates with iterations of the standard NMF algorithm to exploit spatial correlations in the data. We present the algorithm and show the subdivisions of hippocampus and somatosensory cortex obtained via this approach. The results are compared with established neuroanatomic knowledge. We also highlight novel gene expression-based subdivisions of the hippocampus identified by using the mNMF algorithm.
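For orientation, the "standard NMF" step that the mNMF algorithm alternates with its neighborhood updates can be sketched with the classic multiplicative update rules for the squared-error objective. This is an assumption about which NMF variant is meant, and the sketch deliberately omits the Markov random field regularization, which the abstract does not specify. All function names and the toy matrix are hypothetical; only the Python standard library is used.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor nonnegative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy "expression" matrix with two non-overlapping blocks.
toy = [[1, 1, 0, 0], [2, 2, 0, 0], [0, 0, 3, 3], [0, 0, 1, 1]]
w, h = nmf(toy, rank=2)
print(squared_error(toy, w, h))
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative throughout, which is what gives NMF its parts-based interpretation. A spatially regularized variant would insert an additional step between iterations that encourages neighboring voxels to share factor memberships.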
The Association Between Molecular Markers in Colorectal Sessile Serrated Polyps and Colorectal Cancer Risk

DTIC Science & Technology

2017-08-01

mouse and human colon epithelium; Aim 2.) Perform genome editing using CRISPR/Cas9 on immortalized human colon epithelial cells to introduce CRC...relevant gene mutations; Aim 3.) Use CRISPR/Cas9 genome editing in colon organoid cultures to introduce CRC relevant gene mutations into primary colon cells
DNA Methylation Errors in Cloned Mouse Sperm by Germ Line Barrier Evasion.

PubMed

Koike, Tasuku; Wakai, Takuya; Jincho, Yuko; Sakashita, Akihiko; Kobayashi, Hisato; Mizutani, Eiji; Wakayama, Sayaka; Miura, Fumihito; Ito, Takashi; Kono, Tomohiro

2016-06-01

The germ line reprogramming barrier resets parental epigenetic modifications according to sex, conferring totipotency to mammalian embryos upon fertilization. However, it is not known whether epigenetic errors are committed during germ line reprogramming that are then transmitted to germ cells, and consequently to offspring. We addressed this question in the present study by performing a genome-wide DNA methylation analysis using a target postbisulfite sequencing method in order to identify DNA methylation errors in cloned mouse sperm. The sperm genomes of two somatic cell-cloned mice (CL1 and CL7) contained significantly higher numbers of differentially methylated CpG sites (P = 0.0045 and P = 0.0116). As a result, they had higher numbers of differentially methylated CpG islands. However, there was no evidence that these sites were transmitted to the sperm genome of offspring. These results suggest that DNA methylation errors resulting from embryo cloning are transmitted to the sperm genome by evading the germ line reprogramming barrier. © 2016 by the Society for the Study of Reproduction, Inc.
Mouse assay for determination of arsenic bioavailability in contaminated soils.

PubMed

Bradham, Karen D; Diamond, Gary L; Scheckel, Kirk G; Hughes, Michael F; Casteel, Stan W; Miller, Bradley W; Klotzbach, Julie M; Thayer, William C; Thomas, David J

2013-01-01

A mouse assay for measuring the relative bioavailability (RBA) of arsenic (As) in soil was developed. In this study, results are presented of RBA assays of 16 soils, including multiple assays of the same soils, which provide a quantitative assessment of reproducibility of mouse assay results, as well as a comparison of results from the mouse assay with results from a swine and monkey assay applied to the same test soils. The mouse assay is highly reproducible; three repeated assays on the same soils yielded RBA estimates that ranged from 1 to 3% of the group mean. The mouse, monkey, and swine models yielded similar results for some, but not all, test materials. RBA estimates for identical soils (nine test soils and three standard reference materials [SRM]) assayed in mice and swine were significantly correlated (r = 0.70). Swine RBA estimates for 6 of the 12 test materials were higher than those from the mouse assay. RBA estimates for three standard reference materials (SRM) were not statistically different (mouse/swine ratio ranged from 0.86-1). When four test soils from the same orchard were assessed in the mouse, monkey, and swine assays, the mean soil As RBA were not statistically different. Mouse and swine models predicted similar steady state urinary excretion fractions (UEF) for As of 62 and 74%, respectively, during repeated ingestion doses of sodium arsenate, the water-soluble As form used as the reference in the calculation of RBA. In the mouse assay, the UEF for water soluble As(V) (sodium arsenate) and As(III) (sodium [meta] arsenite) were 62% and 66%, respectively, suggesting similar absolute bioavailabilities for the two As species. The mouse assay can serve as a highly cost-effective alternative or supplement to monkey and swine assays for improving As risk assessments by providing site-specific assessments of RBA of As in soils.
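As a worked illustration of the quantities named in this abstract, the sketch below computes a urinary excretion fraction (UEF) and a relative bioavailability (RBA). It assumes, consistent with the abstract's description of sodium arsenate as the reference, that RBA is the ratio of the UEF for soil arsenic to the UEF for the reference compound. The function names and the numbers in the usage example are hypothetical.

```python
def urinary_excretion_fraction(as_excreted_ug, as_dosed_ug):
    """Fraction of the ingested arsenic dose recovered in urine."""
    if as_dosed_ug <= 0:
        raise ValueError("dose must be positive")
    return as_excreted_ug / as_dosed_ug


def relative_bioavailability(uef_soil, uef_reference):
    """RBA as the ratio of soil UEF to reference (sodium arsenate) UEF.

    Assumed definition for illustration.
    """
    if uef_reference <= 0:
        raise ValueError("reference UEF must be positive")
    return uef_soil / uef_reference


if __name__ == "__main__":
    # Hypothetical animal group: 31 ug As excreted out of 100 ug ingested in soil,
    # compared against the 62% reference UEF reported for sodium arsenate.
    uef_soil = urinary_excretion_fraction(31.0, 100.0)
    print(relative_bioavailability(uef_soil, 0.62))
```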
Distinguishing between cancer driver and passenger gene alteration candidates via cross-species comparison: a pilot study.

PubMed

Ji, Xinglai; Tang, Jie; Halberg, Richard; Busam, Dana; Ferriera, Steve; Peña, Maria Marjorette O; Venkataramu, Chinnambally; Yeatman, Timothy J; Zhao, Shaying

2010-08-13

We are developing a cross-species comparison strategy to distinguish between cancer driver- and passenger gene alteration candidates, by utilizing the difference in genomic location of orthologous genes between the human and other mammals. As an initial test of this strategy, we conducted a pilot study with human colorectal cancer (CRC) and its mouse model C57BL/6J ApcMin/+, focusing on human 5q22.2 and 18q21.1-q21.2. We first performed bioinformatics analysis on the evolution of 5q22.2 and 18q21.1-q21.2 regions. Then, we performed exon-targeted sequencing, real time quantitative polymerase chain reaction (qPCR), and real time quantitative reverse transcriptase PCR (qRT-PCR) analyses on a number of genes of both regions with both human and mouse colon tumors. These two regions (5q22.2 and 18q21.1-q21.2) are frequently deleted in human CRCs and encode genuine colorectal tumor suppressors APC and SMAD4. They also encode genes such as MCC (mutated in colorectal cancer) with their role in CRC etiology unknown. We have discovered that both regions are evolutionarily unstable, resulting in genes that are clustered in each human region being found scattered at several distinct loci in the genome of many other species. For instance, APC and MCC are within 200 kb of each other in human 5q22.2 but are 10 Mb apart in the mouse genome. Importantly, our analyses revealed that, while known CRC driver genes APC and SMAD4 were disrupted in both human colorectal tumors and tumors from ApcMin/+ mice, the questionable MCC gene was disrupted in human tumors but appeared to be intact in mouse tumors. These results indicate that MCC may not actually play any causative role in early colorectal tumorigenesis. We also hypothesize that its disruption in human CRCs is likely a mere result of its close proximity to APC in the human genome. Expanding this pilot study to the entire genome may identify more questionable genes like MCC, facilitating the discovery of new CRC driver gene candidates.
Necklace: combining reference and assembled transcriptomes for more comprehensive RNA-Seq analysis.

PubMed

Davidson, Nadia M; Oshlack, Alicia

2018-05-01

RNA sequencing (RNA-seq) analyses can benefit from performing a genome-guided and de novo assembly, in particular for species where the reference genome or the annotation is incomplete. However, tools for integrating an assembled transcriptome with reference annotation are lacking. Necklace is a software pipeline that runs genome-guided and de novo assembly and combines the resulting transcriptomes with reference genome annotations. Necklace constructs a compact but comprehensive superTranscriptome out of the assembled and reference data. Reads are subsequently aligned and counted in preparation for differential expression testing. Necklace allows a comprehensive transcriptome to be built from a combination of assembled and annotated transcripts, which results in a more comprehensive transcriptome for the majority of organisms. In addition RNA-seq data are mapped back to this newly created superTranscript reference to enable differential expression testing with standard methods.
High-throughput behavioral phenotyping of drug and alcohol susceptibility traits in the expanded panel of BXD recombinant inbred strains

DOE Office of Scientific and Technical Information (OSTI.GOV)

Philip, Vivek M; Ansah, T; Blaha, C,

Genetic reference populations, particularly the BXD recombinant inbred strains, are a valuable resource for the discovery of the bio-molecular substrates and genetic drivers responsible for trait variation and co-variation. This approach can be profitably applied in the analysis of susceptibility and mechanisms of drug and alcohol use disorders for which many predisposing behaviors may predict occurrence and manifestation of increased preference for these substances. Many of these traits are modeled by common mouse behavioral assays, facilitating the detection of patterns and sources of genetic co-regulation of predisposing phenotypes and substance consumption. Members of the Tennessee Mouse Genome Consortium have obtained behavioral phenotype data from 260 measures related to multiple behavioral assays across several domains: self-administration, response to, and withdrawal from cocaine, MDMA, morphine and alcohol; novelty seeking; behavioral despair and related neurological phenomena; pain sensitivity; stress sensitivity; anxiety; hyperactivity; and sleep/wake cycles. All traits have been measured in both sexes and the recently expanded panel of 69 additional BXD recombinant inbred strains (N=69). Sex differences and heritability estimates were obtained for each trait, and a comparison of early (N = 32) and recent BXD RI lines was performed. Primary data is publicly available for heritability, sex difference and genetic analyses using www.GeneNetwork.org. These analyses include QTL detection and genetic analysis of gene expression. Stored results from these analyses are available at http://ontologicaldiscovery.org for comparison to other genomic analysis results. Together with the results of related studies, these data form a public resource for integrative systems genetic analysis of neurobehavioral traits.
Multi-tissue DNA methylation age predictor in mouse.

PubMed

Stubbs, Thomas M; Bonder, Marc Jan; Stark, Anne-Katrien; Krueger, Felix; von Meyenn, Ferdinand; Stegle, Oliver; Reik, Wolf

2017-04-11

DNA methylation changes at a discrete set of sites in the human genome are predictive of chronological and biological age. However, it is not known whether these changes are causative or a consequence of an underlying ageing process. It has also not been shown whether this epigenetic clock is unique to humans or conserved in the more experimentally tractable mouse. We have generated a comprehensive set of genome-scale base-resolution methylation maps from multiple mouse tissues spanning a wide range of ages. Many CpG sites show significant tissue-independent correlations with age which allowed us to develop a multi-tissue predictor of age in the mouse. Our model, which estimates age based on DNA methylation at 329 unique CpG sites, has a median absolute error of 3.33 weeks and has similar properties to the recently described human epigenetic clock. Using publicly available datasets, we find that the mouse clock is accurate enough to measure effects on biological age, including in the context of interventions. While females and males show no significant differences in predicted DNA methylation age, ovariectomy results in significant age acceleration in females. Furthermore, we identify significant differences in age-acceleration dependent on the lipid content of the diet. Here we identify and characterise an epigenetic predictor of age in mice, the mouse epigenetic clock. This clock will be instrumental for understanding the biology of ageing and will allow modulation of its ticking rate and resetting the clock in vivo to study the impact on biological age.
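The accuracy figure quoted in this abstract is a median absolute error between predicted and chronological age. The sketch below shows how such a figure could be computed for any age predictor. The linear form of `predict_age`, the CpG identifiers, and the weights are illustrative assumptions and not the published model.

```python
from statistics import median


def predict_age(methylation, weights, intercept=0.0):
    """Linear predictor: intercept plus weighted sum of methylation levels.

    `methylation` and `weights` map CpG identifiers to values; the linear
    form and the identifiers are assumptions made for illustration.
    """
    return intercept + sum(weights[cpg] * methylation[cpg] for cpg in weights)


def median_absolute_error(predicted, actual):
    """Median of |predicted - actual| over paired samples."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return median(abs(p - a) for p, a in zip(predicted, actual))


if __name__ == "__main__":
    # Hypothetical ages in weeks for four animals.
    print(median_absolute_error([10, 22, 35, 41], [12, 20, 30, 40]))
```

The median is preferred over the mean here because a few badly predicted samples do not dominate the summary.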
Extensive sequencing of seven human genomes to characterize benchmark reference materials

PubMed Central

Zook, Justin M.; Catoe, David; McDaniel, Jennifer; Vang, Lindsay; Spies, Noah; Sidow, Arend; Weng, Ziming; Liu, Yuling; Mason, Christopher E.; Alexander, Noah; Henaff, Elizabeth; McIntyre, Alexa B.R.; Chandramohan, Dhruva; Chen, Feng; Jaeger, Erich; Moshrefi, Ali; Pham, Khoa; Stedman, William; Liang, Tiffany; Saghbini, Michael; Dzakula, Zeljko; Hastie, Alex; Cao, Han; Deikus, Gintaras; Schadt, Eric; Sebra, Robert; Bashir, Ali; Truty, Rebecca M.; Chang, Christopher C.; Gulbahce, Natali; Zhao, Keyan; Ghosh, Srinka; Hyland, Fiona; Fu, Yutao; Chaisson, Mark; Xiao, Chunlin; Trow, Jonathan; Sherry, Stephen T.; Zaranek, Alexander W.; Ball, Madeleine; Bobe, Jason; Estep, Preston; Church, George M.; Marks, Patrick; Kyriazopoulou-Panagiotopoulou, Sofia; Zheng, Grace X.Y.; Schnall-Levin, Michael; Ordonez, Heather S.; Mudivarti, Patrice A.; Giorda, Kristina; Sheng, Ying; Rypdal, Karoline Bjarnesdatter; Salit, Marc

2016-01-01

The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST) is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly. PMID:27271295

Applications and Limitations of Mouse Models for Understanding Human Atherosclerosis

PubMed Central

von Scheidt, Moritz; Zhao, Yuqi; Kurt, Zeyneb; Pan, Calvin; Zeng, Lingyao; Yang, Xia; Schunkert, Heribert; Lusis, Aldons J.

2017-01-01

Most of the biological understanding of mechanisms underlying coronary artery disease (CAD) derives from studies of mouse models. The identification of multiple CAD loci and strong candidate genes in large human genome-wide association studies (GWAS) presented an opportunity to examine the relevance of mouse models for the human disease. We comprehensively reviewed the mouse literature, including 827 literature-derived genes, and compared it to human data. First, we observed striking concordance of risk factors for atherosclerosis in mice and humans. Second, there was highly significant overlap of mouse genes with human genes identified by GWAS. In particular, of the 46 genes with strong association signals in CAD-GWAS that were studied in mouse models all but one exhibited consistent effects on atherosclerosis-related phenotypes. Third, we compared 178 CAD-associated pathways derived from human GWAS with 263 from mouse studies and observed that over 50% were consistent between both species. PMID:27916529
Mouse chromosomal mapping of a murine leukemia virus integration region (Mis-1) first identified in rat thymic leukemia.

PubMed Central

Jolicoeur, P; Villeneuve, L; Rassart, E; Kozak, C

1985-01-01

We have previously identified a region of genomic DNA which constitutes the site of frequent provirus integration in rat thymomas induced by Moloney murine leukemia virus (Lemay and Jolicoeur, Proc. Natl. Acad. Sci. USA 81:38-42, 1984). This genetic locus is now designated Mis-1 (Moloney integration site). Cellular sequences homologous to Mis-1 are present in mouse DNA. Using a series of hamster-mouse somatic cell hybrids, we mapped the Mis-1 locus to mouse chromosome 15. Frequent chromosome 15 aberrations have been described in mouse thymomas. Mis-1 represents a putative new oncogene which might be involved in the initiation or maintenance or both of these neoplasms. PMID:4068142
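Mapping a locus with a somatic cell hybrid panel rests on concordance: each hybrid retains a different subset of mouse chromosomes, and the locus is assigned to the chromosome whose presence or absence matches detection of the locus across the panel. The sketch below illustrates that logic only. The panel contents and function name are hypothetical and are not taken from the study.

```python
def discordance_by_chromosome(hybrids):
    """Fraction of hybrids in which chromosome presence disagrees with locus detection.

    `hybrids` is a list of (retained_chromosomes, locus_detected) pairs, where
    retained_chromosomes is a set of chromosome names and locus_detected is a bool.
    """
    chromosomes = set().union(*(retained for retained, _ in hybrids))
    scores = {}
    for chrom in chromosomes:
        mismatches = sum((chrom in retained) != detected
                         for retained, detected in hybrids)
        scores[chrom] = mismatches / len(hybrids)
    return scores


if __name__ == "__main__":
    # Hypothetical five-hybrid panel.
    panel = [
        ({"2", "15"}, True),
        ({"2", "7"}, False),
        ({"15", "X"}, True),
        ({"7", "X"}, False),
        ({"2", "15", "7"}, True),
    ]
    scores = discordance_by_chromosome(panel)
    print(min(scores, key=scores.get), scores)
```

A chromosome with zero discordance across a sufficiently diverse panel is the candidate assignment; real panels are larger so that no two chromosomes share the same retention pattern.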
Construction and Annotation of a High Density SNP Linkage Map of the Atlantic Salmon (Salmo salar) Genome.

PubMed

Tsai, Hsin Y; Robledo, Diego; Lowe, Natalie R; Bekaert, Michael; Taggart, John B; Bron, James E; Houston, Ross D

2016-07-07

High density linkage maps are useful tools for fine-scale mapping of quantitative trait loci, and characterization of the recombination landscape of a species' genome. Genomic resources for Atlantic salmon (Salmo salar) include a well-assembled reference genome, and high density single nucleotide polymorphism (SNP) arrays. Our aim was to create a high density linkage map, and to align it with the reference genome assembly. Over 96,000 SNPs were mapped and ordered on the 29 salmon linkage groups using a pedigreed population comprising 622 fish from 60 nuclear families, all genotyped with the 'ssalar01' high density SNP array. The number of SNPs per group showed a high positive correlation with physical chromosome length (r = 0.95). While the order of markers on the genetic and physical maps was generally consistent, areas of discrepancy were identified. Approximately 6.5% of the previously unmapped reference genome sequence was assigned to chromosomes using the linkage map. Male recombination rate was lower than that of females across the vast majority of the genome, but with a notable peak in subtelomeric regions. Finally, using RNA-Seq data to annotate the reference genome, the mapped SNPs were categorized according to their predicted function, including annotation of ∼2500 putative nonsynonymous variants. The highest density SNP linkage map for any salmonid species has been created, annotated, and integrated with the Atlantic salmon reference genome assembly. This map highlights the marked heterochiasmy of salmon, and provides a useful resource for salmonid genetics and genomics research. Copyright © 2016 Tsai et al.
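The reported relationship between SNP count per linkage group and physical chromosome length is a Pearson correlation. A minimal sketch of that calculation follows. The chromosome lengths and SNP counts in the usage example are made-up placeholders, not values from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Placeholder data: chromosome length (Mb) and number of mapped SNPs.
    lengths_mb = [160.0, 95.0, 120.0, 60.0, 80.0]
    snp_counts = [6100, 3400, 4700, 2300, 2900]
    print(round(pearson_r(lengths_mb, snp_counts), 3))
```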
Haploinsufficiency of Anx7 tumor suppressor gene and consequent genomic instability promotes tumorigenesis in the Anx7(+/-) mouse

PubMed Central

Srivastava, Meera; Montagna, Cristina; Leighton, Ximena; Glasman, Mirta; Naga, Shanmugam; Eidelman, Ofer; Ried, Thomas; Pollard, Harvey B.

2003-01-01

Annexin 7 (ANX7) acts as a tumor suppressor gene in prostate cancer, where loss of heterozygosity and reduction of ANX7 protein expression is associated with aggressive metastatic tumors. To investigate the mechanism by which this gene controls tumor development, we have developed an Anx7(+/-) knockout mouse. As hypothesized, the Anx7(+/-) mouse has a cancer-prone phenotype. The emerging tumors express low levels of Anx7 protein. Nonetheless, the wild-type Anx7 allele is detectable in laser-capture microdissection-derived tumor tissue cells. Genome array analysis of hepatocellular carcinoma tissue indicates that the Anx7(+/-) genotype is accompanied by profound reductions of expression of several other tumor suppressor genes, DNA repair genes, and apoptosis-related genes. In situ analysis by tissue imprinting from chromosomes in the primary tumor and spectral karyotyping analysis of derived cell lines identify chromosomal instability and clonal chromosomal aberrations. Furthermore, whereas 23% of the mutant mice develop spontaneous neoplasms, all mice exhibit growth anomalies, including gender-specific gigantism and organomegaly. We conclude that haploinsufficiency of Anx7 expression appears to drive disease progression to cancer because of genomic instability through a discrete signaling pathway involving other tumor suppressor genes, DNA-repair genes, and apoptosis-related genes. PMID:14608035
Genomic interval engineering of mice identified a novel modulator of triglyceride production

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Y.; Jong, M.C.; Frazer, K.A.

1999-10-01

To accelerate the biological annotation of novel genes discovered in sequenced regions of mammalian genomes, we are creating large deletions in the mouse genome targeted to include clusters of such genes. Here we describe the targeted deletion of a 450 kb region on mouse chromosome 11 which, based on computational analysis of the deleted murine sequences and human 5q orthologous sequences, codes for nine putative genes. Mice homozygous for the deletion had a variety of abnormalities including severe hypertriglyceridemia, hepatic and cardiac enlargement, growth retardation and premature mortality. Analysis of triglyceride metabolism in these animals demonstrated a several-fold increase in hepatic very-low density lipoprotein (VLDL) triglyceride secretion, the most prevalent mechanism responsible for hypertriglyceridemia in humans. A series of mouse BAC and human YAC transgenes covering different intervals of the 450 kb deleted region were assessed for their ability to complement the deletion induced abnormalities. These studies revealed that OCTN2, a gene recently shown to play a role in carnitine transport, was able to correct the triglyceride abnormalities. The discovery of this previously unappreciated relationship between OCTN2, carnitine and hepatic triglyceride production is of particular importance due to the clinical consequence of hypertriglyceridemia and the paucity of genes known to modulate triglyceride secretion.
Multiethnic genome-wide meta-analysis of ectopic fat depots identifies loci associated with adipocyte development and differentiation.

PubMed

Chu, Audrey Y; Deng, Xuan; Fisher, Virginia A; Drong, Alexander; Zhang, Yang; Feitosa, Mary F; Liu, Ching-Ti; Weeks, Olivia; Choh, Audrey C; Duan, Qing; Dyer, Thomas D; Eicher, John D; Guo, Xiuqing; Heard-Costa, Nancy L; Kacprowski, Tim; Kent, Jack W; Lange, Leslie A; Liu, Xinggang; Lohman, Kurt; Lu, Lingyi; Mahajan, Anubha; O'Connell, Jeffrey R; Parihar, Ankita; Peralta, Juan M; Smith, Albert V; Zhang, Yi; Homuth, Georg; Kissebah, Ahmed H; Kullberg, Joel; Laqua, René; Launer, Lenore J; Nauck, Matthias; Olivier, Michael; Peyser, Patricia A; Terry, James G; Wojczynski, Mary K; Yao, Jie; Bielak, Lawrence F; Blangero, John; Borecki, Ingrid B; Bowden, Donald W; Carr, John Jeffrey; Czerwinski, Stefan A; Ding, Jingzhong; Friedrich, Nele; Gudnason, Vilmunder; Harris, Tamara B; Ingelsson, Erik; Johnson, Andrew D; Kardia, Sharon L R; Langefeld, Carl D; Lind, Lars; Liu, Yongmei; Mitchell, Braxton D; Morris, Andrew P; Mosley, Thomas H; Rotter, Jerome I; Shuldiner, Alan R; Towne, Bradford; Völzke, Henry; Wallaschofski, Henri; Wilson, James G; Allison, Matthew; Lindgren, Cecilia M; Goessling, Wolfram; Cupples, L Adrienne; Steinhauser, Matthew L; Fox, Caroline S

2017-01-01

Variation in body fat distribution contributes to the metabolic sequelae of obesity. The genetic determinants of body fat distribution are poorly understood. The goal of this study was to gain new insights into the underlying genetics of body fat distribution by conducting sample-size-weighted fixed-effects genome-wide association meta-analyses in up to 9,594 women and 8,738 men of European, African, Hispanic and Chinese ancestry, with and without sex stratification, for six traits associated with ectopic fat (hereinafter referred to as ectopic-fat traits). In total, we identified seven new loci associated with ectopic-fat traits (ATXN1, UBE2E2, EBF1, RREB1, GSDMB, GRAMD3 and ENSA; P < 5 × 10⁻⁸; false discovery rate < 1%). Functional analysis of these genes showed that loss of function of either Atxn1 or Ube2e2 in primary mouse adipose progenitor cells impaired adipocyte differentiation, suggesting physiological roles for ATXN1 and UBE2E2 in adipogenesis. Future studies are necessary to further explore the mechanisms by which these genes affect adipocyte biology and how their perturbations contribute to systemic metabolic disease.
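A sample-size-weighted meta-analysis combines per-study Z-scores for a variant, weighting each by the square root of its sample size. The sketch below shows that general scheme (a weighted Stouffer combination) and a two-sided P value checked against the genome-wide threshold quoted in the abstract. It is an illustration of the technique, not the study's pipeline, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

GENOME_WIDE_THRESHOLD = 5e-8  # threshold quoted in the abstract


def sample_size_weighted_z(studies):
    """Combine (z_score, sample_size) pairs with sqrt(n) weights."""
    studies = list(studies)
    if not studies:
        raise ValueError("need at least one study")
    numerator = sum(math.sqrt(n) * z for z, n in studies)
    denominator = math.sqrt(sum(n for _, n in studies))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value for a standard normal Z-score."""
    return 2 * NormalDist().cdf(-abs(z))


if __name__ == "__main__":
    # Hypothetical per-cohort Z-scores and sample sizes for one variant.
    z = sample_size_weighted_z([(3.1, 4000), (2.4, 2500), (3.8, 6000)])
    p = two_sided_p(z)
    print(z, p, p < GENOME_WIDE_THRESHOLD)
```

Because the Z-scores carry the direction of effect, studies with opposite signs cancel, which is why alleles must be aligned to the same reference across cohorts before combining.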
Expression of Genomic Functional Estrogen Receptor 1 in Mouse Sertoli Cells

PubMed Central

Lin, Jing; Zhu, Jia; Li, Xian; Li, Shengqiang; Lan, Zijian; Ko, Jay

2014-01-01

There is no consensus on whether Sertoli cells express estrogen receptor 1 (Esr1). Reverse transcription-polymerase chain reaction, Western blot, and immunofluorescence demonstrated that mouse Sertoli cell lines, TM4, MSC-1, and 15P-1, and purified primary mouse Sertoli cells (PSCs) contained Esr1 messenger RNA and proteins. Incubation of Sertoli cells with 17β-estradiol (E2) or ESR1 agonist stimulated the expression of an estrogen responsive gene Greb1, which was prevented by ESR inhibitor or ESR1 antagonist. Overexpression of Esr1 in MSC-1 enhanced E2-induced Greb1 expression, while knockdown of Esr1 by small interfering RNA in TM4 attenuated the response. Furthermore, E2-induced Greb1 expression was abolished in the PSCs isolated from Amh-Cre/Esr1-floxed mice in which Esr1 in Sertoli cells was selectively deleted. Chromatin immunoprecipitation assays indicated that E2-induced Greb1 expression in Sertoli cells was mediated by binding of ESR1 to estrogen responsive elements. In summary, ligand-dependent nuclear ESR1 is present in mouse Sertoli cells and mediates a classical genomic action of estrogens. PMID:24615934
Chromatin Immunoprecipitation in Early Mouse Embryos.

PubMed

García-González, Estela G; Roque-Ramirez, Bladimir; Palma-Flores, Carlos; Hernández-Hernández, J Manuel

2018-01-01

Epigenetic regulation is achieved at many levels by different factors such as tissue-specific transcription factors, members of the basal transcriptional apparatus, chromatin-binding proteins, and noncoding RNAs. Importantly, chromatin structure dictates the availability of a specific genomic locus for transcriptional activation as well as the efficiency with which transcription can occur. Chromatin immunoprecipitation (ChIP) is a method that allows elucidating gene regulation at the molecular level by assessing if chromatin modifications or proteins are present at a specific locus. Initially, the majority of ChIP experiments were performed on cultured cell lines and more recently this technique has been adapted to a variety of tissues in different model organisms. Using ChIP on mouse embryos, it is possible to document the presence or absence of specific proteins and chromatin modifications at genomic loci in vivo during mammalian development and to get biological meaning from observations made on tissue culture analyses. We describe here a ChIP protocol on freshly isolated mouse embryonic somites for in vivo analysis of muscle specific transcription factor binding on chromatin. This protocol has been easily adapted to other mouse embryonic tissues and has also been successfully scaled up to perform ChIP-Seq.
Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis

PubMed Central

Grassi, Elena; Damasco, Christian; Silengo, Lorenzo; Oti, Martin; Provero, Paolo; Di Cunto, Ferdinando

2008-01-01

Background: Even in the post-genomic era, the identification of candidate genes within loci associated with human genetic diseases is a very demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share very similar expression profiles, high throughput gene expression data may represent a very important resource to identify the best candidates for sequencing. However, so far, gene coexpression has not been used very successfully to prioritize positional candidates. Methodology/Principal Findings: We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases. Conclusion: Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes. PMID:18369433
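The core idea of conserved coexpression, keeping only gene pairs whose expression profiles are similar in both species, can be sketched as follows. This is a simplified illustration under stated assumptions: similarity is measured with Pearson correlation against a fixed threshold (the study's actual criterion may differ), orthologous genes are assumed to share the same key in both dictionaries, and all names and data are hypothetical.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)


def coexpressed_pairs(expression, threshold):
    """Gene pairs whose profiles correlate at or above `threshold`.

    `expression` maps gene name -> list of expression values across samples.
    """
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(expression), 2)
        if pearson_r(expression[a], expression[b]) >= threshold
    }


def conserved_coexpression(human, mouse, threshold=0.9):
    """Pairs coexpressed in both species, restricted to genes present in both."""
    shared = set(human) & set(mouse)
    human_pairs = coexpressed_pairs({g: human[g] for g in shared}, threshold)
    mouse_pairs = coexpressed_pairs({g: mouse[g] for g in shared}, threshold)
    return human_pairs & mouse_pairs
```

Requiring agreement across species filters out pairs that correlate in one dataset for technical or species-specific reasons, at the cost of discarding genes without a clear ortholog.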
Human Heart Mitochondrial DNA Is Organized in Complex Catenated Networks Containing Abundant Four-way Junctions and Replication Forks

PubMed Central

Pohjoismäki, Jaakko L. O.; Goffart, Steffi; Tyynismaa, Henna; Willcox, Smaranda; Ide, Tomomi; Kang, Dongchon; Suomalainen, Anu; Karhunen, Pekka J.; Griffith, Jack D.; Holt, Ian J.; Jacobs, Howard T.

2009-01-01

Analysis of human heart mitochondrial DNA (mtDNA) by electron microscopy and agarose gel electrophoresis revealed a complete absence of the θ-type replication intermediates seen abundantly in mtDNA from all other tissues. Instead only Y- and X-junctional forms were detected after restriction digestion. Uncut heart mtDNA was organized in tangled complexes of up to 20 or more genome equivalents, which could be resolved to genomic monomers, dimers, and linear fragments by treatment with the decatenating enzyme topoisomerase IV plus the cruciform-cutting T7 endonuclease I. Human and mouse brain also contained a population of such mtDNA forms, which were absent, however, from mouse, rabbit, or pig heart. Overexpression in transgenic mice of two proteins involved in mtDNA replication, namely human mitochondrial transcription factor A or the mouse Twinkle DNA helicase, generated abundant four-way junctions in mtDNA of heart, brain, and skeletal muscle. The organization of mtDNA of human heart as well as of mouse and human brain in complex junctional networks replicating via a presumed non-θ mechanism is unprecedented in mammals. PMID:19525233
ReprDB and panDB: minimalist databases with maximal microbial representation.

PubMed

Zhou, Wei; Gay, Nicole; Oh, Julia

2018-01-18

Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible with various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets. reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.
Completed Genome Sequences of Strains from 36 Serotypes of Salmonella

PubMed Central

Robertson, James; Yoshida, Catherine; Gurnik, Simone; Rankin, Marisa

2018-01-01

We report here the completed closed genome sequences of strains representing 36 serotypes of Salmonella. These genome sequences will provide useful references for understanding the genetic variation between serotypes, particularly as references for mapping of raw reads or to create assemblies of higher quality, as well as to aid in studies of comparative genomics of Salmonella. PMID:29348347
Somatic cell nuclear transfer: infinite reproduction of a unique diploid genome.

PubMed

Kishigami, Satoshi; Wakayama, Sayaka; Hosoi, Yoshihiko; Iritani, Akira; Wakayama, Teruhiko

2008-06-10

In mammals, the diploid genome of an individual formed by fertilization of an egg by a spermatozoon is unique and irreproducible. This implies that the unique diploid genome is doomed to end with the individual. Even cells cultured from the individual cannot normally proliferate in perpetuity because of the "Hayflick limit". However, Dolly, the sheep cloned from an adult mammary gland cell, changed this scenario. Somatic cell nuclear transfer (SCNT) enables us to produce offspring without germ cells, that is, to "passage" a unique diploid genome. Animal cloning has also proven to be a powerful research tool for reprogramming in many mammals, notably mouse and cow. The mechanism underlying reprogramming, however, remains largely unknown, and animal cloning has been inefficient as a result. More momentously, in addition to abortion and fetal mortality, some cloned animals display possible premature aging phenotypes including early death and short telomere lengths. Under these inauspicious conditions, is it really possible for SCNT to preserve a diploid genome? Delightfully, in mouse and recently in primate, using SCNT we can produce nuclear transfer ES cells (ntES) more efficiently, which can preserve the eternal lifespan for the "passage" of a unique diploid genome. Further, a new somatic cloning technique using histone-deacetylase inhibitors has been developed, which can significantly increase previous cloning rates two- to six-fold. Here, we introduce SCNT and its value as a preservation tool for a diploid genome while reviewing aging of cloned animals on cellular and individual levels.
A versatile genome-scale PCR-based pipeline for high-definition DNA FISH.

PubMed

Bienko, Magda; Crosetto, Nicola; Teytelman, Leonid; Klemm, Sandy; Itzkovitz, Shalev; van Oudenaarden, Alexander

2013-02-01

We developed a cost-effective genome-scale PCR-based method for high-definition DNA FISH (HD-FISH). We visualized gene loci with diffraction-limited resolution, chromosomes as spot clusters and single genes together with transcripts by combining HD-FISH with single-molecule RNA FISH. We provide a database of over 4.3 million primer pairs targeting the human and mouse genomes that is readily usable for rapid and flexible generation of probes.
Genotype Imputation for Latinos Using the HapMap and 1000 Genomes Project Reference Panels.

PubMed

Gao, Xiaoyi; Haritunians, Talin; Marjoram, Paul; McKean-Cowdin, Roberta; Torres, Mina; Taylor, Kent D; Rotter, Jerome I; Gauderman, William J; Varma, Rohit

2012-01-01

Genotype imputation is a vital tool in genome-wide association studies (GWAS) and meta-analyses of multiple GWAS results. Imputation enables researchers to increase genomic coverage and to pool data generated using different genotyping platforms. HapMap samples are often employed as the reference panel. More recently, the 1000 Genomes Project resource is becoming the primary source for reference panels. Multiple GWAS and meta-analyses are targeting Latinos, the most populous and fastest growing minority group in the US. However, genotype imputation resources for Latinos are rather limited compared to individuals of European ancestry at present, largely because of the lack of good reference data. One choice of reference panel for Latinos is one derived from the population of Mexican individuals in Los Angeles contained in the HapMap Phase 3 project and the 1000 Genomes Project. However, a detailed evaluation of the quality of the imputed genotypes derived from the public reference panels has not yet been reported. Using simulation studies, the Illumina OmniExpress GWAS data from the Los Angeles Latino Eye Study and the MACH software package, we evaluated the accuracy of genotype imputation in Latinos. Our results show that the 1000 Genomes Project AMR + CEU + YRI reference panel provides the highest imputation accuracy for Latinos, and that also including Asian samples in the panel can reduce imputation accuracy. We also provide the imputation accuracy for each autosomal chromosome using the 1000 Genomes Project panel for Latinos. Our results serve as a guide to future imputation-based analysis in Latinos.
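The abstract above does not say which accuracy metric was used. As a minimal sketch, assuming genotypes coded as 0, 1 or 2 copies of the alternate allele, two commonly used measures are the concordance of best-guess genotypes with masked true genotypes and the squared correlation between true genotypes and imputed dosages. The function names and the toy data below are illustrative only and are not taken from the study.

```python
from statistics import mean

def concordance(true_genotypes, imputed_genotypes):
    """Fraction of masked genotypes (0, 1 or 2 alt alleles) imputed exactly."""
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)

def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    mt, md = mean(true_genotypes), mean(imputed_dosages)
    cov = sum((t - mt) * (d - md) for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mt) ** 2 for t in true_genotypes)
    var_d = sum((d - md) ** 2 for d in imputed_dosages)
    return cov * cov / (var_t * var_d)

# Toy data, made up for illustration (not from the study)
truth = [0, 1, 2, 1, 0, 2]
best_guess = [0, 1, 2, 0, 0, 2]
dosages = [0.1, 0.9, 1.8, 0.4, 0.0, 2.0]

print(concordance(truth, best_guess))       # 5 of 6 genotypes match
print(round(dosage_r2(truth, dosages), 3))  # value between 0 and 1
```

Comparing such metrics across candidate reference panels is one way the panel ranking described in the abstract could be reproduced on other data.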
Hush puppy: a new mouse mutant with pinna, ossicle, and inner ear defects.

PubMed

Pau, Henry; Fuchs, Helmut; de Angelis, Martin Hrabé; Steel, Karen P

2005-01-01

Deafness can be associated with abnormalities of the pinna, ossicles, and cochlea. The authors studied a newly generated mouse mutant with pinna defects and asked whether these defects are associated with peripheral auditory or facial skeletal abnormalities, or both. Furthermore, the authors investigated where the mutation responsible for these defects was located in the mouse genome. The hearing of hush puppy mutants was assessed by Preyer reflex and electrophysiological measurement. The morphological features of their middle and inner ears were investigated by microdissection, paint-filling of the labyrinth, and scanning electron microscopy. Skeletal staining of skulls was performed to assess the craniofacial dimensions. Genome scanning was performed using microsatellite markers to localize the mutation to a chromosomal region. Some hush puppy mutants showed early onset of hearing impairment. They had small, bat-like pinnae and normal malleus but abnormal incus and stapes. Some mutants had asymmetrical defects and showed reduced penetrance of the ear abnormalities. Paint-filling of newborns' inner ears revealed no morphological abnormality, although half of the mice studied were expected to carry the mutation. Reduced numbers of outer hair cells were demonstrated in mutants' cochlea on scanning electron microscopy. Skeletal staining showed that the mutants have significantly shorter snouts and mandibles. Genome scan revealed that the mutation lies on chromosome 8 between markers D8Mit58 and D8Mit289. The study results indicate developmental problems of the first and second branchial arches and otocyst as a result of a single gene mutation. Similar defects are found in humans, and hush puppy provides a mouse model for investigation of such defects.
Genomic localization of the Z/EG transgene in the mouse genome.

PubMed

Colombo, Sophie; Kumasaka, Mayuko; Lobe, Corrinne; Larue, Lionel

2010-02-01

The Z/EG transgenic mouse line, produced by Novak et al., displays tissue-specific EGFP expression after Cre-mediated recombination. The autofluorescence of EGFP allows the visualization of cells of interest displaying Cre recombination. The initial construct was designed such that cells without Cre recombination express the beta-galactosidase marker, facilitating counterselection. We used inverse PCR to identify the site of integration of the Z/EG transgene, to improve the efficiency of homozygous Z/EG mouse production. Recombined cells produced large amounts of EGFP protein, resulting in higher levels of fluorescence and therefore greater contrast with nonrecombined cells. We mapped the transgene to the G1 region of chromosome 5. This random insertion was found to have occurred 230-bp upstream from the start codon of the Rasa4 gene. The insertion of the Z/EG transgene in the C57BL/6 genetic background had no effect on Rasa4 expression. Homozygous Z/EG mice therefore had no obvious phenotype. (c) 2009 Wiley-Liss, Inc.
Easi-CRISPR for creating knock-in and conditional knockout mouse models using long ssDNA donors.

PubMed

Miura, Hiromi; Quadros, Rolen M; Gurumurthy, Channabasavaiah B; Ohtsuka, Masato

2018-01-01

CRISPR/Cas9-based genome editing can easily generate knockout mouse models by disrupting the gene sequence, but its efficiency for creating models that require either insertion of exogenous DNA (knock-in) or replacement of genomic segments is very poor. The majority of mouse models used in research involve knock-in (reporters or recombinases) or gene replacement (e.g., conditional knockout alleles containing exons flanked by LoxP sites). A few methods for creating such models have been reported that use double-stranded DNA as donors, but their efficiency is typically 1-10% and therefore not suitable for routine use. We recently demonstrated that long single-stranded DNAs (ssDNAs) serve as very efficient donors, both for insertion and for gene replacement. We call this method efficient additions with ssDNA inserts-CRISPR (Easi-CRISPR) because it is a highly efficient technology (efficiency is typically 30-60% and reaches as high as 100% in some cases). The protocol takes ∼2 months to generate the founder mice.
Disease Model Discovery from 3,328 Gene Knockouts by The International Mouse Phenotyping Consortium

PubMed Central

Meehan, Terrence F.; Conte, Nathalie; West, David B.; Jacobsen, Julius O.; Mason, Jeremy; Warren, Jonathan; Chen, Chao-Kung; Tudose, Ilinca; Relac, Mike; Matthews, Peter; Karp, Natasha; Santos, Luis; Fiegel, Tanja; Ring, Natalie; Westerberg, Henrik; Greenaway, Simon; Sneddon, Duncan; Morgan, Hugh; Codner, Gemma F; Stewart, Michelle E; Brown, James; Horner, Neil; Haendel, Melissa; Washington, Nicole; Mungall, Christopher J.; Reynolds, Corey L; Gallegos, Juan; Gailus-Durner, Valerie; Sorg, Tania; Pavlovic, Guillaume; Bower, Lynette R; Moore, Mark; Morse, Iva; Gao, Xiang; Tocchini-Valentini, Glauco P; Obata, Yuichi; Cho, Soo Young; Seong, Je Kyung; Seavitt, John; Beaudet, Arthur L.; Dickinson, Mary E.; Herault, Yann; Wurst, Wolfgang; de Angelis, Martin Hrabe; Lloyd, K.C. Kent; Flenniken, Ann M; Nutter, Lauryl MJ; Newbigging, Susan; McKerlie, Colin; Justice, Monica J.; Murray, Stephen A.; Svenson, Karen L.; Braun, Robert E.; White, Jacqueline K.; Bradley, Allan; Flicek, Paul; Wells, Sara; Skarnes, William C.; Adams, David J.; Parkinson, Helen; Mallon, Ann-Marie; Brown, Steve D.M.; Smedley, Damian

2017-01-01

Although next generation sequencing has revolutionised the ability to associate variants with human diseases, diagnostic rates and development of new therapies are still limited by our lack of knowledge of function and pathobiological mechanism for most genes. To address this challenge, the International Mouse Phenotyping Consortium (IMPC) is creating a genome- and phenome-wide catalogue of gene function by characterizing new knockout mouse strains across diverse biological systems through a broad set of standardised phenotyping tests, with all mice made readily available to the biomedical community. Analysing the first 3328 genes reveals models for 360 diseases including the first for type C Bernard-Soulier, Bardet-Biedl-5 and Gordon Holmes syndromes. 90% of our phenotype annotations are novel, providing the first functional evidence for 1092 genes and candidates in unsolved diseases such as Arrhythmogenic Right Ventricular Dysplasia 3. Finally, we describe our role in variant functional validation with the 100,000 Genomes and other projects. PMID:28650483
FANTOM5 CAGE profiles of human and mouse samples.

PubMed

Noguchi, Shuhei; Arakawa, Takahiro; Fukuda, Shiro; Furuno, Masaaki; Hasegawa, Akira; Hori, Fumi; Ishikawa-Kato, Sachi; Kaida, Kaoru; Kaiho, Ai; Kanamori-Katayama, Mutsumi; Kawashima, Tsugumi; Kojima, Miki; Kubosaki, Atsutaka; Manabe, Ri-Ichiroh; Murata, Mitsuyoshi; Nagao-Sato, Sayaka; Nakazato, Kenichi; Ninomiya, Noriko; Nishiyori-Sueki, Hiromi; Noma, Shohei; Saijyo, Eri; Saka, Akiko; Sakai, Mizuho; Simon, Christophe; Suzuki, Naoko; Tagami, Michihira; Watanabe, Shoko; Yoshida, Shigehiro; Arner, Peter; Axton, Richard A; Babina, Magda; Baillie, J Kenneth; Barnett, Timothy C; Beckhouse, Anthony G; Blumenthal, Antje; Bodega, Beatrice; Bonetti, Alessandro; Briggs, James; Brombacher, Frank; Carlisle, Ailsa J; Clevers, Hans C; Davis, Carrie A; Detmar, Michael; Dohi, Taeko; Edge, Albert S B; Edinger, Matthias; Ehrlund, Anna; Ekwall, Karl; Endoh, Mitsuhiro; Enomoto, Hideki; Eslami, Afsaneh; Fagiolini, Michela; Fairbairn, Lynsey; Farach-Carson, Mary C; Faulkner, Geoffrey J; Ferrai, Carmelo; Fisher, Malcolm E; Forrester, Lesley M; Fujita, Rie; Furusawa, Jun-Ichi; Geijtenbeek, Teunis B; Gingeras, Thomas; Goldowitz, Daniel; Guhl, Sven; Guler, Reto; Gustincich, Stefano; Ha, Thomas J; Hamaguchi, Masahide; Hara, Mitsuko; Hasegawa, Yuki; Herlyn, Meenhard; Heutink, Peter; Hitchens, Kelly J; Hume, David A; Ikawa, Tomokatsu; Ishizu, Yuri; Kai, Chieko; Kawamoto, Hiroshi; Kawamura, Yuki I; Kempfle, Judith S; Kenna, Tony J; Kere, Juha; Khachigian, Levon M; Kitamura, Toshio; Klein, Sarah; Klinken, S Peter; Knox, Alan J; Kojima, Soichi; Koseki, Haruhiko; Koyasu, Shigeo; Lee, Weonju; Lennartsson, Andreas; Mackay-Sim, Alan; Mejhert, Niklas; Mizuno, Yosuke; Morikawa, Hiromasa; Morimoto, Mitsuru; Moro, Kazuyo; Morris, Kelly J; Motohashi, Hozumi; Mummery, Christine L; Nakachi, Yutaka; Nakahara, Fumio; Nakamura, Toshiyuki; Nakamura, Yukio; Nozaki, Tadasuke; Ogishima, Soichi; Ohkura, Naganari; Ohno, Hiroshi; Ohshima, Mitsuhiro; Okada-Hatakeyama, Mariko; Okazaki, Yasushi; Orlando, Valerio; Ovchinnikov, Dmitry A; Passier, Robert; Patrikakis, Margaret; Pombo, Ana; Pradhan-Bhatt, Swati; Qin, Xian-Yang; Rehli, Michael; Rizzu, Patrizia; Roy, Sugata; Sajantila, Antti; Sakaguchi, Shimon; Sato, Hiroki; Satoh, Hironori; Savvi, Suzana; Saxena, Alka; Schmidl, Christian; Schneider, Claudio; Schulze-Tanzil, Gundula G; Schwegmann, Anita; Sheng, Guojun; Shin, Jay W; Sugiyama, Daisuke; Sugiyama, Takaaki; Summers, Kim M; Takahashi, Naoko; Takai, Jun; Tanaka, Hiroshi; Tatsukawa, Hideki; Tomoiu, Andru; Toyoda, Hiroo; van de Wetering, Marc; van den Berg, Linda M; Verardo, Roberto; Vijayan, Dipti; Wells, Christine A; Winteringham, Louise N; Wolvetang, Ernst; Yamaguchi, Yoko; Yamamoto, Masayuki; Yanagi-Mizuochi, Chiyo; Yoneda, Misako; Yonekura, Yohei; Zhang, Peter G; Zucchelli, Silvia; Abugessaisa, Imad; Arner, Erik; Harshbarger, Jayson; Kondo, Atsushi; Lassmann, Timo; Lizio, Marina; Sahin, Serkan; Sengstag, Thierry; Severin, Jessica; Shimoji, Hisashi; Suzuki, Masanori; Suzuki, Harukazu; Kawai, Jun; Kondo, Naoto; Itoh, Masayoshi; Daub, Carsten O; Kasukawa, Takeya; Kawaji, Hideya; Carninci, Piero; Forrest, Alistair R R; Hayashizaki, Yoshihide

2017-08-29

In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at single base-pair resolution and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, cell lines, and time series samples during cell activation and development, were subjected to a uniform pipeline of CAGE data production. The analysis pipeline started by measuring RNA extracts to assess their quality, and continued to CAGE library production by using a robotic or a manual workflow, single-molecule sequencing, and computational processing to generate frequencies of transcription initiation. The resulting data represent the consequence of transcriptional regulation in each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE profiles, approximately 200,000 and 150,000 peaks for the human and mouse genomes, respectively, were identified and annotated to provide precise locations of known promoters as well as novel ones, and to quantify their activities.
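As a rough illustration of what "frequencies of transcription initiation" can mean computationally, the sketch below counts CAGE tag 5' ends per genomic position and scales them to tags per million. The tuple layout, the function name and the per-million scaling are assumptions made for this example; they are not a description of the actual FANTOM5 pipeline.

```python
from collections import Counter

def initiation_frequencies(tag_starts):
    """Count tags per (chrom, strand, position) and scale to tags per million."""
    counts = Counter(tag_starts)
    total = sum(counts.values())
    return {site: n * 1_000_000 / total for site, n in counts.items()}

# Toy 5' ends of mapped tags, made up for illustration
tags = [
    ("chr1", "+", 1000),
    ("chr1", "+", 1000),
    ("chr1", "+", 1001),
    ("chr2", "-", 500),
]
tpm = initiation_frequencies(tags)
for site, value in sorted(tpm.items()):
    print(site, value)
```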

FANTOM5 CAGE profiles of human and mouse samples

PubMed Central

Noguchi, Shuhei; Arakawa, Takahiro; Fukuda, Shiro; Furuno, Masaaki; Hasegawa, Akira; Hori, Fumi; Ishikawa-Kato, Sachi; Kaida, Kaoru; Kaiho, Ai; Kanamori-Katayama, Mutsumi; Kawashima, Tsugumi; Kojima, Miki; Kubosaki, Atsutaka; Manabe, Ri-ichiroh; Murata, Mitsuyoshi; Nagao-Sato, Sayaka; Nakazato, Kenichi; Ninomiya, Noriko; Nishiyori-Sueki, Hiromi; Noma, Shohei; Saijyo, Eri; Saka, Akiko; Sakai, Mizuho; Simon, Christophe; Suzuki, Naoko; Tagami, Michihira; Watanabe, Shoko; Yoshida, Shigehiro; Arner, Peter; Axton, Richard A.; Babina, Magda; Baillie, J. Kenneth; Barnett, Timothy C.; Beckhouse, Anthony G.; Blumenthal, Antje; Bodega, Beatrice; Bonetti, Alessandro; Briggs, James; Brombacher, Frank; Carlisle, Ailsa J.; Clevers, Hans C.; Davis, Carrie A.; Detmar, Michael; Dohi, Taeko; Edge, Albert S.B.; Edinger, Matthias; Ehrlund, Anna; Ekwall, Karl; Endoh, Mitsuhiro; Enomoto, Hideki; Eslami, Afsaneh; Fagiolini, Michela; Fairbairn, Lynsey; Farach-Carson, Mary C.; Faulkner, Geoffrey J.; Ferrai, Carmelo; Fisher, Malcolm E.; Forrester, Lesley M.; Fujita, Rie; Furusawa, Jun-ichi; Geijtenbeek, Teunis B.; Gingeras, Thomas; Goldowitz, Daniel; Guhl, Sven; Guler, Reto; Gustincich, Stefano; Ha, Thomas J.; Hamaguchi, Masahide; Hara, Mitsuko; Hasegawa, Yuki; Herlyn, Meenhard; Heutink, Peter; Hitchens, Kelly J.; Hume, David A.; Ikawa, Tomokatsu; Ishizu, Yuri; Kai, Chieko; Kawamoto, Hiroshi; Kawamura, Yuki I.; Kempfle, Judith S.; Kenna, Tony J.; Kere, Juha; Khachigian, Levon M.; Kitamura, Toshio; Klein, Sarah; Klinken, S. Peter; Knox, Alan J.; Kojima, Soichi; Koseki, Haruhiko; Koyasu, Shigeo; Lee, Weonju; Lennartsson, Andreas; Mackay-sim, Alan; Mejhert, Niklas; Mizuno, Yosuke; Morikawa, Hiromasa; Morimoto, Mitsuru; Moro, Kazuyo; Morris, Kelly J.; Motohashi, Hozumi; Mummery, Christine L.; Nakachi, Yutaka; Nakahara, Fumio; Nakamura, Toshiyuki; Nakamura, Yukio; Nozaki, Tadasuke; Ogishima, Soichi; Ohkura, Naganari; Ohno, Hiroshi; Ohshima, Mitsuhiro; Okada-Hatakeyama, Mariko; Okazaki, Yasushi; Orlando, Valerio; Ovchinnikov, Dmitry A.; Passier, Robert; Patrikakis, Margaret; Pombo, Ana; Pradhan-Bhatt, Swati; Qin, Xian-Yang; Rehli, Michael; Rizzu, Patrizia; Roy, Sugata; Sajantila, Antti; Sakaguchi, Shimon; Sato, Hiroki; Satoh, Hironori; Savvi, Suzana; Saxena, Alka; Schmidl, Christian; Schneider, Claudio; Schulze-Tanzil, Gundula G.; Schwegmann, Anita; Sheng, Guojun; Shin, Jay W.; Sugiyama, Daisuke; Sugiyama, Takaaki; Summers, Kim M.; Takahashi, Naoko; Takai, Jun; Tanaka, Hiroshi; Tatsukawa, Hideki; Tomoiu, Andru; Toyoda, Hiroo; van de Wetering, Marc; van den Berg, Linda M.; Verardo, Roberto; Vijayan, Dipti; Wells, Christine A.; Winteringham, Louise N.; Wolvetang, Ernst; Yamaguchi, Yoko; Yamamoto, Masayuki; Yanagi-Mizuochi, Chiyo; Yoneda, Misako; Yonekura, Yohei; Zhang, Peter G.; Zucchelli, Silvia; Abugessaisa, Imad; Arner, Erik; Harshbarger, Jayson; Kondo, Atsushi; Lassmann, Timo; Lizio, Marina; Sahin, Serkan; Sengstag, Thierry; Severin, Jessica; Shimoji, Hisashi; Suzuki, Masanori; Suzuki, Harukazu; Kawai, Jun; Kondo, Naoto; Itoh, Masayoshi; Daub, Carsten O.; Kasukawa, Takeya; Kawaji, Hideya; Carninci, Piero; Forrest, Alistair R.R.; Hayashizaki, Yoshihide

2017-01-01

In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at single base-pair resolution and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, cell lines, and time series samples during cell activation and development, were subjected to a uniform pipeline of CAGE data production. The analysis pipeline started by measuring RNA extracts to assess their quality, and continued to CAGE library production by using a robotic or a manual workflow, single-molecule sequencing, and computational processing to generate frequencies of transcription initiation. The resulting data represent the consequence of transcriptional regulation in each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE profiles, approximately 200,000 and 150,000 peaks for the human and mouse genomes, respectively, were identified and annotated to provide precise locations of known promoters as well as novel ones, and to quantify their activities. PMID:28850106
Involvement of Atm and Trp53 in neural cell loss due to Terf2 inactivation during mouse brain development.

PubMed

Kim, Jusik; Choi, Inseo; Lee, Youngsoo

2017-11-01

Maintenance of genomic integrity is one of the critical features for proper neurodevelopment and inhibition of neurological diseases. The signals from both ATM and ATR to TP53 are well-known mechanisms to remove neural cells with DNA damage during neurogenesis. Here we examined the involvement of Atm and Atr in genomic instability due to Terf2 inactivation during mouse brain development. Selective inactivation of Terf2 in neural progenitors induced apoptosis, resulting in a complete loss of the brain structure. This neural loss was rescued partially in both Atm and Trp53 deficiency, but not in an Atr-deficient background in the mouse. Atm inactivation resulted in incomplete brain structures, whereas p53 deficiency led to the formation of multinucleated giant neural cells and the disruption of the brain structure. These giant neural cells disappeared in Lig4 deficiency. These data demonstrate ATM and TP53 are important for the maintenance of telomere homeostasis and the surveillance of telomere dysfunction during neurogenesis.
Molecular determinants of nucleosome retention at CpG-rich sequences in mouse spermatozoa.

PubMed

Erkek, Serap; Hisano, Mizue; Liang, Ching-Yeu; Gill, Mark; Murr, Rabih; Dieker, Jürgen; Schübeler, Dirk; van der Vlag, Johan; Stadler, Michael B; Peters, Antoine H F M

2013-07-01

In mammalian spermatozoa, most but not all of the genome is densely packaged by protamines. Here we reveal the molecular logic underlying the retention of nucleosomes in mouse spermatozoa, which contain only 1% residual histones. We observe high enrichment throughout the genome of nucleosomes at CpG-rich sequences that lack DNA methylation. Residual nucleosomes are largely composed of the histone H3.3 variant and are trimethylated at Lys4 of histone H3 (H3K4me3). Canonical H3.1 and H3.2 histones are also enriched at CpG-rich promoters marked by Polycomb-mediated H3K27me3, a modification predictive of gene repression in preimplantation embryos. Histone variant-specific nucleosome retention in sperm is strongly associated with nucleosome turnover in round spermatids. Our data show evolutionary conservation of the basic principles of nucleosome retention in mouse and human sperm, supporting a model of epigenetic inheritance by nucleosomes between generations.
Complete Genome Sequence of Escherichia coli Strain M8, Isolated from ob/ob Mice

PubMed Central

Siddharth, Jay; Membrez, Mathieu; Chakrabarti, Anirikh; Betrisey, Bertrand; Chou, Chieh Jason

2017-01-01

ABSTRACT Escherichia coli is one of the common inhabitants of the mammalian gastrointestinal tract. We isolated a strain from an ob/ob mouse and performed whole-genome sequencing, which yielded a chromosome of ~5.1 Mb and three plasmids of ~160 kb, ~6 kb, and ~4 kb. PMID:28572322
Genome wide analysis of DNA methylation and gene expression changes in the mouse lung following subchronic arsenate exposure

EPA Science Inventory

Alterations in DNA methylation have been proposed as a mechanism for the complex toxicological effects of arsenic. In this study, whole genome DNA methylation and gene expression changes were evaluated in lungs from female mice exposed for 90 days to 50 ppm arsenate (As) in drink...
Mechanisms for c-myc Induced Mouse Mammary Gland Carcinogenesis and for the Synergistic Role of TGF(alpha) in the Process

DTIC Science & Technology

2001-07-01

1997 Glucose deprivation-induced cytotoxicity in drug resistant genomic status of the c-myc locus in infiltrating ductal human breast carcinoma MCF-7...AD Award Number: DAMD17-00-1-0270 TITLE: Mechanisms for c-myc Induced Mouse Mammary Gland Carcinogenesis and for the Synergistic Role of TGF(alpha) in the...AND SUBTITLE 5. FUNDING NUMBERS Mechanisms for c-myc Induced Mouse Mammary Gland DAMD17-00-1-0270 Carcinogenesis and for the Synergistic Role of TGF(alpha) in
A combined reference panel from the 1000 Genomes and UK10K projects improved rare variant imputation in European and Chinese samples

PubMed Central

Chou, Wen-Chi; Zheng, Hou-Feng; Cheng, Chia-Ho; Yan, Han; Wang, Li; Han, Fang; Richards, J. Brent; Karasik, David; Kiel, Douglas P.; Hsu, Yi-Hsiang

2016-01-01

Imputation using the 1000 Genomes haplotype reference panel has been widely adopted to estimate genotypes in genome-wide association studies. To evaluate imputation quality with a larger reference panel and a reference panel composed of different ethnic populations, we conducted imputations in the Framingham Heart Study and the North Chinese Study using a combined reference panel from the 1000 Genomes (N = 1,092) and UK10K (N = 3,781) projects. For rare variants with 0.01% < MAF ≤ 0.5%, imputation in the Framingham Heart Study with the combined reference panel increased well-imputed genotypes (with imputation quality score ≥0.4) from 62.9% to 76.1% when compared to imputation with the 1000 Genomes panel alone. For the North Chinese samples, imputation of rare variants with 0.01% < MAF ≤ 0.5% with the combined reference panel increased well-imputed genotypes from 49.8% to 61.8%. The predominant European ancestry of the UK10K and the combined reference panels may explain why there was less of an increase in imputation success in the North Chinese samples. Our results underscore the importance and potential of larger reference panels to impute rare variants, while recognizing that increasing ethnic-specific variants in reference panels may result in better imputation for genotypes in some ethnic groups. PMID:28004816
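The "well-imputed" criterion in this abstract (quality score ≥0.4 within the bin 0.01% < MAF ≤ 0.5%) can be sketched as a simple filter. The record layout and function name below are hypothetical and the toy variants are made up; only the thresholds and the four reported percentages come from the abstract.

```python
def well_imputed_fraction(variants, maf_low=0.0001, maf_high=0.005, min_quality=0.4):
    """Share of variants with maf_low < MAF <= maf_high whose quality >= min_quality."""
    in_bin = [v for v in variants if maf_low < v["maf"] <= maf_high]
    if not in_bin:
        return 0.0
    return sum(v["quality"] >= min_quality for v in in_bin) / len(in_bin)

# Toy variants, made up for illustration
toy = [
    {"maf": 0.001, "quality": 0.55},
    {"maf": 0.004, "quality": 0.30},
    {"maf": 0.0002, "quality": 0.40},
    {"maf": 0.02, "quality": 0.95},  # outside the rare-variant bin
]
print(well_imputed_fraction(toy))  # 2 of the 3 in-bin variants pass

# Percentage-point gains implied by the figures reported in the abstract
gain_framingham = round(76.1 - 62.9, 1)
gain_north_chinese = round(61.8 - 49.8, 1)
print(gain_framingham, gain_north_chinese)
```

The difference between the two gains is consistent with the abstract's remark that the increase was smaller in the North Chinese samples.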
Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

PubMed Central

Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C.P.G.M.; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R.; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, Gaurav; Olmos, Jose M; Zarrabeitia, Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce BJ; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia MT; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent

2016-01-01

SUMMARY The extent to which low-frequency (minor allele frequency [MAF] between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is largely unknown. Bone mineral density (BMD) is highly heritable, is a major predictor of osteoporotic fractures and has been previously associated with common genetic variants [1–8], and rare, population-specific, coding variants [9]. Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size 4-fold larger than the mean of previously reported common variants for lumbar spine BMD [8] (rs11692564[T], MAF = 1.7%, replication effect size = +0.20 standard deviations [SD], P_meta = 2 × 10^−14), which was also associated with a decreased risk of fracture (OR = 0.85; P = 2 × 10^−11; n_cases = 98,742 and n_controls = 409,511). Using an En1Cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, likely as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817[T], MAF = 1.1%, replication effect size = +0.39 SD, P_meta = 1 × 10^−11). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population. PMID:26367794
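The frequency classes defined at the start of this summary (rare: MAF ≤ 1%; low-frequency: MAF between 1% and 5%) can be written as a small classifier. Treating anything above 5% as "common", and assigning exactly 5% to the low-frequency class, are assumptions of this sketch; the function name is illustrative.

```python
def frequency_class(maf):
    """Classify a minor allele frequency given as a proportion (0 to 0.5)."""
    if not 0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf <= 0.01:
        return "rare"
    if maf <= 0.05:  # upper boundary handling is an assumption
        return "low-frequency"
    return "common"  # label assumed; the summary only defines the two lower classes

# The two variants highlighted in the summary
print(frequency_class(0.017))  # rs11692564[T], MAF = 1.7%
print(frequency_class(0.011))  # rs148771817[T], MAF = 1.1%
```

Both reported variants fall in the low-frequency class, matching how the summary describes them.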
Genomic background-related activation of microglia and reduced β-amyloidosis in a mouse model of Alzheimer's disease.

PubMed

Fröhlich, Christina; Paarmann, Kristin; Steffen, Johannes; Stenzel, Jan; Krohn, Markus; Heinze, Hans-Jochen; Pahnke, Jens

2013-03-01

Alzheimer's disease (AD) is by far the most common neurodegenerative disease. AD is histologically characterized not only by extracellular senile plaques and vascular deposits consisting of β-amyloid (Aβ) but also by accompanying neuroinflammatory processes involving the brain's microglia. The importance of the microglia is still debated, with current discussion favoring a protective function in disease progression. Recent findings by different research groups highlighted the importance of strain-specific and mitochondria-specific genomic variations in mouse models of cerebral β-amyloidosis. Here, we summarize our previously presented data and add new results that draw attention to strain-specific genomic alterations in the setting of APP transgenes. We present data from APP-transgenic mice in the commonly used C57BL/6J and FVB/N genomic backgrounds and show a direct influence on the kinetics of Aβ deposition and the activity of resident microglia. Plaque size, plaque deposition rate and the total amount of Aβ are higher in C57BL/6J mice than in the FVB/N genomic background, which can be explained at least partially by reduced microglia activity towards amyloid deposits in the C57BL/6J strain.
A Survey for Novel Imprinted Genes in the Mouse Placenta by mRNA-seq

PubMed Central

Wang, Xu; Soloway, Paul D.; Clark, Andrew G.

2011-01-01

Many questions about the regulation, functional specialization, computational prediction, and evolution of genomic imprinting would be better addressed by having an exhaustive genome-wide catalog of genes that display parent-of-origin differential expression. As a first-pass scan for novel imprinted genes, we performed mRNA-seq experiments on embryonic day 17.5 (E17.5) mouse placenta cDNA samples from reciprocal cross F1 progeny of AKR and PWD mouse strains and quantified the allele-specific expression and the degree of parent-of-origin allelic imbalance. We confirmed the imprinting status of 23 known imprinted genes in the placenta and found that 12 genes reported previously to be imprinted in other tissues are also imprinted in mouse placenta. Through a well-replicated design using an orthogonal allelic-expression technology, we verified 5 novel imprinted genes that were not previously known to be imprinted in mouse (Pde10, Phf17, Phactr2, Zfp64, and Htra3). Our data suggest that most of the strongly imprinted genes have already been identified, at least in the placenta, and that evidence supports perhaps 100 additional weakly imprinted genes. Despite previous appearance that the placenta tends to display an excess of maternally expressed imprinted genes, with the addition of our validated set of placenta-imprinted genes, this maternal bias has disappeared. PMID:21705755
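The logic of a reciprocal-cross design can be sketched as follows: a gene counts as imprinted only if the same parental allele dominates in both cross directions, whereas a strain-specific expression difference flips between directions. The read counts, the 0.7 cutoff and the function names are hypothetical and are not the thresholds or statistics used in the study.

```python
def maternal_fraction(maternal_reads, paternal_reads):
    """Fraction of allele-specific reads carrying the maternal allele."""
    return maternal_reads / (maternal_reads + paternal_reads)

def classify(cross1, cross2, threshold=0.7):
    """Each cross is (maternal_reads, paternal_reads) from one reciprocal F1."""
    f1 = maternal_fraction(*cross1)
    f2 = maternal_fraction(*cross2)
    if f1 >= threshold and f2 >= threshold:
        return "maternally expressed"
    if f1 <= 1 - threshold and f2 <= 1 - threshold:
        return "paternally expressed"
    return "no consistent parent-of-origin bias"

# Toy counts, made up for illustration
print(classify((90, 10), (85, 15)))  # maternal allele dominates in both crosses
print(classify((10, 90), (5, 95)))   # paternal allele dominates in both crosses
print(classify((90, 10), (10, 90)))  # bias follows the strain, not the parent
```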
Pharmacokinetic and Genomic Effects of Arsenite in Drinking Water on Mouse Lung in a 30-Day Exposure

PubMed Central

Chilakapati, Jaya; Wallace, Kathleen; Hernandez-Zavala, Araceli; Moore, Tanya; Ren, Hongzu

2015-01-01

The 2 objectives of this subchronic study were to determine the arsenite drinking water exposure dependent increases in female C3H mouse liver and lung tissue arsenicals and to characterize the dose response (to 0, 0.05, 0.25, 1, 10, and 85 ppm arsenite in drinking water for 30 days and a purified AIN-93M diet) for genomic mouse lung expression patterns. Mouse lungs were analyzed for inorganic arsenic, monomethylated, and dimethylated arsenicals by hydride generation atomic absorption spectroscopy. The total lung mean arsenical levels were 1.4, 22.5, 30.1, 50.9, 105.3, and 316.4 ng/g lung tissue after 0, 0.05, 0.25, 1, 10, and 85 ppm, respectively. At 85 ppm, the total mean lung arsenical levels increased 14-fold and 131-fold when compared to either the lowest noncontrol dose (0.05 ppm) or the control dose, respectively. We found that arsenic exposure elicited minimal numbers of differentially expressed genes (DEGs; 77, 38, 90, 87, and 87 DEGs) after 0.05, 0.25, 1, 10, and 85 ppm, respectively, which were associated with cardiovascular disease, development, differentiation, apoptosis, proliferation, and stress response. After 30 days of arsenite exposure, this study showed monotonic increases in mouse lung arsenical (total arsenic and dimethylarsinic acid) concentrations but no clear dose-related increases in DEG numbers. PMID:26674514
Pharmacokinetic and Genomic Effects of Arsenite in Drinking Water on Mouse Lung in a 30-Day Exposure.

PubMed

Chilakapati, Jaya; Wallace, Kathleen; Hernandez-Zavala, Araceli; Moore, Tanya; Ren, Hongzu; Kitchin, Kirk T

2015-01-01

The 2 objectives of this subchronic study were to determine the arsenite drinking water exposure dependent increases in female C3H mouse liver and lung tissue arsenicals and to characterize the dose response (to 0, 0.05, 0.25, 1, 10, and 85 ppm arsenite in drinking water for 30 days and a purified AIN-93M diet) for genomic mouse lung expression patterns. Mouse lungs were analyzed for inorganic arsenic, monomethylated, and dimethylated arsenicals by hydride generation atomic absorption spectroscopy. The total lung mean arsenical levels were 1.4, 22.5, 30.1, 50.9, 105.3, and 316.4 ng/g lung tissue after 0, 0.05, 0.25, 1, 10, and 85 ppm, respectively. At 85 ppm, the total mean lung arsenical levels increased 14-fold and 131-fold when compared to either the lowest noncontrol dose (0.05 ppm) or the control dose, respectively. We found that arsenic exposure elicited minimal numbers of differentially expressed genes (DEGs; 77, 38, 90, 87, and 87 DEGs) after 0.05, 0.25, 1, 10, and 85 ppm, respectively, which were associated with cardiovascular disease, development, differentiation, apoptosis, proliferation, and stress response. After 30 days of arsenite exposure, this study showed monotonic increases in mouse lung arsenical (total arsenic and dimethylarsinic acid) concentrations but no clear dose-related increases in DEG numbers.
Progress toward a low budget reference grade genome assembly

USDA-ARS's Scientific Manuscript database

Reference quality de novo genome assemblies were once solely the domain of large, well-funded genome projects. While next-generation short read technology removed some of the cost barriers, accurate chromosome-scale assembly remains a real challenge. Here we present efforts to de novo assemble the...
Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology.

PubMed

Otto, Thomas D; Sanders, Mandy; Berriman, Matthew; Newbold, Chris

2010-07-15

The accuracy of reference genomes is important for downstream analysis but a low error rate requires expensive manual interrogation of the sequence. Here, we describe a novel algorithm (Iterative Correction of Reference Nucleotides) that iteratively aligns deep coverage of short sequencing reads to correct errors in reference genome sequences and evaluate their accuracy. Using Plasmodium falciparum (81% A + T content) as an extreme example, we show that the algorithm is highly accurate and corrects over 2000 errors in the reference sequence. We give examples of its application to numerous other eukaryotic and prokaryotic genomes and suggest additional applications. The software is available at http://icorn.sourceforge.net
Allele-specific control of replication timing and genome organization during development.

PubMed

Rivera-Mulia, Juan Carlos; Dimond, Andrew; Vera, Daniel; Trevilla-Garcia, Claudia; Sasaki, Takayo; Zimmerman, Jared; Dupont, Catherine; Gribnau, Joost; Fraser, Peter; Gilbert, David M

2018-05-07

DNA replication occurs in a defined temporal order known as the replication-timing (RT) program. RT is regulated during development in discrete chromosomal units, coordinated with transcriptional activity and 3D genome organization. Here, we derived distinct cell types from F1 hybrid musculus X castaneus mouse crosses and exploited the high single nucleotide polymorphism (SNP) density to characterize allelic differences in RT (Repli-seq), genome organization (Hi-C and promoter-capture Hi-C), gene expression (total nuclear RNA-seq) and chromatin accessibility (ATAC-seq). We also present HARP: a new computational tool for sorting SNPs in phased genomes to efficiently measure allele-specific genome-wide data. Analysis of six different hybrid mESC clones with different genomes (C57BL/6, 129/sv and CAST/Ei), parental configurations and gender revealed significant RT asynchrony between alleles across ~12% of the autosomal genome linked to sub-species genomes but not to parental origin, growth conditions or gender. RT asynchrony in mESCs strongly correlated with changes in Hi-C compartments between alleles but not SNP density, gene expression, imprinting or chromatin accessibility. We then tracked mESC RT asynchronous regions during development by analyzing differentiated cell types including extraembryonic endoderm stem (XEN) cells, 4 male and female primary mouse embryonic fibroblasts (MEFs) and neural precursor cells (NPCs) differentiated in vitro from mESCs with opposite parental configurations. We found that RT asynchrony and allelic discordance in Hi-C compartments seen in mESCs was largely lost in all differentiated cell types, coordinated with a more uniform Hi-C compartment arrangement, suggesting that genome organization of homologues converges to similar folding patterns during cell fate commitment. Published by Cold Spring Harbor Laboratory Press.
Decomposing genomic variance using information from GWA, GWE and eQTL analysis.

PubMed

Ehsani, A; Janss, L; Pomp, D; Sørensen, P

2016-04-01

A commonly used procedure in genome-wide association (GWA), genome-wide expression (GWE) and expression quantitative trait locus (eQTL) analyses is based on a bottom-up experimental approach that attempts to individually associate molecular variants with complex traits. Top-down modeling of the entire set of genomic data and partitioning of the overall variance into subcomponents may provide further insight into the genetic basis of complex traits. To test this approach, we performed a whole-genome variance components analysis and partitioned the genomic variance using information from GWA, GWE and eQTL analyses of growth-related traits in a mouse F2 population. We characterized the mouse trait genetic architecture by ordering single nucleotide polymorphisms (SNPs) based on their P-values and studying the areas under the curve (AUCs). The observed traits were found to have a genomic variance profile that differed significantly from that expected of a trait under an infinitesimal model. This situation was particularly true for both body weight and body fat, for which the AUCs were much higher compared with that of glucose. In addition, SNPs with a high degree of trait-specific regulatory potential (SNPs associated with subset of transcripts that significantly associated with a specific trait) explained a larger proportion of the genomic variance than did SNPs with high overall regulatory potential (SNPs associated with transcripts using traditional eQTL analysis). We introduced AUC measures of genomic variance profiles that can be used to quantify relative importance of SNPs as well as degree of deviation of a trait's inheritance from an infinitesimal model. The shape of the curve aids global understanding of traits: The steeper the left-hand side of the curve, the fewer the number of SNPs controlling most of the phenotypic variance. © 2015 Stichting International Foundation for Animal Genetics.
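The AUC idea in the abstract above can be illustrated with a minimal sketch. It assumes the curve plots the cumulative share of genomic variance against the fraction of SNPs included, with SNPs ordered by ascending P-value; the abstract does not spell out the exact construction, so the function name `variance_profile_auc` and its inputs are hypothetical. Under this assumption, SNPs with equal contributions (an infinitesimal-like profile) give an AUC of 0.5, and a steeper left-hand side gives a larger AUC.

```python
def variance_profile_auc(p_values, variances):
    """Trapezoidal area under the cumulative-variance curve.

    SNPs are ordered by ascending P-value (assumed ordering); the x-axis is
    the fraction of SNPs included and the y-axis is the cumulative share of
    the total variance attributed to those SNPs.
    """
    if len(p_values) != len(variances) or not p_values:
        raise ValueError("p_values and variances must be non-empty and equal length")
    total = sum(variances)
    if total <= 0:
        raise ValueError("total variance must be positive")

    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    step = 1.0 / len(order)          # width of each x-axis interval
    running, previous, auc = 0.0, 0.0, 0.0
    for i in order:
        running += variances[i]
        current = running / total    # cumulative variance share so far
        auc += 0.5 * (previous + current) * step
        previous = current
    return auc


# Hypothetical toy inputs: four SNPs with equal vs. concentrated variance.
flat = variance_profile_auc([0.01, 0.02, 0.03, 0.04], [1.0, 1.0, 1.0, 1.0])
steep = variance_profile_auc([0.01, 0.02, 0.03, 0.04], [1.0, 0.0, 0.0, 0.0])
print(flat, steep)
```

In this toy setting the concentrated profile yields the larger area, matching the abstract's remark that a steeper left-hand side means fewer SNPs account for most of the variance.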
Discovery of common sequences absent in the human reference genome using pooled samples from next generation sequencing.

PubMed

Liu, Yu; Koyutürk, Mehmet; Maxwell, Sean; Xiang, Min; Veigl, Martina; Cooper, Richard S; Tayo, Bamidele O; Li, Li; LaFramboise, Thomas; Wang, Zhenghe; Zhu, Xiaofeng; Chance, Mark R

2014-08-16

Sequences up to several megabases in length have been found to be present in individual genomes but absent in the human reference genome. These sequences may be common in populations, and their absence in the reference genome may indicate rare variants in the genomes of individuals who served as donors for the human genome project. As the reference genome is used in probe design for microarray technology and mapping short reads in next generation sequencing (NGS), this missing sequence could be a source of bias in functional genomic studies and variant analysis. One End Anchor (OEA) and/or orphan reads from paired-end sequencing have been used to identify novel sequences that are absent in the reference genome. However, no study has investigated the distribution, evolution and functionality of those sequences in human populations. To systematically identify and study the missing common sequences (micSeqs), we extended the previous method by pooling OEA reads from a large number of individuals and applying strict filtering methods to remove false sequences. The pipeline was applied to data from phase 1 of the 1000 Genomes Project. We identified 309 micSeqs that are present in at least 1% of the human population, but absent in the reference genome. We confirmed 76% of these 309 micSeqs by comparison to other primate genomes, individual human genomes, and gene expression data. Furthermore, we randomly selected fifteen micSeqs and confirmed their presence using PCR validation in 38 additional individuals. Functional analysis using published RNA-seq and ChIP-seq data showed that eleven micSeqs are highly expressed in human brain and three micSeqs contain transcription factor (TF) binding regions, suggesting they are functional elements.
In addition, the identified micSeqs are absent in non-primates and show dynamic acquisition during primate evolution culminating with most micSeqs being present in Africans, suggesting some micSeqs may be important sources of human diversity. 76% of micSeqs were confirmed by a comparative genomics approach. Fourteen micSeqs are expressed in human brain or contain TF binding regions. Some micSeqs are primate-specific, conserved and may play a role in the evolution of primates.
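As a rough illustration of how One End Anchor pairs can be recognized in aligned paired-end data, the sketch below classifies a read by its SAM FLAG field. The bit values (0x1 paired, 0x4 read unmapped, 0x8 mate unmapped) come from the SAM format; the function name `classify_pair` and the category labels are hypothetical, and this is not the pipeline used in the study above.

```python
PAIRED = 0x1         # template has multiple segments (paired-end read)
READ_UNMAPPED = 0x4  # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate of this read is unmapped


def classify_pair(flag):
    """Classify one alignment record by the mapping state of the read and its mate."""
    if not flag & PAIRED:
        return "unpaired"
    read_unmapped = bool(flag & READ_UNMAPPED)
    mate_unmapped = bool(flag & MATE_UNMAPPED)
    if not read_unmapped and mate_unmapped:
        return "oea_anchor"          # mapped read whose mate did not map
    if read_unmapped and not mate_unmapped:
        return "oea_unmapped_mate"   # unmapped read whose mate is anchored
    if read_unmapped and mate_unmapped:
        return "both_unmapped"
    return "both_mapped"


# Example FLAG values as they might appear in column 2 of a SAM file.
for flag in (73, 133, 77, 99):
    print(flag, classify_pair(flag))
```

Under this scheme the unmapped mates of anchored reads are the candidates for sequence missing from the reference; pooling them across individuals and filtering would be separate steps not shown here.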
Stewardship of the Maize B73 reference genome assembly

USDA-ARS's Scientific Manuscript database

The release of version 4 of the B73 reference genome assembly is imminent. However, continued improvement of the assembly is likely to fall to the maize research community. Toward this end, and recognizing the importance of an accurate and well-curated reference genome, MaizeGDB, Gramene, and the Ge...
The Fecal Viral Flora of Wild Rodents

PubMed Central

Phan, Tung G.; Kapusinszky, Beatrix; Wang, Chunlin; Rose, Robert K.; Lipton, Howard L.; Delwart, Eric L.

2011-01-01

The frequent interactions of rodents with humans make them a common source of zoonotic infections. To obtain an initial unbiased measure of the viral diversity in the enteric tract of wild rodents we sequenced partially purified, randomly amplified viral RNA and DNA in the feces of 105 wild rodents (mouse, vole, and rat) collected in California and Virginia. We identified in decreasing frequency sequences related to the mammalian viruses families Circoviridae, Picobirnaviridae, Picornaviridae, Astroviridae, Parvoviridae, Papillomaviridae, Adenoviridae, and Coronaviridae. Seventeen small circular DNA genomes containing one or two replicase genes distantly related to the Circoviridae representing several potentially new viral families were characterized. In the Picornaviridae family two new candidate genera as well as a close genetic relative of the human pathogen Aichi virus were characterized. Fragments of the first mouse sapelovirus and picobirnaviruses were identified and the first murine astrovirus genome was characterized. A mouse papillomavirus genome and fragments of a novel adenovirus and adenovirus-associated virus were also sequenced. The next largest fraction of the rodent fecal virome was related to insect viruses of the Densoviridae, Iridoviridae, Polydnaviridae, Dicistroviriade, Bromoviridae, and Virgaviridae families followed by plant virus-related sequences in the Nanoviridae, Geminiviridae, Phycodnaviridae, Secoviridae, Partitiviridae, Tymoviridae, Alphaflexiviridae, and Tombusviridae families reflecting the largely insect and plant rodent diet. Phylogenetic analyses of full and partial viral genomes therefore revealed many previously unreported viral species, genera, and families. The close genetic similarities noted between some rodent and human viruses might reflect past zoonoses. This study increases our understanding of the viral diversity in wild rodents and highlights the large number of still uncharacterized viruses in mammals. PMID:21909269
Rapid bursts of androgen-binding protein (Abp) gene duplication occurred independently in diverse mammals

PubMed Central

2008-01-01

Background: The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) α, β and γ subunits. Further investigation of 14 α-like (Abpa) and 13 β- or γ-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Results: Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. Conclusion: We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification. PMID:18269759

Rapid bursts of androgen-binding protein (Abp) gene duplication occurred independently in diverse mammals.

PubMed

Laukaitis, Christina M; Heger, Andreas; Blakley, Tyler D; Munclinger, Pavel; Ponting, Chris P; Karn, Robert C

2008-02-12

The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) alpha, beta and gamma subunits. Further investigation of 14 alpha-like (Abpa) and 13 beta- or gamma-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification.
The Role of microRNA miR-101 in Prostate Cancer Progression

DTIC Science & Technology

2012-09-01

genome-wide mapping of PcG binding in human fibroblasts, human ES cells, mouse ES cells, and Drosophila 37-41. All of the studies demonstrated that...development. Mamm Genome 2002; 13(9): 493-503. 15. Simon J, Chiang A, Bender W, Shimell MJ, O’Connor M. Elements of the Drosophila bithorax complex... sequencing analysis of the miR-203 genomic region revealed cancer-specific DNA methylation in a region proximal to miR-203 in prostate cancer tissues
A reference genetic map of C. clementina hort. ex Tan.; citrus evolution inferences from comparative mapping

PubMed Central

2012-01-01

Background: Most modern citrus cultivars have an interspecific origin. As a foundational step towards deciphering the interspecific genome structures, a reference whole genome sequence was produced by the International Citrus Genome Consortium from a haploid derived from Clementine mandarin. The availability of a saturated genetic map of Clementine was identified as an essential prerequisite to assist the whole genome sequence assembly. Clementine is believed to be a ‘Mediterranean’ mandarin × sweet orange hybrid, and sweet orange likely arose from interspecific hybridizations between mandarin and pummelo gene pools. The primary goals of the present study were to establish a Clementine reference map using codominant markers, and to perform comparative mapping of pummelo, sweet orange, and Clementine. Results: Five parental genetic maps were established from three segregating populations, which were genotyped with Single Nucleotide Polymorphism (SNP), Simple Sequence Repeats (SSR) and Insertion-Deletion (Indel) markers. An initial medium density reference map (961 markers for 1084.1 cM) of the Clementine was established by combining male and female Clementine segregation data. This Clementine map was compared with two pummelo maps and a sweet orange map. The linear order of markers was highly conserved in the different species. However, significant differences in map size were observed, which suggests a variation in the recombination rates. Skewed segregations were much higher in the male than female Clementine mapping data. The mapping data confirmed that Clementine arose from hybridization between ‘Mediterranean’ mandarin and sweet orange. The results identified nine recombination break points for the sweet orange gamete that contributed to the Clementine genome. Conclusions: A reference genetic map of citrus, used to facilitate the chromosome assembly of the first citrus reference genome sequence, was established.
The high conservation of marker order observed at the interspecific level should allow reasonable inferences of most citrus genome sequences by mapping next-generation sequencing (NGS) data in the reference genome sequence. The genome of the haploid Clementine used to establish the citrus reference genome sequence appears to have been inherited primarily from the ‘Mediterranean’ mandarin. The high frequency of skewed allelic segregations in the male Clementine data underlines the probable extent of deviation from Mendelian segregation for characters controlled by heterozygous loci in male parents. PMID:23126659
The 4D nucleome project.

PubMed

Dekker, Job; Belmont, Andrew S; Guttman, Mitchell; Leshyk, Victor O; Lis, John T; Lomvardas, Stavros; Mirny, Leonid A; O'Shea, Clodagh C; Park, Peter J; Ren, Bing; Politz, Joan C Ritland; Shendure, Jay; Zhong, Sheng

2017-09-13

The 4D Nucleome Network aims to develop and apply approaches to map the structure and dynamics of the human and mouse genomes in space and time with the goal of gaining deeper mechanistic insights into how the nucleus is organized and functions. The project will develop and benchmark experimental and computational approaches for measuring genome conformation and nuclear organization, and investigate how these contribute to gene regulation and other genome functions. Validated experimental technologies will be combined with biophysical approaches to generate quantitative models of spatial genome organization in different biological states, both in cell populations and in single cells.
The 4D Nucleome Project

PubMed Central

Dekker, Job; Belmont, Andrew S.; Guttman, Mitchell; Leshyk, Victor O.; Lis, John T.; Lomvardas, Stavros; Mirny, Leonid A.; O’Shea, Clodagh C.; Park, Peter J.; Ren, Bing; Ritland Politz, Joan C.; Shendure, Jay; Zhong, Sheng

2017-01-01

Preface: The 4D Nucleome Network aims to develop and apply approaches to map the structure and dynamics of the human and mouse genomes in space and time with the goal of gaining deeper mechanistic understanding of how the nucleus is organized and functions. The project will develop and benchmark experimental and computational approaches for measuring genome conformation and nuclear organization, and investigate how these contribute to gene regulation and other genome functions. Validated experimental approaches will be combined with biophysical modeling to generate quantitative models of spatial genome organization in different biological states, both in cell populations and in single cells. PMID:28905911
Identification of transcriptional regulators in the mouse immune system

PubMed Central

Jojic, Vladimir; Shay, Tal; Sylvia, Katelyn; Zuk, Or; Sun, Xin; Kang, Joonsoo; Regev, Aviv; Koller, Daphne

2013-01-01

The differentiation of hematopoietic stem cells into immune cells has been extensively studied in mammals, but the transcriptional circuitry controlling it is still only partially understood. Here, the Immunological Genome Project gene expression profiles across mouse immune lineages allowed us to systematically analyze these circuits. Using a computational algorithm called Ontogenet, we uncovered differentiation-stage specific regulators of mouse hematopoiesis, identifying many known hematopoietic regulators, and 175 new candidate regulators, their target genes, and the cell types in which they act. Among the novel regulators, we highlight the role of ETV5 in γδ T cell differentiation. Since the transcriptional program of human and mouse cells is highly conserved, it is likely that many lessons learned from the mouse model apply to humans. PMID:23624555
Integration of mouse and human genome-wide association data identifies KCNIP4 as an asthma gene.

PubMed

Himes, Blanca E; Sheppard, Keith; Berndt, Annerose; Leme, Adriana S; Myers, Rachel A; Gignoux, Christopher R; Levin, Albert M; Gauderman, W James; Yang, James J; Mathias, Rasika A; Romieu, Isabelle; Torgerson, Dara G; Roth, Lindsey A; Huntsman, Scott; Eng, Celeste; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley J; Lemanske, Robert F; Zeiger, Robert S; Strunk, Robert C; Martinez, Fernando D; Boushey, Homer; Chinchilli, Vernon M; Israel, Elliot; Mauger, David; Koppelman, Gerard H; Postma, Dirkje S; Nieuwenhuis, Maartje A E; Vonk, Judith M; Lima, John J; Irvin, Charles G; Peters, Stephen P; Kubo, Michiaki; Tamari, Mayumi; Nakamura, Yusuke; Litonjua, Augusto A; Tantisira, Kelan G; Raby, Benjamin A; Bleecker, Eugene R; Meyers, Deborah A; London, Stephanie J; Barnes, Kathleen C; Gilliland, Frank D; Williams, L Keoki; Burchard, Esteban G; Nicolae, Dan L; Ober, Carole; DeMeo, Dawn L; Silverman, Edwin K; Paigen, Beverly; Churchill, Gary; Shapiro, Steve D; Weiss, Scott T

2013-01-01

Asthma is a common chronic respiratory disease characterized by airway hyperresponsiveness (AHR). The genetics of asthma have been widely studied in mouse and human, and homologous genomic regions have been associated with mouse AHR and human asthma-related phenotypes. Our goal was to identify asthma-related genes by integrating AHR associations in mouse with human genome-wide association study (GWAS) data. We used Efficient Mixed Model Association (EMMA) analysis to conduct a GWAS of baseline AHR measures from males and females of 31 mouse strains. Genes near or containing SNPs with EMMA p-values <0.001 were selected for further study in human GWAS. The results of the previously reported EVE consortium asthma GWAS meta-analysis consisting of 12,958 diverse North American subjects from 9 study centers were used to select a subset of homologous genes with evidence of association with asthma in humans. Following validation attempts in three human asthma GWAS (i.e., Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG) and two human AHR GWAS (i.e., SHARP, DAG), the Kv channel interacting protein 4 (KCNIP4) gene was identified as nominally associated with both asthma and AHR at both the gene and SNP levels. In EVE, the smallest KCNIP4 association P-value was at rs6833065 (2.9e-04), while the strongest associations for Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG were 1.5e-03, 1.0e-03, 3.1e-03 at rs7664617, rs4697177, rs4696975, respectively. At a SNP level, the strongest association across all asthma GWAS was at rs4697177 (P-value 1.1e-04). The smallest P-values for association with AHR were 2.3e-03 at rs11947661 in SHARP and 2.1e-03 at rs402802 in DAG. Functional studies are required to validate the potential involvement of KCNIP4 in modulating asthma susceptibility and/or AHR. Our results suggest that a useful approach to identify genes associated with human asthma is to leverage mouse AHR association data.
Accuracy of estimation of genomic breeding values in pigs using low-density genotypes and imputation.

PubMed

Badke, Yvonne M; Bates, Ronald O; Ernst, Catherine W; Fix, Justin; Steibel, Juan P

2014-04-16

Genomic selection has the potential to increase genetic progress. Genotype imputation of high-density single-nucleotide polymorphism (SNP) genotypes can improve the cost efficiency of genomic breeding value (GEBV) prediction for pig breeding. Consequently, the objectives of this work were to: (1) estimate accuracy of genomic evaluation and GEBV for three traits in a Yorkshire population and (2) quantify the loss of accuracy of genomic evaluation and GEBV when genotypes were imputed under two scenarios: a high-cost, high-accuracy scenario in which only selection candidates were imputed from a low-density platform and a low-cost, low-accuracy scenario in which all animals were imputed using a small reference panel of haplotypes. Phenotypes and genotypes obtained with the PorcineSNP60 BeadChip were available for 983 Yorkshire boars. Genotypes of selection candidates were masked and imputed using tagSNP in the GeneSeek Genomic Profiler (10K). Imputation was performed with BEAGLE using 128 or 1800 haplotypes as reference panels. GEBV were obtained through an animal-centric ridge regression model using de-regressed breeding values as response variables. Accuracy of genomic evaluation was estimated as the correlation between estimated breeding values and GEBV in a 10-fold cross validation design. Accuracy of genomic evaluation using observed genotypes was high for all traits (0.65-0.68). Using genotypes imputed from a large reference panel (accuracy: R(2) = 0.95) for genomic evaluation did not significantly decrease accuracy, whereas a scenario with genotypes imputed from a small reference panel (R(2) = 0.88) did show a significant decrease in accuracy. Genomic evaluation based on imputed genotypes in selection candidates can be implemented at a fraction of the cost of a genomic evaluation using observed genotypes and still yield virtually the same accuracy. 
On the other hand, using a very small reference panel of haplotypes to impute both training animals and selection candidates results in lower accuracy of genomic evaluation.
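The accuracy measure described in the abstract above (correlation between estimated breeding values and GEBV in a 10-fold cross-validation) can be sketched as follows. This is a simplified illustration, not the authors' code: the names `kfold_indices`, `cv_accuracy` and `fit_predict` are hypothetical, the prediction model (ridge regression in the study) is left to the caller, and the correlation is pooled over all out-of-fold predictions, which is an assumption about how the folds were combined.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cv_accuracy(ebv, fit_predict, k=10, seed=0):
    """Correlation between EBV and out-of-fold predictions (GEBV).

    fit_predict(train_idx, test_idx) must return one prediction per test
    index, using only the training animals to fit the model.
    """
    n = len(ebv)
    gebv = [0.0] * n
    for test_idx in kfold_indices(n, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        for i, value in zip(test_idx, fit_predict(train_idx, test_idx)):
            gebv[i] = value
    return pearson(ebv, gebv)
```

In use, `fit_predict` would wrap whatever genomic prediction model is being evaluated, so observed and imputed genotypes can be compared by calling `cv_accuracy` once for each genotype set with the same seed.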
Cross-species identification of genomic drivers of squamous cell carcinoma development across preneoplastic intermediates

PubMed Central

Chitsazzadeh, Vida; Coarfa, Cristian; Drummond, Jennifer A.; Nguyen, Tri; Joseph, Aaron; Chilukuri, Suneel; Charpiot, Elizabeth; Adelmann, Charles H.; Ching, Grace; Nguyen, Tran N.; Nicholas, Courtney; Thomas, Valencia D.; Migden, Michael; MacFarlane, Deborah; Thompson, Erika; Shen, Jianjun; Takata, Yoko; McNiece, Kayla; Polansky, Maxim A.; Abbas, Hussein A.; Rajapakshe, Kimal; Gower, Adam; Spira, Avrum; Covington, Kyle R.; Xiao, Weimin; Gunaratne, Preethi; Pickering, Curtis; Frederick, Mitchell; Myers, Jeffrey N.; Shen, Li; Yao, Hui; Su, Xiaoping; Rapini, Ronald P.; Wheeler, David A.; Hawk, Ernest T.; Flores, Elsa R.; Tsai, Kenneth Y.

2016-01-01

Cutaneous squamous cell carcinoma (cuSCC) comprises 15–20% of all skin cancers, accounting for over 700,000 cases in the USA annually. Most cuSCC arise in association with a distinct precancerous lesion, the actinic keratosis (AK). To identify potential targets for molecularly targeted chemoprevention, here we perform integrated cross-species genomic analysis of cuSCC development through the preneoplastic AK stage using matched human samples and a solar ultraviolet radiation-driven Hairless mouse model. We identify the major transcriptional drivers of this progression sequence, showing that the key genomic changes in cuSCC development occur in the normal skin to AK transition. Our data validate the use of this ultraviolet radiation-driven mouse cuSCC model for cross-species analysis and demonstrate that cuSCC bears deep molecular similarities to multiple carcinogen-driven SCCs from diverse sites, suggesting that cuSCC may serve as an effective, accessible model for multiple SCC types and that common treatment and prevention strategies may be feasible. PMID:27574101
Single-Stranded γPNAs for In Vivo Site-Specific Genome Editing via Watson-Crick Recognition

PubMed Central

Bahal, Raman; Quijano, Elias; McNeer, Nicole Ali; Liu, Yanfeng; Bhunia, Dinesh C.; López-Giráldez, Francesco; Fields, Rachel J.; Saltzman, W. Mark; Ly, Danith H.; Glazer, Peter M.

2014-01-01

Triplex-forming peptide nucleic acids (PNAs) facilitate gene editing by stimulating recombination of donor DNAs within genomic DNA via site-specific formation of altered helical structures that further stimulate DNA repair. However, PNAs designed for triplex formation are sequence restricted to homopurine sites. Herein we describe a novel strategy where next generation single-stranded gamma PNAs (γPNAs) containing miniPEG substitutions at the gamma position can target genomic DNA in mouse bone marrow at mixed-sequence sites to induce targeted gene editing. In addition to enhanced binding, γPNAs confer increased solubility and improved formulation into poly(lactic-co-glycolic acid) (PLGA) nanoparticles for efficient intracellular delivery. Single-stranded γPNAs induce targeted gene editing at frequencies of 0.8% in mouse bone marrow cells treated ex vivo and 0.1% in vivo via IV injection, without detectable toxicity. These results suggest that γPNAs may provide a new tool for induced gene editing based on Watson-Crick recognition without sequence restriction. PMID:25174576
Single-stranded γPNAs for in vivo site-specific genome editing via Watson-Crick recognition.

PubMed

Bahal, Raman; Quijano, Elias; McNeer, Nicole A; Liu, Yanfeng; Bhunia, Dinesh C; Lopez-Giraldez, Francesco; Fields, Rachel J; Saltzman, William M; Ly, Danith H; Glazer, Peter M

2014-01-01

Triplex-forming peptide nucleic acids (PNAs) facilitate gene editing by stimulating recombination of donor DNAs within genomic DNA via site-specific formation of altered helical structures that further stimulate DNA repair. However, PNAs designed for triplex formation are sequence restricted to homopurine sites. Herein we describe a novel strategy where next generation single-stranded gamma PNAs (γPNAs) containing miniPEG substitutions at the gamma position can target genomic DNA in mouse bone marrow at mixed-sequence sites to induce targeted gene editing. In addition to enhanced binding, γPNAs confer increased solubility and improved formulation into poly(lactic-co-glycolic acid) (PLGA) nanoparticles for efficient intracellular delivery. Single-stranded γPNAs induce targeted gene editing at frequencies of 0.8% in mouse bone marrow cells treated ex vivo and 0.1% in vivo via IV injection, without detectable toxicity. These results suggest that γPNAs may provide a new tool for induced gene editing based on Watson-Crick recognition without sequence restriction.
Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library.

PubMed

Koike-Yusa, Hiroko; Li, Yilong; Tan, E-Pien; Velasco-Herrera, Martin Del Castillo; Yusa, Kosuke

2014-03-01

Identification of genes influencing a phenotype of interest is frequently achieved through genetic screening by RNA interference (RNAi) or knockouts. However, RNAi may only achieve partial depletion of gene activity, and knockout-based screens are difficult in diploid mammalian cells. Here we took advantage of the efficiency and high throughput of genome editing based on type II, clustered, regularly interspaced, short palindromic repeats (CRISPR)-CRISPR-associated (Cas) systems to introduce genome-wide targeted mutations in mouse embryonic stem cells (ESCs). We designed 87,897 guide RNAs (gRNAs) targeting 19,150 mouse protein-coding genes and used a lentiviral vector to express these gRNAs in ESCs that constitutively express Cas9. Screening the resulting ESC mutant libraries for resistance to either Clostridium septicum alpha-toxin or 6-thioguanine identified 27 known and 4 previously unknown genes implicated in these phenotypes. Our results demonstrate the potential for efficient loss-of-function screening using the CRISPR-Cas9 system.
Genome Editing in Mouse Spermatogonial Stem/Progenitor Cells Using Engineered Nucleases

PubMed Central

Fanslow, Danielle A.; Wirt, Stacey E.; Barker, Jenny C.; Connelly, Jon P.; Porteus, Matthew H.; Dann, Christina Tenenhaus

2014-01-01

Editing the genome to create specific sequence modifications is a powerful way to study gene function and promises future applicability to gene therapy. Creation of precise modifications requires homologous recombination, a very rare event in most cell types that can be stimulated by introducing a double strand break near the target sequence. One method to create a double strand break in a particular sequence is with a custom designed nuclease. We used engineered nucleases to stimulate homologous recombination to correct a mutant gene in mouse “GS” (germline stem) cells, testicular derived cell cultures containing spermatogonial stem cells and progenitor cells. We demonstrated that gene-corrected cells maintained several properties of spermatogonial stem/progenitor cells including the ability to colonize following testicular transplantation. This proof of concept for genome editing in GS cells impacts both cell therapy and basic research given the potential for GS cells to be propagated in vitro, contribute to the germline in vivo following testicular transplantation or become reprogrammed to pluripotency in vitro. PMID:25409432
Performance of genotype imputation for low frequency and rare variants from the 1000 genomes.

PubMed

Zheng, Hou-Feng; Rong, Jing-Jing; Liu, Ming; Han, Fang; Zhang, Xing-Wei; Richards, J Brent; Wang, Li

2015-01-01

Genotype imputation is now routinely applied in genome-wide association studies (GWAS) and meta-analyses. However, most imputations have been run using HapMap samples as the reference, and imputation of low frequency and rare variants (minor allele frequency (MAF) < 5%) has not been systematically assessed. With the emergence of next-generation sequencing, large reference panels (such as the 1000 Genomes panel) are available to facilitate imputation of these variants. Therefore, to estimate the performance of low frequency and rare variant imputation, we imputed 153 individuals, each of whom had 3 different genotype array data sets (317k, 610k and 1 million SNPs), to three different reference panels: the 1000 Genomes pilot March 2010 release (1KGpilot), the 1000 Genomes interim August 2010 release (1KGinterim), and the 1000 Genomes phase1 November 2010 and May 2011 release (1KGphase1), using IMPUTE version 2. The three releases of the 1000 Genomes data differ in sample size, ancestry diversity, number of variants and frequency spectrum. We found that both reference panel and GWAS chip density affect the imputation of low frequency and rare variants. 1KGphase1 outperformed the other 2 panels, with a higher concordance rate, a higher proportion of well-imputed variants (info>0.4) and a higher mean info score in each MAF bin. Similarly, the 1M chip array outperformed the 610K and 317K arrays. However, for very rare variants (MAF ≤ 0.3%), only 0-1% of the variants were well imputed. We conclude that the imputation of low frequency and rare variants improves with larger reference panels and higher density of genome-wide genotyping arrays. Yet, despite a large reference panel size and dense genotyping density, very rare variants remain difficult to impute.
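The "proportion of well-imputed variants in each MAF bin" summary described in this abstract can be illustrated with a minimal Python sketch. The info > 0.4 cut-off and the 0.3% and 5% MAF limits come from the abstract; the intermediate 1% bin edge, the half-open bin convention, the function name and the toy data are assumptions made here for illustration and are not taken from the study.

```python
from collections import defaultdict

# Bins are (lower, upper] in MAF. The 0.003 and 0.05 edges follow the abstract;
# the 0.01 edge is a hypothetical intermediate cut-off.
MAF_BINS = [(0.0, 0.003), (0.003, 0.01), (0.01, 0.05)]
INFO_THRESHOLD = 0.4  # a variant counts as "well imputed" when info > 0.4


def well_imputed_by_bin(variants):
    """Return {bin: fraction of variants in that bin with info > INFO_THRESHOLD}.

    `variants` is an iterable of (maf, info) pairs. Variants that fall outside
    every bin (for example common variants) are ignored.
    """
    total = defaultdict(int)
    good = defaultdict(int)
    for maf, info in variants:
        for lo, hi in MAF_BINS:
            if lo < maf <= hi:
                total[(lo, hi)] += 1
                if info > INFO_THRESHOLD:
                    good[(lo, hi)] += 1
                break
    return {b: good[b] / total[b] for b in total}


# Toy (maf, info) pairs, invented purely to exercise the function.
toy_variants = [(0.002, 0.2), (0.002, 0.5), (0.02, 0.9), (0.02, 0.3), (0.2, 0.99)]
summary = well_imputed_by_bin(toy_variants)
```

Bins with no variants are simply absent from the result, which avoids a division by zero.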
Speed congenics: accelerated genome recovery using genetic markers.

PubMed

Visscher, P M

1999-08-01

Genetic markers throughout the genome can be used to speed up 'recovery' of the recipient genome in the backcrossing phase of the construction of a congenic strain. The prediction of the genomic proportion during backcrossing depends on the assumptions regarding the distribution of chromosome segments, the population structure, the marker spacing and the selection strategy. In this study simulation was used to investigate the rate of recovery of the recipient genome for a mouse, Drosophila and Arabidopsis genome. It was shown that an incorrect assumption of a binomial distribution of chromosome segments, and failing to take account of a reduction in variance in genomic proportion due to selection, can lead to a downward bias of up to two generations in the estimation of the number of generations required for the formation of a congenic strain.
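As a point of comparison for the marker-assisted recovery discussed in this abstract, the textbook expectation without any marker selection can be sketched in Python. This is only the unselected baseline, in which each backcross halves the expected donor fraction. It is not the simulation used in the study, and the function names and the 99% target are assumptions for illustration.

```python
def expected_recipient_fraction(backcrosses):
    """Expected recipient-genome fraction after the F1 cross plus `backcrosses`
    backcross generations, with no marker-based selection.

    The F1 (backcrosses = 0) carries half of the recipient genome on average,
    and each backcross halves the expected donor fraction.
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


def backcrosses_needed(target):
    """Smallest number of backcrosses whose expected recipient fraction
    reaches `target` (a fraction in [0, 1))."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 0
    while expected_recipient_fraction(n) < target:
        n += 1
    return n


generations_for_99_percent = backcrosses_needed(0.99)
```

This expectation says nothing about the variance between individuals, which is the quantity the abstract identifies as being misjudged under a binomial assumption.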
Improved imputation accuracy in Hispanic/Latino populations with larger and more diverse reference panels: applications in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL)

PubMed Central

Nelson, Sarah C.; Stilp, Adrienne M.; Papanicolaou, George J.; Taylor, Kent D.; Rotter, Jerome I.; Thornton, Timothy A.; Laurie, Cathy C.

2016-01-01

Imputation is commonly used in genome-wide association studies to expand the set of genetic variants available for analysis. Larger and more diverse reference panels, such as the final Phase 3 of the 1000 Genomes Project, hold promise for improving imputation accuracy in genetically diverse populations such as Hispanics/Latinos in the USA. Here, we sought to empirically evaluate imputation accuracy when imputing to a 1000 Genomes Phase 3 versus a Phase 1 reference, using participants from the Hispanic Community Health Study/Study of Latinos. Our assessments included calculating the correlation between imputed and observed allelic dosage in a subset of samples genotyped on a supplemental array. We observed that the Phase 3 reference yielded higher accuracy at rare variants, but that the two reference panels were comparable at common variants. At a sample level, the Phase 3 reference improved imputation accuracy in Hispanic/Latino samples from the Caribbean more than for Mainland samples, which we attribute primarily to the additional reference panel samples available in Phase 3. We conclude that a 1000 Genomes Project Phase 3 reference panel can yield improved imputation accuracy compared with Phase 1, particularly for rare variants and for samples of certain genetic ancestry compositions. Our findings can inform imputation design for other genome-wide association studies of participants with diverse ancestries, especially as larger and more diverse reference panels continue to become available. PMID:27346520
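The accuracy measure mentioned in this abstract, the correlation between imputed and observed allelic dosage, can be sketched as a plain Pearson correlation in Python. The function name and the toy dosages are assumptions for illustration; the study's actual pipeline and any stratification by allele frequency are not reproduced here.

```python
import math


def dosage_correlation(imputed, observed):
    """Pearson correlation between imputed dosages (floats in [0, 2]) and
    observed genotype dosages (0, 1 or 2) for one variant across samples.

    Returns NaN when either vector has zero variance, for example a
    monomorphic variant in the genotyped subset.
    """
    if len(imputed) != len(observed) or len(imputed) < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    n = len(imputed)
    mean_i = sum(imputed) / n
    mean_o = sum(observed) / n
    cov = sum((a - mean_i) * (b - mean_o) for a, b in zip(imputed, observed))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_o = sum((b - mean_o) ** 2 for b in observed)
    if var_i == 0 or var_o == 0:
        return float("nan")
    return cov / math.sqrt(var_i * var_o)


# Toy dosages for five samples at a single variant (invented values).
r = dosage_correlation([0.1, 0.9, 1.8, 0.0, 1.1], [0, 1, 2, 0, 1])
```

Squaring `r` gives the r-squared form in which imputation accuracy is often reported.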
Improving draft genome contiguity with reference-derived in silico mate-pair libraries.

PubMed

Grau, José Horacio; Hackl, Thomas; Koepfli, Klaus-Peter; Hofreiter, Michael

2018-05-01

Contiguous genome assemblies are a highly valued biological resource because of the higher number of completely annotated genes and genomic elements that are usable compared to fragmented draft genomes. Nonetheless, contiguity is difficult to obtain if only low coverage data and/or only distantly related reference genome assemblies are available. In order to improve genome contiguity, we have developed Cross-Species Scaffolding, a new pipeline that imports long-range distance information directly into the de novo assembly process by constructing mate-pair libraries in silico. We show how genome assembly metrics and gene prediction dramatically improve with our pipeline by assembling two primate genomes solely based on ∼30x coverage of shotgun sequencing data.
Microarray-Based Comparative Genomic Hybridization Using Sex-Matched Reference DNA Provides Greater Sensitivity for Detection of Sex Chromosome Imbalances than Array-Comparative Genomic Hybridization with Sex-Mismatched Reference DNA

PubMed Central

Yatsenko, Svetlana A.; Shaw, Chad A.; Ou, Zhishuo; Pursley, Amber N.; Patel, Ankita; Bi, Weimin; Cheung, Sau Wai; Lupski, James R.; Chinault, A. Craig; Beaudet, Arthur L.

2009-01-01

In array-comparative genomic hybridization (array-CGH) experiments, the measurement of DNA copy number of sex chromosomal regions depends on the sex of the patient and the reference DNAs used. We evaluated the ability of bacterial artificial chromosomes/P1-derived artificial and oligonucleotide array-CGH analyses to detect constitutional sex chromosome imbalances using sex-mismatched reference DNAs. Twenty-two samples with imbalances involving either the X or Y chromosome, including deletions, duplications, triplications, derivative or isodicentric chromosomes, and aneuploidy, were analyzed. Although concordant results were obtained for approximately one-half of the samples when using sex-mismatched and sex-matched reference DNAs, array-CGH analyses with sex-mismatched reference DNAs did not detect genomic imbalances that were detected using sex-matched reference DNAs in 6 of 22 patients. Small duplications and deletions of the X chromosome were most difficult to detect in female and male patients, respectively, when sex-mismatched reference DNAs were used. Sex-matched reference DNAs in array-CGH analyses provides optimal sensitivity and enables an automated statistical evaluation for the detection of sex chromosome imbalances when compared with an experimental design using sex-mismatched reference DNAs. Using sex-mismatched reference DNAs in array-CGH analyses may generate false-negative, false-positive, and ambiguous results for sex chromosome-specific probes, thus masking potential pathogenic genomic imbalances. Therefore, to optimize both detection of clinically relevant sex chromosome imbalances and ensure proper experimental performance, we suggest that alternative internal controls be developed and used instead of using sex-mismatched reference DNAs. PMID:19324990
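The dependence of X-chromosome copy-number measurements on the sex of both patient and reference can be made concrete with the idealized log2 ratio that array-CGH reports. The Python sketch below only does the arithmetic on copy numbers. The function name is hypothetical, and real arrays typically show compressed ratios, so the values should be read as theoretical expectations rather than as the study's data.

```python
import math


def expected_log2_ratio(sample_copies, reference_copies):
    """Idealized array-CGH log2 ratio for a region present in
    `sample_copies` copies in the patient and `reference_copies`
    copies in the reference DNA."""
    if sample_copies <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive")
    return math.log2(sample_copies / reference_copies)


# Normal X chromosome, male patient (1 copy) against a female reference (2 copies):
mismatched_baseline = expected_log2_ratio(1, 2)

# X-linked duplication in a female patient (3 copies of the region):
dup_vs_female_ref = expected_log2_ratio(3, 2)  # sex-matched reference
dup_vs_male_ref = expected_log2_ratio(3, 1)    # sex-mismatched reference
normal_female_vs_male_ref = expected_log2_ratio(2, 1)
```

With a sex-matched reference the duplication is measured against a baseline of 0, whereas with a sex-mismatched reference it has to be distinguished from an already shifted baseline.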
VERSE: a novel approach to detect virus integration in host genomes through reference genome customization.

PubMed

Wang, Qingguo; Jia, Peilin; Zhao, Zhongming

2015-01-01

Fueled by widespread applications of high-throughput next generation sequencing (NGS) technologies and urgent need to counter threats of pathogenic viruses, large-scale studies were conducted recently to investigate virus integration in host genomes (for example, human tumor genomes) that may cause carcinogenesis or other diseases. A limiting factor in these studies, however, is rapid virus evolution and resulting polymorphisms, which prevent reads from aligning readily to commonly used virus reference genomes, and, accordingly, make virus integration sites difficult to detect. Another confounding factor is host genomic instability as a result of virus insertions. To tackle these challenges and improve our capability to identify cryptic virus-host fusions, we present a new approach that detects Virus intEgration sites through iterative Reference SEquence customization (VERSE). To the best of our knowledge, VERSE is the first approach to improve detection through customizing reference genomes. Using 19 human tumors and cancer cell lines as test data, we demonstrated that VERSE substantially enhanced the sensitivity of virus integration site detection. VERSE is implemented in the open source package VirusFinder 2 that is available at http://bioinfo.mc.vanderbilt.edu/VirusFinder/.
Limitations of the Mycobacterium tuberculosis reference genome H37Rv in the detection of virulence-related loci.

PubMed

O'Toole, Ronan F; Gautam, Sanjay S

2017-10-01

The genome sequence of Mycobacterium tuberculosis strain H37Rv is an important and valuable reference point in the study of M. tuberculosis phylogeny, molecular epidemiology, and drug-resistance mutations. However, it is becoming apparent that use of H37Rv as a sole reference genome in analysing clinical isolates presents some limitations to fully investigating M. tuberculosis virulence. Here, we examine the presence of single locus variants and the absence of entire genes in H37Rv with respect to strains that are responsible for cases and outbreaks of tuberculosis. We discuss how these polymorphisms may affect phenotypic properties of H37Rv including pathogenicity. Based on our observations and those of other researchers, we propose that use of a single reference genome, H37Rv, is not sufficient for the detection and characterisation of M. tuberculosis virulence-related loci. We recommend incorporation of genome sequences of other reference strains, in particular, direct clinical isolates, in such analyses in addition to H37Rv. Copyright © 2017 Elsevier Inc. All rights reserved.

Cloning, structure, and chromosome localization of the mouse glutaryl-CoA dehydrogenase gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Koeller, D.M.; DiGiulio, A.; Frerman, F.E.

Glutaryl-CoA dehydrogenase (GCDH) is a nuclear-encoded, mitochondrial matrix enzyme. In humans, deficiency of GCDH leads to glutaric acidemia type I, an inherited disorder of amino acid metabolism characterized by a progressive neurodegenerative disease. In this report we describe the cloning and structure of the mouse GCDH (Gcdh) gene and cDNA and its chromosomal localization. The mouse Gcdh cDNA is 1.75 kb long and contains an open reading frame of 438 amino acids. The amino acid sequences of mouse, human, and pig GCDH are highly conserved. The mouse Gcdh gene contains 11 exons and spans 7 kb of genomic DNA. Gcdh was mapped by backcross analysis to mouse chromosome 8 within a region that is homologous to a region of human chromosome 19, where the human gene was previously mapped. 14 refs., 3 figs.
Mouse Models for Down Syndrome-Associated Developmental Cognitive Disabilities

PubMed Central

Liu, Chunhong; Belichenko, Pavel V.; Zhang, Li; Fu, Dawei; Kleschevnikov, Alexander M.; Baldini, Antonio; Antonarakis, Stylianos E.; Mobley, William C.; Yu, Y. Eugene

2011-01-01

Down syndrome (DS) is mainly caused by the presence of an extra copy of human chromosome 21 (Hsa21) and is a leading genetic cause for developmental cognitive disabilities in humans. The mouse is a premier model organism for DS because the regions on Hsa21 are syntenically conserved with three regions in the mouse genome, which are located on mouse chromosome 10 (Mmu10), Mmu16 and Mmu17. With the advance of chromosomal manipulation technologies, new mouse mutants have been generated to mimic DS at both the genotypic and phenotypic levels. Further mouse-based molecular genetic studies in the future may lead to the unraveling of the mechanisms underlying DS-associated developmental cognitive disabilities, which would lay the groundwork for developing effective treatments for this phenotypic manifestation. In this review, we will discuss recent progress and future challenges in modeling DS-associated developmental cognitive disability in mice with an emphasis on hippocampus-related phenotypes. PMID:21865664
Low joining efficiency and non-conservative repair of two distant double-strand breaks in mouse embryonic stem cells.

PubMed

Boubakour-Azzouz, Imenne; Ricchetti, Miria

2008-02-01

Efficient and faithful repair of DNA double-strand breaks (DSBs) is critical for genome stability. To understand whether cells carrying a functional repair apparatus are able to efficiently heal two distant chromosome ends and whether this DNA lesion might result in genome rearrangements, we induced DSBs in genetically modified mouse embryonic stem cells carrying two I-SceI sites in cis separated by a distance of 9 kbp. We show that in this context non-homologous end-joining (NHEJ) can repair using standard DNA pairing of the broken ends, but it also joins 3' non-complementary overhangs that require unusual joining intermediates. The repair efficiency of this lesion appears to be dramatically low and the extent of genome alterations was high in striking contrast with the spectra of repair events reported for two collinear DSBs in other experimental systems. The dramatic decline in accuracy suggests that significant constraints operate in the repair process of these distant DSBs, which may also control the low efficiency of this process. These findings provide important insights into the mechanism of repair by NHEJ and how this process may protect the genome from large rearrangements.
Variant ribosomal RNA alleles are conserved and exhibit tissue-specific expression

PubMed Central

Parks, Matthew M.; Kurylo, Chad M.; Dass, Randall A.; Bojmar, Linda; Lyden, David; Vincent, C. Theresa; Blanchard, Scott C.

2018-01-01

The ribosome, the integration point for protein synthesis in the cell, is conventionally considered a homogeneous molecular assembly that only passively contributes to gene expression. Yet, epigenetic features of the ribosomal DNA (rDNA) operon and changes in the ribosome’s molecular composition have been associated with disease phenotypes, suggesting that the ribosome itself may possess inherent regulatory capacity. Analyzing whole-genome sequencing data from the 1000 Genomes Project and the Mouse Genomes Project, we find that rDNA copy number varies widely across individuals, and we identify pervasive intra- and interindividual nucleotide variation in the 5S, 5.8S, 18S, and 28S ribosomal RNA (rRNA) genes of both human and mouse. Conserved rRNA sequence heterogeneities map to functional centers of the assembled ribosome, variant rRNA alleles exhibit tissue-specific expression, and ribosomes bearing variant rRNA alleles are present in the actively translating ribosome pool. These findings provide a critical framework for exploring the possibility that the expression of genomically encoded variant rRNA alleles gives rise to physically and functionally heterogeneous ribosomes that contribute to mammalian physiology and human disease. PMID:29503865
Intraocular pressure in the smallest primate aging model: the gray mouse lemur.

PubMed

Dubicanac, Marko; Joly, Marine; Strüve, Julia; Nolte, Ingo; Mestre-Francés, Nadine; Verdier, Jean-Michel; Zimmermann, Elke

2018-05-01

The aim of this study was to assess the practicability of common tonometers used in veterinary medicine for rapid intraocular pressure (IOP) screening, to calibrate IOP values gained by the tonometers, and to define a reference IOP value for the healthy eye in a new primate model for aging research, the gray mouse lemur. TonoVet® and the TonoPen™ measurements were calibrated manometrically in healthy enucleated eyes of mouse lemurs euthanized for veterinary reasons. For comparison of the practicability of both tonometers as a rapid IOP assessment tool for living mouse lemurs, the IOP of 24 eyes of 12 animals held in the hand was measured. To define a standard reference value for IOP in mouse lemurs, 258 healthy animals were measured using the TonoVet®. Intraocular pressure measurements for the TonoVet® can be corrected using the formula: y = 0.981 + (1.962*TonoVet® value), and those for the TonoPen™ using that of y = 5.38 + (1.426*TonoPen™ value). The calibrated IOP for a healthy mouse lemur eye was 20.3 ± 2.8 mmHg. The TonoVet® showed advantages in practicability, for example, small corneal contact area, short and painless corneal contact, shortened total time spent on investigation, as well as the more accurate measured values. IOP measurements of healthy mouse lemur eyes were not affected by age, sex, eye side, or colony. Tonometry using TonoVet® is the more practicable assessment tool for IOP measurement of the tiny eyes of living mouse lemurs. Pathological deviations can be identified based on the described reference value. © 2016 American College of Veterinary Ophthalmologists.
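The two calibration formulas quoted in this abstract are simple linear corrections and can be written as a short Python sketch. The coefficients are taken from the abstract; the function names, the assumption that readings and results are in mmHg, and the example reading are illustrative only.

```python
def calibrate_tonovet(reading):
    """Corrected IOP from a raw TonoVet reading, using
    y = 0.981 + 1.962 * reading as quoted in the abstract."""
    return 0.981 + 1.962 * reading


def calibrate_tonopen(reading):
    """Corrected IOP from a raw TonoPen reading, using
    y = 5.38 + 1.426 * reading as quoted in the abstract."""
    return 5.38 + 1.426 * reading


# Hypothetical raw reading of 10 (units assumed to be mmHg) on each device.
tonovet_iop = calibrate_tonovet(10.0)
tonopen_iop = calibrate_tonopen(10.0)
```

Each corrected value can then be compared with the reported reference of 20.3 ± 2.8 mmHg for healthy mouse lemur eyes. The abstract does not say whether the ± term is a standard deviation, so no cut-off is encoded here.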
Exploiting long read sequencing technologies to establish high quality highly contiguous pig reference genome assemblies

USDA-ARS's Scientific Manuscript database

The current pig reference genome sequence (Sscrofa10.2) was established using Sanger sequencing and following the clone-by-clone hierarchical shotgun sequencing approach used in the public human genome project. However, as sequence coverage was low (4-6x) the resulting assembly was only of draft qua...
Transcriptome analyses of rhesus monkey preimplantation embryos reveal a reduced capacity for DNA double-strand break repair in primate oocytes and early embryos

PubMed Central

Wang, Xinyi; Liu, Denghui; He, Dajian; Suo, Shengbao; Xia, Xian; He, Xiechao; Han, Jing-Dong J.; Zheng, Ping

2017-01-01

Preimplantation embryogenesis encompasses several critical events including genome reprogramming, zygotic genome activation (ZGA), and cell-fate commitment. The molecular basis of these processes remains obscure in primates in which there is a high rate of embryo wastage. Thus, understanding the factors involved in genome reprogramming and ZGA might help reproductive success during this susceptible period of early development and generate induced pluripotent stem cells with greater efficiency. Moreover, explaining the molecular basis responsible for embryo wastage in primates will greatly expand our knowledge of species evolution. By using RNA-seq in single and pooled oocytes and embryos, we defined the transcriptome throughout preimplantation development in rhesus monkey. In comparison to archival human and mouse data, we found that the transcriptome dynamics of monkey oocytes and embryos were very similar to those of human but very different from those of mouse. We identified several classes of maternal and zygotic genes, whose expression peaks were highly correlated with the time frames of genome reprogramming, ZGA, and cell-fate commitment, respectively. Importantly, comparison of the ZGA-related network modules among the three species revealed less robust surveillance of genomic instability in primate oocytes and embryos than in rodents, particularly in the pathways of DNA damage signaling and homology-directed DNA double-strand break repair. This study highlights the utility of monkey models to better understand the molecular basis for genome reprogramming, ZGA, and genomic stability surveillance in human early embryogenesis and may provide insights for improved homologous recombination-mediated gene editing in monkey. PMID:28223401
Histone variant H3.3-mediated chromatin remodeling is essential for paternal genome activation in mouse preimplantation embryos.

PubMed

Kong, Qingran; Banaszynski, Laura A; Geng, Fuqiang; Zhang, Xiaolei; Zhang, Jiaming; Zhang, Heng; O'Neill, Claire L; Yan, Peidong; Liu, Zhonghua; Shido, Koji; Palermo, Gianpiero D; Allis, C David; Rafii, Shahin; Rosenwaks, Zev; Wen, Duancheng

2018-03-09

Derepression of chromatin-mediated transcriptional repression of paternal and maternal genomes is considered the first major step that initiates zygotic gene expression after fertilization. The histone variant H3.3 is present in both male and female gametes and is thought to be important for remodeling the paternal and maternal genomes for activation during both fertilization and embryogenesis. However, the underlying mechanisms remain poorly understood. Using our H3.3B-HA-tagged mouse model, engineered to report H3.3 expression in live animals and to distinguish different sources of H3.3 protein in embryos, we show here that sperm-derived H3.3 (sH3.3) protein is removed from the sperm genome shortly after fertilization and extruded from the zygotes via the second polar bodies (PBII) during embryogenesis. We also found that the maternal H3.3 (mH3.3) protein is incorporated into the paternal genome as early as 2 h postfertilization and is detectable in the paternal genome until the morula stage. Knockdown of maternal H3.3 resulted in compromised embryonic development both of fertilized embryos and of androgenetic haploid embryos. Furthermore, we report that mH3.3 depletion in oocytes impairs both activation of the Oct4 pluripotency marker gene and global de novo transcription from the paternal genome important for early embryonic development. Our results suggest that H3.3-mediated paternal chromatin remodeling is essential for the development of preimplantation embryos and the activation of the paternal genome during embryogenesis. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.
The novel 2016 WHO Neisseria gonorrhoeae reference strains for global quality assurance of laboratory investigations: phenotypic, genetic and reference genome characterization.

PubMed

Unemo, Magnus; Golparian, Daniel; Sánchez-Busó, Leonor; Grad, Yonatan; Jacobsson, Susanne; Ohnishi, Makoto; Lahra, Monica M; Limnios, Athena; Sikora, Aleksandra E; Wi, Teodora; Harris, Simon R

2016-11-01

Gonorrhoea and MDR Neisseria gonorrhoeae remain public health concerns globally. Enhanced, quality-assured, gonococcal antimicrobial resistance (AMR) surveillance is essential worldwide. The WHO global Gonococcal Antimicrobial Surveillance Programme (GASP) was relaunched in 2009. We describe the phenotypic, genetic and reference genome characteristics of the 2016 WHO gonococcal reference strains intended for quality assurance in the WHO global GASP, other GASPs, diagnostics and research worldwide. The 2016 WHO reference strains (n = 14) constitute the eight 2008 WHO reference strains and six novel strains. The novel strains represent low-level to high-level cephalosporin resistance, high-level azithromycin resistance and a porA mutant. All strains were comprehensively characterized for antibiogram (n = 23), serovar, prolyliminopeptidase, plasmid types, molecular AMR determinants, N. gonorrhoeae multiantigen sequence typing STs and MLST STs. Complete reference genomes were produced using single-molecule PacBio sequencing. The reference strains represented all available phenotypes, susceptible and resistant, to antimicrobials previously and currently used or considered for future use in gonorrhoea treatment. All corresponding resistance genotypes and molecular epidemiological types were described. Fully characterized, annotated and finished reference genomes (n = 14) were presented. The 2016 WHO gonococcal reference strains are intended for internal and external quality assurance and quality control in laboratory investigations, particularly in the WHO global GASP and other GASPs, but also in phenotypic (e.g. culture, species determination) and molecular diagnostics, molecular AMR detection, molecular epidemiology and as fully characterized, annotated and finished reference genomes in WGS analysis, transcriptomics, proteomics and other molecular technologies and data analysis. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kishigami, Satoshi; Kinki University, 930 Nishimitani, Kinokawa 599-5993; Wakayama, Sayaka

In mammals, a diploid genome of an individual following fertilization of an egg and a spermatozoon is unique and irreproducible. This implies that the generated unique diploid genome is doomed with the individual ending. Even as cultured cells from the individual, they cannot normally proliferate in perpetuity because of the 'Hayflick limit'. However, Dolly, the sheep cloned from an adult mammary gland cell, changes this scenario. Somatic cell nuclear transfer (SCNT) enables us to produce offspring without germ cells, that is, to 'passage' a unique diploid genome. Animal cloning has also proven to be a powerful research tool for reprogramming in many mammals, notably mouse and cow. The mechanism underlying reprogramming, however, remains largely unknown, and animal cloning has been inefficient as a result. More momentously, in addition to abortion and fetal mortality, some cloned animals display possible premature aging phenotypes including early death and short telomere lengths. Under these inauspicious conditions, is it really possible for SCNT to preserve a diploid genome? Delightfully, in mouse and recently in primate, using SCNT we can produce nuclear transfer ES cells (ntES) more efficiently, which can preserve the eternal lifespan for the 'passage' of a unique diploid genome. Further, a new somatic cloning technique using histone-deacetylase inhibitors has been developed which can significantly increase the previous cloning rates two to six times. Here, we introduce SCNT and its value as a preservation tool for a diploid genome while reviewing aging of cloned animals on cellular and individual levels.
Complete Genome Sequence of Acinetobacter baumannii CIP 70.10, a Susceptible Reference Strain for Comparative Genome Analyses.

PubMed

Krahn, Thomas; Wibberg, Daniel; Maus, Irena; Winkler, Anika; Pühler, Alfred; Poirel, Laurent; Schlüter, Andreas

2015-07-30

The complete genome sequence for the reference strain Acinetobacter baumannii CIP 70.10 (ATCC 15151) was established. The strain was isolated in France in 1970, is susceptible to most antimicrobial compounds, and is therefore of importance for comparative genome analyses with clinical multidrug-resistant (MDR) A. baumannii strains to study resistance development and acquisition in this emerging human pathogen. Copyright © 2015 Krahn et al.
Comprehensive Annotation of the Parastagonospora nodorum Reference Genome Using Next-Generation Genomics, Transcriptomics and Proteogenomics

PubMed Central

Dodhia, Kejal; Stoll, Thomas; Hastie, Marcus; Furuki, Eiko; Ellwood, Simon R.; Williams, Angela H.; Tan, Yew-Foon; Testa, Alison C.; Gorman, Jeffrey J.; Oliver, Richard P.

2016-01-01

Parastagonospora nodorum, the causal agent of Septoria nodorum blotch (SNB), is an economically important pathogen of wheat (Triticum spp.), and a model for the study of necrotrophic pathology and genome evolution. The reference P. nodorum strain SN15 was the first Dothideomycete with a published genome sequence, and has been used as the basis for comparison within and between species. Here we present an updated reference genome assembly with corrections of SNP and indel errors in the underlying genome assembly from deep resequencing data as well as extensive manual annotation of gene models using transcriptomic and proteomic sources of evidence (https://github.com/robsyme/Parastagonospora_nodorum_SN15). The updated assembly and annotation includes 8,366 genes with modified protein sequence and 866 new genes. This study shows the benefits of using a wide variety of experimental methods allied to expert curation to generate a reliable set of gene models. PMID:26840125
It’s More Than Stamp Collecting: How Genome Sequencing Can Unify Biological Research

PubMed Central

Richards, Stephen

2015-01-01

The availability of reference genome sequences, especially the human reference, has revolutionized the study of biology. However, whilst the genomes of some species have been fully sequenced, a wide range of biological problems still cannot be effectively studied for lack of genome sequence information. Here, I identify neglected areas of biology and describe how both targeted species sequencing and more broad taxonomic surveys of the tree of life can address important biological questions. I enumerate the significant benefits that would accrue from sequencing a broader range of taxa, as well as discuss the technical advances in sequencing and assembly methods that would allow for wide-ranging application of whole-genome analysis. Finally, I suggest that in addition to “Big Science” survey initiatives to sequence the tree of life, a modified infrastructure-funding paradigm would better support reference genome sequence generation for research communities most in need. PMID:26003218
It's more than stamp collecting: how genome sequencing can unify biological research.

PubMed

Richards, Stephen

2015-07-01

The availability of reference genome sequences, especially the human reference, has revolutionized the study of biology. However, while the genomes of some species have been fully sequenced, a wide range of biological problems still cannot be effectively studied for lack of genome sequence information. Here, I identify neglected areas of biology and describe how both targeted species sequencing and more broad taxonomic surveys of the tree of life can address important biological questions. I enumerate the significant benefits that would accrue from sequencing a broader range of taxa, as well as discuss the technical advances in sequencing and assembly methods that would allow for wide-ranging application of whole-genome analysis. Finally, I suggest that in addition to 'big science' survey initiatives to sequence the tree of life, a modified infrastructure-funding paradigm would better support reference genome sequence generation for research communities most in need. Copyright © 2015 Elsevier Ltd. All rights reserved.
Brucella abortus Strain 2308 Wisconsin Genome: Importance of the Definition of Reference Strains

PubMed Central

Suárez-Esquivel, Marcela; Ruiz-Villalobos, Nazareth; Castillo-Zeledón, Amanda; Jiménez-Rojas, César; Roop II, R. Martin; Comerci, Diego J.; Barquero-Calvo, Elías; Chacón-Díaz, Carlos; Caswell, Clayton C.; Baker, Kate S.; Chaves-Olarte, Esteban; Thomson, Nicholas R.; Moreno, Edgardo; Letesson, Jean J.; De Bolle, Xavier; Guzmán-Verri, Caterina

2016-01-01

Brucellosis is a bacterial infectious disease affecting a wide range of mammals and a neglected zoonosis caused by species of the genetically homogeneous genus Brucella. As in most studies on bacterial diseases, research in brucellosis is carried out by using reference strains as canonical models to understand the mechanisms underlying host-pathogen interactions. We performed whole genome sequencing analysis of the reference strain B. abortus 2308 routinely used in our laboratory, including manually curated annotation accessible as an editable version through a link at https://en.wikipedia.org/wiki/Brucella#Genomics. Comparison of this genome with two publicly available 2308 genomes showed significant differences, particularly indels related to insertional elements, suggesting variability related to the transposition of these elements within the same strain. Considering the outcome of high resolution genomic techniques in the bacteriology field, the conventional concept of strain definition needs to be revised. PMID:27746773
Dehydration Preparation of Mouse Sperm for Vitrification and Rapid Laser Warming.

PubMed

Paredes, E; Mazur, P

Mice are fundamental models of study due to their ease of breeding and manipulation and their well-studied genome. There has been extensive research focused on the cryopreservation of mouse germplasm as a way to help maintain the different transgenic mouse breeds. The first protocols for mouse sperm were developed in the 1990s using slow cooling and a mixture of raffinose and glycerol. Since then, the reported rate of success has remained highly variable. The aim of this work is to study factors that are key for developing vitrification protocols for ultra-rapid laser warming of mouse sperm. Our results show that due to the exquisite sensitivity of sperm cells to osmotic excursions, our target levels of dehydration (~85% water content) cannot be achieved without causing a significant decrease in sperm motility and membrane fusion. It seems likely that mouse sperm vitrification is going to be difficult to develop due to the exquisite sensitivity of mouse sperm cells to handling and dehydration.
A single nucleotide mutation in Nppc is associated with a long bone abnormality in lbab mice.

PubMed

Jiao, Yan; Yan, Jian; Jiao, Feng; Yang, Hongbin; Donahue, Leah Rae; Li, Xinmin; Roe, Bruce A; Stuart, John; Gu, Weikuan

2007-04-17

The long bone abnormality (lbab) mouse is a new autosomal recessive mutant characterized by overall smaller body size with proportionate dwarfing of all organs and shorter long bones. Previous linkage analysis has located the lbab mutation on chromosome 1 between the markers D1Mit9 and D1Mit488. A genome-based positional approach was used to identify a mutation associated with lbab disease. A total of 122 genes and expressed sequence tags at the lbab region were screened for possible mutation by using genomic DNA from lbab/lbab, lbab/+, and +/+ B6 mice and high throughput temperature gradient capillary electrophoresis. A sequence difference was identified in one of the amplicons of gene Nppc between lbab/lbab and +/+ mice. One-step reverse transcriptase polymerase chain reaction was performed to validate the difference of Nppc in different types of mice at the mRNA level. The mutation of Nppc was unique in lbab/lbab mice among multiple mouse inbred strains. The mutation of Nppc co-segregated with lbab disease in 200 progeny produced from heterozygous lbab/+ parents. A single nucleotide mutation of Nppc is associated with dwarfism in lbab/lbab mice. Current genome information and technology allow us to efficiently identify single nucleotide mutations from roughly mapped disease loci. The lbab mouse is a useful model for hereditary human achondroplasia.
A single nucleotide mutation in Nppc is associated with a long bone abnormality in lbab mice

PubMed Central

Jiao, Yan; Yan, Jian; Jiao, Feng; Yang, HongBin; Donahue, Leah Rae; Li, Xinmin; Roe, Bruce A; Stuart, John; Gu, Weikuan

2007-01-01

Background: The long bone abnormality (lbab) mouse is a new autosomal recessive mutant characterized by overall smaller body size with proportionate dwarfing of all organs and shorter long bones. Previous linkage analysis has located the lbab mutation on chromosome 1 between the markers D1Mit9 and D1Mit488. Results: A genome-based positional approach was used to identify a mutation associated with lbab disease. A total of 122 genes and expressed sequence tags at the lbab region were screened for possible mutation by using genomic DNA from lbab/lbab, lbab/+, and +/+ B6 mice and high throughput temperature gradient capillary electrophoresis. A sequence difference was identified in one of the amplicons of gene Nppc between lbab/lbab and +/+ mice. One-step reverse transcriptase polymerase chain reaction was performed to validate the difference of Nppc in different types of mice at the mRNA level. The mutation of Nppc was unique in lbab/lbab mice among multiple mouse inbred strains. The mutation of Nppc co-segregated with lbab disease in 200 progeny produced from heterozygous lbab/+ parents. Conclusion: A single nucleotide mutation of Nppc is associated with dwarfism in lbab/lbab mice. Current genome information and technology allow us to efficiently identify single nucleotide mutations from roughly mapped disease loci. The lbab mouse is a useful model for hereditary human achondroplasia. PMID:17439653
A genome-wide shRNA screen identifies GAS1 as a novel melanoma metastasis suppressor gene.

PubMed

Gobeil, Stephane; Zhu, Xiaochun; Doillon, Charles J; Green, Michael R

2008-11-01

Metastasis suppressor genes inhibit one or more steps required for metastasis without affecting primary tumor formation. Due to the complexity of the metastatic process, the development of experimental approaches for identifying genes involved in metastasis prevention has been challenging. Here we describe a genome-wide RNAi screening strategy to identify candidate metastasis suppressor genes. Following expression in weakly metastatic B16-F0 mouse melanoma cells, shRNAs were selected based upon enhanced satellite colony formation in a three-dimensional cell culture system and confirmed in a mouse experimental metastasis assay. Using this approach we discovered 22 genes whose knockdown increased metastasis without affecting primary tumor growth. We focused on one of these genes, Gas1 (Growth arrest-specific 1), because we found that it was substantially down-regulated in highly metastatic B16-F10 melanoma cells, which contributed to the high metastatic potential of this mouse cell line. We further demonstrated that Gas1 has all the expected properties of a melanoma tumor suppressor including: suppression of metastasis in a spontaneous metastasis assay, promotion of apoptosis following dissemination of cells to secondary sites, and frequent down-regulation in human melanoma metastasis-derived cell lines and metastatic tumor samples. Thus, we developed a genome-wide shRNA screening strategy that enables the discovery of new metastasis suppressor genes.
Resveratrol protects mouse embryonic stem cells from ionizing radiation by accelerating recovery from DNA strand breakage.

PubMed

Denissova, Natalia G; Nasello, Cara M; Yeung, Percy L; Tischfield, Jay A; Brenneman, Mark A

2012-01-01

Resveratrol has elicited many provocative anticancer effects in laboratory animals and cultured cells, including reduced levels of oxidative DNA damage, inhibition of tumor initiation and progression and induction of apoptosis in tumor cells. Use of resveratrol as a cancer-preventive agent in humans will require that its anticancer effects not be accompanied by damage to normal tissue stem or progenitor cells. In mouse embryonic stem cells (mESC) or early mouse embryos exposed to ethanol, resveratrol has been shown to suppress apoptosis and promote survival. However, in cells exposed to genotoxic stress, survival may come at the expense of genome stability. To learn whether resveratrol can protect stem cells from DNA damage and to study its effects on genomic integrity, we exposed mESC pretreated with resveratrol to ionizing radiation (IR). Forty-eight hours pretreatment with a comparatively low concentration of resveratrol (10 μM) improved survival of mESC >2-fold after exposure to 5 Gy of X-rays. Cells pretreated with resveratrol sustained the same levels of reactive oxygen species and DNA strand breakage after IR as mock-treated controls, but repaired DNA damage more rapidly and resumed cell division sooner. Frequencies of IR-induced mutation at a chromosomal reporter locus were not increased in cells pretreated with resveratrol as compared with controls, indicating that resveratrol can improve viability in mESC after DNA damage without compromising genomic integrity.

Adaptive Evolution of the Insulin Two-Gene System in Mouse

PubMed Central

Shiao, Meng-Shin; Liao, Ben-Yang; Long, Manyuan; Yu, Hon-Tsen

2008-01-01

Insulin genes in mouse and rat compose a two-gene system in which Ins1 was retroposed from the partially processed mRNA of Ins2. When Ins1 originated and how it was retained in genomes still remain interesting problems. In this study, we used genomic approaches to detect insulin gene copy number variation in rodent species and investigated evolutionary forces acting on both Ins1 and Ins2. We characterized the phylogenetic distribution of the new insulin gene (Ins1) by Southern analyses and confirmed by sequencing insulin genes in the rodent genomes. The results demonstrate that Ins1 originated right before the mouse–rat split (∼20 MYA), and both Ins1 and Ins2 are under strong functional constraints in these murine species. Interestingly, by examining a range of nucleotide polymorphisms, we detected positive selection acting on both Ins2 and Ins1 gene regions in the Mus musculus domesticus populations. Furthermore, three amino acid sites were also identified as having evolved under positive selection in two insulin peptides: two are in the signal peptide and one is in the C-peptide. Our data suggest an adaptive divergence in the mouse insulin two-gene system, which may result from the response to environmental change caused by the rise of agricultural civilization, as proposed by the thrifty-genotype hypothesis. PMID:18245324
The Recombination Landscape in Wild House Mice Inferred Using Population Genomic Data.

PubMed

Booker, Tom R; Ness, Rob W; Keightley, Peter D

2017-09-01

Characterizing variation in the rate of recombination across the genome is important for understanding several evolutionary processes. Previous analysis of the recombination landscape in laboratory mice has revealed that the different subspecies have different suites of recombination hotspots. It is unknown, however, whether hotspots identified in laboratory strains reflect the hotspot diversity of natural populations or whether broad-scale variation in the rate of recombination is conserved between subspecies. In this study, we constructed fine-scale recombination rate maps for a natural population of the Eastern house mouse, Mus musculus castaneus. We performed simulations to assess the accuracy of recombination rate inference in the presence of phase errors, and we used a novel approach to quantify phase error. The spatial distribution of recombination events is strongly positively correlated between our castaneus map and a map constructed using inbred lines derived predominantly from M. m. domesticus. Recombination hotspots in wild castaneus show little overlap, however, with the locations of double-strand breaks in wild-derived house mouse strains. Finally, we also find that genetic diversity in M. m. castaneus is positively correlated with the rate of recombination, consistent with pervasive natural selection operating in the genome. Our study suggests that recombination rate variation is conserved at broad scales between house mouse subspecies, but it is not strongly conserved at fine scales. Copyright © 2017 by the Genetics Society of America.
1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

DOE PAGES

Mukherjee, Supratim; Seshadri, Rekha; Varghese, Neha J.; ...

2017-06-12

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.
1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mukherjee, Supratim; Seshadri, Rekha; Varghese, Neha J.

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.
Sequence analysis of chromosome 1 revealed different selection patterns between Chinese wild mice and laboratory strains.

PubMed

Xu, Fuyi; Hu, Shixian; Chao, Tianzhu; Wang, Maochun; Li, Kai; Zhou, Yuxun; Xu, Hongyan; Xiao, Junhua

2017-10-01

Both natural and artificial selection play a critical role in animals' adaptation to the environment. Detecting signatures of selection in genomic regions can provide insights into the function of specific phenotypes. It is generally assumed that laboratory mice experience intense artificial selection, whereas wild mice are subject mainly to natural selection. However, the differences in selection signatures across the mouse genome, and in the underlying genes, between wild and laboratory mice remain unclear. In this study, we used two mouse populations, chromosome 1 (Chr 1) substitution lines (C1SLs) derived from Chinese wild mice and inbred strains sequenced by the mouse genome project (MGP), and two selection detection statistics, Fst and Tajima's D, to identify signatures of selection on Chr 1. For the differentiation between the C1SLs and MGP, 110 candidate selection regions containing 47 protein coding genes were detected. A total of 149 selection regions, which encompass 7.215 Mb, were identified in the C1SLs by the Tajima's D approach. For the MGP, we identified nearly twice as many selection regions (243) as in the C1SLs, accounting for 13.27 Mb of Chr 1 sequence. Through functional annotation, we identified several biological processes with significant enrichment, including seven genes in the olfactory transduction pathway. In addition, we searched the phenotypes associated with the 47 candidate selection genes identified by Fst. These genes were involved in behavior, growth or body weight, mortality or aging, and immune systems, which aligns well with the phenotypic differences between wild and laboratory mice. Therefore, the findings should help in understanding the phenotypic differences between wild and laboratory mice and in applying this new mouse resource (C1SLs) to further genetics studies.
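For readers unfamiliar with one of the two statistics named in this record, the following is a minimal sketch of Tajima's D using the standard constants from Tajima (1989). It is not the authors' pipeline: the function name `tajimas_d`, the string encoding of haplotypes, and the toy inputs are all assumptions made for illustration, and a real analysis would compute the statistic in sliding windows over called variants.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def tajimas_d(haplotypes: Sequence[str]) -> float:
    """Tajima's D for equal-length aligned haplotypes (one string per chromosome)."""
    n = len(haplotypes)
    if n < 4:
        # For n < 4 the variance constants are zero, so D cannot be computed.
        raise ValueError("need at least 4 sequences")
    if len({len(h) for h in haplotypes}) != 1:
        raise ValueError("haplotypes must be aligned to the same length")

    # S: number of segregating sites (columns with more than one allele).
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if seg_sites == 0:
        return float("nan")  # D is undefined without variation

    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Normalising constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data (hypothetical): an excess of rare variants gives D < 0,
# an excess of intermediate-frequency variants gives D > 0.
rare = ["1000", "0100", "0010", "0001"]
common = ["11", "11", "00", "00"]
print(tajimas_d(rare) < 0, tajimas_d(common) > 0)
```

Negative values are conventionally read as an excess of rare alleles and positive values as an excess of intermediate-frequency alleles; either pattern can also arise from demography, so the sign alone does not establish selection.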
Transcriptome analyses of adult mouse brain reveal enrichment of lncRNAs in specific brain regions and neuronal populations

PubMed Central

Kadakkuzha, Beena M.; Liu, Xin-An; McCrate, Jennifer; Shankar, Gautam; Rizzo, Valerio; Afinogenova, Alina; Young, Brandon; Fallahi, Mohammad; Carvalloza, Anthony C.; Raveendra, Bindu; Puthanveettil, Sathyanarayanan V.

2015-01-01

Despite the importance of the long non-coding RNAs (lncRNAs) in regulating biological functions, the expression profiles of lncRNAs in the sub-regions of the mammalian brain and neuronal populations remain largely uncharacterized. By analyzing RNASeq datasets, we demonstrate region specific enrichment of populations of lncRNAs and mRNAs in the mouse hippocampus and pre-frontal cortex (PFC), the two major regions of the brain involved in memory storage and neuropsychiatric disorders. We identified 2759 lncRNAs and 17,859 mRNAs in the hippocampus and 2561 lncRNAs and 17,464 mRNAs expressed in the PFC. The lncRNAs identified correspond to ~14% of the transcriptome of the hippocampus and PFC and ~70% of the lncRNAs annotated in the mouse genome (NCBIM37) and are localized along the chromosomes as varying numbers of clusters. Importantly, we also found that a few of the tested lncRNA-mRNA pairs that share a genomic locus display specific co-expression in a region-specific manner. Furthermore, we find that sub-regions of the brain and specific neuronal populations have characteristic lncRNA expression signatures. These results reveal an unexpected complexity of the lncRNA expression in the mouse brain. PMID:25798087
J Genes for Heavy Chain Immunoglobulins of Mouse

NASA Astrophysics Data System (ADS)

Newell, Nanette; Richards, Julia E.; Tucker, Philip W.; Blattner, Frederick R.

1980-09-01

A 15.8-kilobase pair fragment of BALB/c mouse liver DNA, cloned in the Charon 4A λ phage vector system, was shown to contain the μ heavy chain constant region (CHμ) gene for the mouse immunoglobulin M. In addition, this fragment of DNA contains at least two J genes, used to code for the carboxyl terminal portion of heavy chain variable regions. These genes are located in genomic DNA about eight kilobase pairs to the 5' side of the CHμ gene. The complete nucleotide sequence of a 1120-base pair stretch of DNA that includes the two J genes has been determined.
Comprehensive red blood cell and platelet antigen prediction from whole genome sequencing: proof of principle

PubMed Central

Westhoff, Connie M.; Uy, Jon Michael; Aguad, Maria; Smeland‐Wagman, Robin; Kaufman, Richard M.; Rehm, Heidi L.; Green, Robert C.; Silberstein, Leslie E.

2015-01-01

BACKGROUND: There are 346 serologically defined red blood cell (RBC) antigens and 33 serologically defined platelet (PLT) antigens, most of which have known genetic changes in 45 RBC or six PLT genes that correlate with antigen expression. Polymorphic sites associated with antigen expression in the primary literature and reference databases are annotated according to nucleotide positions in cDNA. This makes antigen prediction from next‐generation sequencing data challenging, since it uses genomic coordinates. STUDY DESIGN AND METHODS: The conventional cDNA reference sequences for all known RBC and PLT genes that correlate with antigen expression were aligned to the human reference genome. The alignments allowed conversion of conventional cDNA nucleotide positions to the corresponding genomic coordinates. RBC and PLT antigen prediction was then performed using the human reference genome and whole genome sequencing (WGS) data with serologic confirmation. RESULTS: Some major differences and alignment issues were found when attempting to convert the conventional cDNA to human reference genome sequences for the following genes: ABO, A4GALT, RHD, RHCE, FUT3, ACKR1 (previously DARC), ACHE, FUT2, CR1, GCNT2, and RHAG. However, it was possible to create usable alignments, which facilitated the prediction of all RBC and PLT antigens with a known molecular basis from WGS data. Traditional serologic typing for 18 RBC antigens was in agreement with the WGS‐based antigen predictions, providing proof of principle for this approach. CONCLUSION: Detailed mapping of conventional cDNA annotated RBC and PLT alleles can enable accurate prediction of RBC and PLT antigens from whole genomic sequencing data. PMID:26634332
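The conversion step described in this record, from cDNA nucleotide positions to genomic coordinates, can be illustrated with a minimal sketch. It assumes a gap-free alignment summarised as a list of exon blocks in transcript order, and it treats cDNA position 1 as the first base of the first block. The function name `cdna_to_genomic` and the exon coordinates are hypothetical and do not come from the study, which reports that real genes such as ABO or RHD needed more careful handling.

```python
from typing import List, Tuple

# Hypothetical exon model: (genomic_start, genomic_end), 1-based inclusive,
# listed in transcript (5' to 3') order. Not taken from any real gene.
Exon = Tuple[int, int]


def cdna_to_genomic(pos: int, exons: List[Exon], strand: str = "+") -> int:
    """Map a 1-based cDNA position to a 1-based genomic coordinate."""
    if pos < 1:
        raise ValueError("cDNA positions are 1-based")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    remaining = pos
    for start, end in exons:
        length = end - start + 1
        if remaining <= length:
            # Walk forward on the plus strand, backward on the minus strand.
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")


plus_exons = [(100, 109), (200, 204)]    # two exons on the forward strand
minus_exons = [(500, 509), (300, 304)]   # same lengths, reverse strand

print(cdna_to_genomic(11, plus_exons))         # 200: first base of exon 2
print(cdna_to_genomic(11, minus_exons, "-"))   # 304: first base of exon 2
```

Keeping the exon list in transcript order means the same loop serves both strands; only the direction of the final offset changes.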
Genome Sequencing of Steroid Producing Bacteria Using Ion Torrent Technology and a Reference Genome.

PubMed

Sola-Landa, Alberto; Rodríguez-García, Antonio; Barreiro, Carlos; Pérez-Redondo, Rosario

2017-01-01

Next-generation sequencing technology has greatly eased bacterial genome sequencing, and several tens of thousands of genomes have been sequenced during the last 10 years. Most genome projects are published as draft versions; however, for certain applications the complete genome sequence is required. In this chapter, we describe the strategy that allowed the complete genome sequencing of Mycobacterium neoaurum NRRL B-3805, an industrial strain exploited for steroid production, using Ion Torrent sequencing reads and the genome of a close strain as the reference. This protocol can be applied to analyze the genetic variations between closely related strains; for example, to elucidate the point mutations between a parental strain and a random mutagenesis-derived mutant.
Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes.

PubMed

Nielsen, H Bjørn; Almeida, Mathieu; Juncker, Agnieszka Sierakowska; Rasmussen, Simon; Li, Junhua; Sunagawa, Shinichi; Plichta, Damian R; Gautier, Laurent; Pedersen, Anders G; Le Chatelier, Emmanuelle; Pelletier, Eric; Bonde, Ida; Nielsen, Trine; Manichanh, Chaysavanh; Arumugam, Manimozhiyan; Batto, Jean-Michel; Quintanilha Dos Santos, Marcelo B; Blom, Nikolaj; Borruel, Natalia; Burgdorf, Kristoffer S; Boumezbeur, Fouad; Casellas, Francesc; Doré, Joël; Dworzynski, Piotr; Guarner, Francisco; Hansen, Torben; Hildebrand, Falk; Kaas, Rolf S; Kennedy, Sean; Kristiansen, Karsten; Kultima, Jens Roat; Léonard, Pierre; Levenez, Florence; Lund, Ole; Moumen, Bouziane; Le Paslier, Denis; Pons, Nicolas; Pedersen, Oluf; Prifti, Edi; Qin, Junjie; Raes, Jeroen; Sørensen, Søren; Tap, Julien; Tims, Sebastian; Ussery, David W; Yamada, Takuji; Renault, Pierre; Sicheritz-Ponten, Thomas; Bork, Peer; Wang, Jun; Brunak, Søren; Ehrlich, S Dusko

2014-08-01

Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
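The idea of binning co-abundant genes can be sketched as follows. This is a deliberately simplified stand-in, not a reproduction of the published method: it groups genes greedily around a seed by Pearson correlation of their abundance profiles across samples. The function names, the 0.9 threshold, and the toy profiles are all assumptions chosen for illustration.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / sqrt(var_x * var_y)


def bin_coabundant(profiles: Dict[str, Sequence[float]],
                   threshold: float = 0.9) -> List[List[str]]:
    """Greedily group genes whose profiles correlate with a seed gene."""
    unassigned = list(profiles)
    groups: List[List[str]] = []
    while unassigned:
        seed = unassigned.pop(0)
        group, rest = [seed], []
        for gene in unassigned:
            if pearson(profiles[seed], profiles[gene]) >= threshold:
                group.append(gene)
            else:
                rest.append(gene)
        unassigned = rest
        groups.append(group)
    return groups


# Toy abundances across four samples (hypothetical values).
profiles = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],   # rises with geneA
    "geneC": [4, 3, 2, 1],
    "geneD": [8, 6, 4, 2],   # falls with geneC
}
print(bin_coabundant(profiles))  # [['geneA', 'geneB'], ['geneC', 'geneD']]
```

The intuition is that genes carried on the same genome rise and fall together across samples, so correlated profiles suggest a shared biological entity without consulting any reference sequence.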
The value of new genome references.

PubMed

Worley, Kim C; Richards, Stephen; Rogers, Jeffrey

2017-09-15

Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has dramatically declined and our ability to interpret those data increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest impact contributions that whole genome sequencing and assembly has had on comparative and evolutionary biology. Copyright © 2016. Published by Elsevier Inc.
Are special read alignment strategies necessary and cost-effective when handling sequencing reads from patient-derived tumor xenografts?

PubMed

Tso, Kai-Yuen; Lee, Sau Dan; Lo, Kwok-Wai; Yip, Kevin Y

2014-12-23

Patient-derived tumor xenografts in mice are widely used in cancer research and have become important in developing personalized therapies. When these xenografts are subject to DNA sequencing, the samples could contain various amounts of mouse DNA. It has been unclear how the mouse reads would affect data analyses. We conducted comprehensive simulations to compare three alignment strategies at different mutation rates, read lengths, sequencing error rates, human-mouse mixing ratios and sequenced regions. We also sequenced a nasopharyngeal carcinoma xenograft and a cell line to test how the strategies work on real data. We found the "filtering" and "combined reference" strategies performed better than aligning reads directly to the human reference in terms of alignment and variant calling accuracies. The combined reference strategy was particularly good at reducing false-negative variant calls without significantly increasing the false positive rate. In some scenarios the performance gain of these two special handling strategies was too small for special handling to be cost-effective, but it was found crucial when false non-synonymous SNVs should be minimized, especially in exome sequencing. Our study systematically analyzes the effects of mouse contamination in the sequencing data of human-in-mouse xenografts. Our findings provide information for designing data analysis pipelines for these data.
Transcription of mouse Sp2 yields alternatively spliced and sub-genomic mRNAs in a tissue- and cell-type-specific fashion.

PubMed

Yin, Haifeng; Nichols, Teresa D; Horowitz, Jonathan M

2010-07-01

The Sp-family of transcription factors comprises nine members, Sp1-9, that share a highly conserved DNA-binding domain. Sp2 is a poorly characterized member of this transcription factor family that is widely expressed in murine and human cell lines yet exhibits little DNA-binding or trans-activation activity in these settings. As a prelude to the generation of a "knock-out" mouse strain, we isolated a mouse Sp2 cDNA and performed a detailed analysis of Sp2 transcription in embryonic and adult mouse tissues. We report that (1) the 5' untranslated region of Sp2 is subject to alternative splicing, (2) Sp2 transcription is regulated by at least two promoters that differ in their cell-type specificity, (3) one Sp2 promoter is highly active in nine mammalian cell lines and strains and is regulated by at least five discrete stimulatory and inhibitory elements, (4) a variety of sub-genomic messages are synthesized from the Sp2 locus in a tissue- and cell-type-specific fashion and these transcripts have the capacity to encode a novel partial-Sp2 protein, and (5) RNA in situ hybridization assays indicate that Sp2 is widely expressed during mouse embryogenesis, particularly in the embryonic brain, and robust Sp2 expression occurs in neurogenic regions of the post-natal and adult brain. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Effective PCR-based detection of Naegleria fowleri from cultured sample and PAM-developed mouse.

PubMed

Kang, Heekyoung; Seong, Gi-Sang; Sohn, Hae-Jin; Kim, Jong-Hyun; Lee, Sang-Eun; Park, Mi Yeoun; Lee, Won-Ja; Shin, Ho-Joon

2015-10-01

Increasing numbers of Primary Amoebic Meningoencephalitis (PAM) cases due to Naegleria fowleri are becoming a serious issue in subtropical and tropical countries as a Neglected Tropical Disease (NTD). To establish a rapid and effective diagnostic tool, a PCR-based detection technique was developed based on previous PCR methods. Four kinds of primer pairs, Nfa1, Nae3, Nf-ITS, and Naegl, were employed in the cultured amoebic trophozoites and a mouse with PAM experimentally developed by N. fowleri inoculation (PAM-mouse). For the extraction of genomic DNA from N. fowleri trophozoites (1×10^6), simple boiling with 10 μl of PBS (pH 7.4) at 100°C for 30 min was found to be the most rapid and efficient procedure, allowing amplification of 2.5×10^2 trophozoites using the Nfa1 primer. The primers Nfa1 and Nae3 amplified only N. fowleri DNA, whereas the ITS primer detected N. fowleri and N. gruberi DNA. Using the PAM-mouse brain tissue, the Nfa1 primer was able to amplify the N. fowleri DNA 4 days post infection with 1 ng/μl of genomic DNA being detectable. Using the PAM-mouse CSF, amplification of the N. fowleri DNA with the Nae3 primer was possible 5 days post infection showing a better performance than the Nfa1 primer at day 6. Copyright © 2015 Elsevier GmbH. All rights reserved.
Systemic delivery of shRNA by AAV9 provides highly efficient knockdown of ubiquitously expressed GFP in mouse heart, but not liver.

PubMed

Piras, Bryan A; O'Connor, Daniel M; French, Brent A

2013-01-01

AAV9 is a powerful gene delivery vehicle capable of providing long-term gene expression in a variety of cell types, particularly cardiomyocytes. The use of AAV-delivery for RNA interference is an intense area of research, but a comprehensive analysis of knockdown in cardiac and liver tissues after systemic delivery of AAV9 has yet to be reported. We sought to address this question by using AAV9 to deliver a short-hairpin RNA targeting the enhanced green fluorescent protein (GFP) in transgenic mice that constitutively overexpress GFP in all tissues. The expression cassette was initially tested in vitro and we demonstrated a 61% reduction in mRNA and a 90% reduction in GFP protein in dual-transfected 293 cells. Next, the expression cassette was packaged as single-stranded genomes in AAV9 capsids to test cardiac GFP knockdown with several doses ranging from 1.8×10(10) to 1.8×10(11) viral genomes per mouse and a dose-dependent response was obtained. We then analyzed GFP expression in both heart and liver after delivery of 4.4×10(11) viral genomes per mouse. We found that while cardiac knockdown was highly efficient, with a 77% reduction in GFP mRNA and a 71% reduction in protein versus control-treated mice, there was no change in liver expression. This was despite a 4.5-fold greater number of viral genomes in the liver than in the heart. This study demonstrates that single-stranded AAV9 vectors expressing shRNA can be used to achieve highly efficient cardiac-selective knockdown of GFP expression that is sustained for at least 7 weeks after the systemic injection of 8 day old mice, with no change in liver expression and no evidence of liver damage despite high viral genome presence in the liver.
De novo assembly of a haplotype-resolved human genome.

PubMed

Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

2015-06-01

The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
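The "haplotype N50 of 484 kb" quoted above is a length-weighted median: the block length at which the running total of the longest blocks first reaches half of the summed length. A minimal sketch of that calculation follows, assuming the standard N50 definition; the function name and the example lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or haplotype-block lengths.

    N50 is the length L such that blocks of length >= L together
    account for at least half of the total length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical block lengths in kb (illustration only, not study data).
example_blocks_kb = [900, 484, 300, 200, 100]
print(n50(example_blocks_kb))  # 900 + 484 = 1384 >= 992, so this prints 484
```

Because the statistic is weighted by length, a few long blocks can dominate it, which is why N50 is usually reported together with the total assembled size.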
Complex multi-enhancer contacts captured by Genome Architecture Mapping (GAM)

PubMed Central

Beagrie, Robert A.; Scialdone, Antonio; Schueler, Markus; Kraemer, Dorothee C.A.; Chotalia, Mita; Xie, Sheila Q.; Barbieri, Mariano; de Santiago, Inês; Lavitas, Liron-Mark; Branco, Miguel R.; Fraser, James; Dostie, Josée; Game, Laurence; Dillon, Niall; Edwards, Paul A.W.; Nicodemi, Mario; Pombo, Ana

2017-01-01

Summary: The organization of the genome in the nucleus and the interactions of genes with their regulatory elements are key features of transcriptional control and their disruption can cause disease. We developed a novel genome-wide method, Genome Architecture Mapping (GAM), for measuring chromatin contacts, and other features of three-dimensional chromatin topology, based on sequencing DNA from a large collection of thin nuclear sections. We apply GAM to mouse embryonic stem cells and identify an enrichment for specific interactions between active genes and enhancers across very large genomic distances, using a mathematical model ‘SLICE’ (Statistical Inference of Co-segregation). GAM also reveals an abundance of three-way contacts genome-wide, especially between regions that are highly transcribed or contain super-enhancers, highlighting a previously inaccessible complexity in genome architecture and a major role for gene-expression specific contacts in organizing the genome in mammalian nuclei. PMID:28273065
The genome of the Gulf pipefish enables understanding of evolutionary innovations.

PubMed

Small, C M; Bassham, S; Catchen, J; Amores, A; Fuiten, A M; Brown, R S; Jones, A G; Cresko, W A

2016-12-20

Evolutionary origins of derived morphologies ultimately stem from changes in protein structure, gene regulation, and gene content. A well-assembled, annotated reference genome is a central resource for pursuing these molecular phenomena underlying phenotypic evolution. We explored the genome of the Gulf pipefish (Syngnathus scovelli), which belongs to family Syngnathidae (pipefishes, seahorses, and seadragons). These fishes have dramatically derived bodies and a remarkable novelty among vertebrates, the male brood pouch. We produce a reference genome, condensed into chromosomes, for the Gulf pipefish. Gene losses and other changes have occurred in pipefish hox and dlx clusters and in the tbx and pitx gene families, candidate mechanisms for the evolution of syngnathid traits, including an elongated axis and the loss of ribs, pelvic fins, and teeth. We measure gene expression changes in pregnant versus non-pregnant brood pouch tissue and characterize the genomic organization of duplicated metalloprotease genes (patristacins) recruited into the function of this novel structure. Phylogenetic inference using ultraconserved sequences provides an alternative hypothesis for the relationship between orders Syngnathiformes and Scombriformes. Comparisons of chromosome structure among percomorphs show that chromosome number in a pipefish ancestor became reduced via chromosomal fusions. The collected findings from this first syngnathid reference genome open a window into the genomic underpinnings of highly derived morphologies, demonstrating that de novo production of high quality and useful reference genomes is within reach of even small research groups.
Estimation of genomic prediction accuracy from reference populations with varying degrees of relationship.

PubMed

Lee, S Hong; Clark, Sam; van der Werf, Julius H J

2017-01-01

Genomic prediction is emerging in a wide range of fields including animal and plant breeding, risk prediction in human precision medicine and forensics. It is desirable to establish a theoretical framework for genomic prediction accuracy when the reference data consists of information sources with varying degrees of relationship to the target individuals. A reference set can contain both close and distant relatives as well as 'unrelated' individuals from the wider population in the genomic prediction. The various sources of information were modeled as different populations with different effective population sizes (Ne). Both the effective number of chromosome segments (Me) and Ne are considered to be a function of the data used for prediction. We validate our theory with analyses of simulated as well as real data, and illustrate that the variation in genomic relationships with the target is a predictor of the information content of the reference set. With a similar amount of data available for each source, we show that close relatives can have a substantially larger effect on genomic prediction accuracy than less closely related individuals. We also illustrate that when prediction relies on closer relatives, there is less improvement in prediction accuracy with an increase in training data or marker panel density. We release software that can estimate the expected prediction accuracy and power when combining different reference sources with various degrees of relationship to the target, which is useful when planning genomic prediction (before or after collecting data) in animal, plant and human genetics.
Rational Design of Mouse Models for Cancer Research.

PubMed

Landgraf, Marietta; McGovern, Jacqui A; Friedl, Peter; Hutmacher, Dietmar W

2018-03-01

The laboratory mouse is widely considered a valid and affordable model organism to study human disease. Attempts to improve the relevance of murine models for the investigation of human pathologies led to the development of various genetically engineered, xenograft and humanized mouse models. Nevertheless, most preclinical studies in mice suffer from insufficient predictive value when compared with cancer biology and therapy response of human patients. We propose an innovative strategy to improve the predictive power of preclinical cancer models. Combining (i) genomic, tissue engineering and regenerative medicine approaches for rational design of mouse models with (ii) rapid prototyping and computational benchmarking against human clinical data will enable fast and nonbiased validation of newly generated models. Copyright © 2017 Elsevier Ltd. All rights reserved.

New in-depth rainbow trout transcriptome reference and digital atlas of gene expression

USDA-ARS's Scientific Manuscript database

Sequencing the rainbow trout genome is underway and a transcriptome reference sequence is required to help in genome assembly and gene discovery. Previously, we reported a transcriptome reference sequence using a 19X coverage of 454-pyrosequencing data. Although this work added a great wealth of ann...
HuMiChip: Development of a Functional Gene Array for the Study of Human Microbiomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tu, Q.; Deng, Ye; Lin, Lu

Microbiomes play very important roles in terms of nutrition, health and disease by interacting with their hosts. Based on sequence data currently available in public domains, we have developed a functional gene array to monitor both organismal and functional gene profiles of normal microbiota in human and mouse hosts, and such an array is called human and mouse microbiota array, HMM-Chip. First, seed sequences were identified from KEGG databases, and used to construct a seed database (seedDB) containing 136 gene families in 19 metabolic pathways closely related to human and mouse microbiomes. Second, a mother database (motherDB) was constructed with 81 genomes of bacterial strains with 54 from gut and 27 from oral environments, and 16 metagenomes, and used for selection of genes and probe design. Gene prediction was performed by Glimmer3 for bacterial genomes, and by the Metagene program for metagenomes. In total, 228,240 and 801,599 genes were identified for bacterial genomes and metagenomes, respectively. Then the motherDB was searched against the seedDB using the HMMer program, and gene sequences in the motherDB that were highly homologous with seed sequences in the seedDB were used for probe design by the CommOligo software. Different degrees of specific probes, including gene-specific, inclusive and exclusive group-specific probes were selected. All candidate probes were checked against the motherDB and NCBI databases for specificity. Finally, 7,763 probes covering 91.2% (12,601 out of 13,814) HMMer confirmed sequences from 75 bacterial genomes and 16 metagenomes were selected. This developed HMM-Chip is able to detect the diversity and abundance of functional genes, the gene expression of microbial communities, and potentially, the interactions of microorganisms and their hosts.
HnRNP A3 genes and pseudogenes in the vertebrate genomes.

PubMed

Makeyev, Aleksandr V; Kim, Chang Bae; Ruddle, Frank H; Enkhmandakh, Badam; Erdenechimeg, Lkhamsuren; Bayarsaihan, Dashzeveg

2005-04-01

The hnRNP A/B type proteins are abundant nuclear factors that bind to Pol II transcripts and are involved in numerous RNA-related activities. To date most data on the hnRNP A/B family have been obtained with recombinant proteins and cell cultures. Further characterization can result from an examination of the impact of various modifications in intact functional loci; however, such characterization is hampered by the presence of numerous and widely dispersed hnRNP A/B-related sequences in the mammalian genome. We have found hnRNP A3, a poorly recognized member of the hnRNP A/B family, among candidate transcription factors that interact with the regulatory region of the Hoxc8 gene and screened the human and mouse genomes for genes that encode hnRNP A3. We demonstrate that the sequence reported previously as the human hnRNP A3 gene (Accession number S63912) and located on 10p11.1 belongs to a processed pseudogene of the functional intron-containing locus HNRPA3, which we have identified on 2q31.2. We have also identified its murine orthologs on mouse chromosome 2D and rat chromosome 3q23. Alternative splices were revealed at the N-terminus and in the middle of hnRNP A3. 14 and 28 additional loci in the human and mouse genome, respectively, were mapped and identified as hnRNP A3 processed pseudogenes. In addition, we have found and compared hnRNP A3 orthologous genes in Gallus gallus, Xenopus tropicalis, and Danio rerio. The present in silico analysis serves as a necessary step toward a further functional characterization of hnRNP A3. (c) 2005 Wiley-Liss, Inc.
Microarray Analysis of LTR Retrotransposon Silencing Identifies Hdac1 as a Regulator of Retrotransposon Expression in Mouse Embryonic Stem Cells

PubMed Central

Madej, Monika J.; Taggart, Mary; Gautier, Philippe; Garcia-Perez, Jose Luis; Meehan, Richard R.; Adams, Ian R.

2012-01-01

Retrotransposons are highly prevalent in mammalian genomes due to their ability to amplify in pluripotent cells or developing germ cells. Host mechanisms that silence retrotransposons in germ cells and pluripotent cells are important for limiting the accumulation of the repetitive elements in the genome during evolution. However, although silencing of selected individual retrotransposons can be relatively well-studied, many mammalian retrotransposons are seldom analysed and their silencing in germ cells, pluripotent cells or somatic cells remains poorly understood. Here we show, and experimentally verify, that cryptic repetitive element probes present in Illumina and Affymetrix gene expression microarray platforms can accurately and sensitively monitor repetitive element expression data. This computational approach to genome-wide retrotransposon expression has allowed us to identify the histone deacetylase Hdac1 as a component of the retrotransposon silencing machinery in mouse embryonic stem cells, and to determine the retrotransposon targets of Hdac1 in these cells. We also identify retrotransposons that are targets of other retrotransposon silencing mechanisms such as DNA methylation, Eset-mediated histone modification, and Ring1B/Eed-containing polycomb repressive complexes in mouse embryonic stem cells. Furthermore, our computational analysis of retrotransposon silencing suggests that multiple silencing mechanisms are independently targeted to retrotransposons in embryonic stem cells, that different genomic copies of the same retrotransposon can be differentially sensitive to these silencing mechanisms, and helps define retrotransposon sequence elements that are targeted by silencing machineries. Thus repeat annotation of gene expression microarray data suggests that a complex interplay between silencing mechanisms represses retrotransposon loci in germ cells and embryonic stem cells. PMID:22570599
Workplace Ergonomics Reference Guide

MedlinePlus

... between use of keyboard and mouse (use keystroke equivalents to mouse).  Change your posture frequently throughout the ... walls should be removed.  The carpeting should be non-absorbent in warm, dark colors without padding or ...
Human Hrs, a tyrosine kinase substrate in growth factor-stimulated cells: cDNA cloning and mapping of the gene to chromosome 17.

PubMed

Lu, L; Komada, M; Kitamura, N

1998-06-15

Hrs is a 115kDa zinc finger protein which is rapidly tyrosine phosphorylated in cells stimulated with various growth factors. We previously purified the protein from a mouse cell line and cloned its cDNA. In the present study, we cloned a human Hrs cDNA from a human placenta cDNA library by cross-hybridization, using the mouse cDNA as a probe, and determined its nucleotide sequence. The human Hrs cDNA encoded a 777-amino-acid protein whose sequence was 93% identical to that of mouse Hrs. Northern blot analysis showed that the Hrs mRNA was about 3.0kb long and was expressed in all the human adult and fetal tissues tested. In addition, we showed by genomic Southern blot analysis that the human Hrs gene was a single-copy gene with a size of about 20kb. Furthermore, the human Hrs gene was mapped to chromosome 17 by Southern blotting of genomic DNAs from human/rodent somatic cell hybrids. Copyright 1998 Elsevier Science B.V. All rights reserved.
Eccentric localization of catalase to protect chromosomes from oxidative damages during meiotic maturation in mouse oocytes.

PubMed

Park, Yong Seok; You, Seung Yeop; Cho, Sungrae; Jeon, Hyuk-Joon; Lee, Sukchan; Cho, Dong-Hyung; Kim, Jae-Sung; Oh, Jeong Su

2016-09-01

The maintenance of genomic integrity and stability is essential for the survival of every organism. Unfortunately, DNA is vulnerable to attack by a variety of damaging agents. Oxidative stress is a major cause of DNA damage because reactive oxygen species (ROS) are produced as by-products of normal cellular metabolism. Cells have developed eloquent antioxidant defense systems to protect themselves from oxidative damage along with aerobic metabolism. Here, we show that catalase (CAT) is present in mouse oocytes to protect the genome from oxidative damage during meiotic maturation. CAT was expressed in the nucleus to form unique vesicular structures. However, after nuclear envelope breakdown, CAT was redistributed in the cytoplasm with particular focus at the chromosomes. Inhibition of CAT activity increased endogenous ROS levels, but did not perturb meiotic maturation. In addition, CAT inhibition produced chromosomal defects, including chromosome misalignment and DNA damage. Therefore, our data suggest that CAT is required not only to scavenge ROS, but also to protect DNA from oxidative damage during meiotic maturation in mouse oocytes.
Deletion of ultraconserved elements yields viable mice

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahituv, Nadav; Zhu, Yiwen; Visel, Axel

2007-07-15

Ultraconserved elements have been suggested to retain extended perfect sequence identity between the human, mouse, and rat genomes due to essential functional properties. To investigate the necessities of these elements in vivo, we removed four non-coding ultraconserved elements (ranging in length from 222 to 731 base pairs) from the mouse genome. To maximize the likelihood of observing a phenotype, we chose to delete elements that function as enhancers in a mouse transgenic assay and that are near genes that exhibit marked phenotypes both when completely inactivated in the mouse as well as when their expression is altered due to other genomic modifications. Remarkably, all four resulting lines of mice lacking these ultraconserved elements were viable and fertile, and failed to reveal any critical abnormalities when assayed for a variety of phenotypes including growth, longevity, pathology and metabolism. In addition more targeted screens, informed by the abnormalities observed in mice where genes in proximity to the investigated elements had been altered, also failed to reveal notable abnormalities. These results, while not inclusive of all the possible phenotypic impact of the deleted sequences, indicate that extreme sequence constraint does not necessarily reflect crucial functions required for viability.
Genomic organization and sequence of the Gus-s^a allele of the murine β-glucuronidase gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Funkenstein, B.; Leary, S.L.; Stein, J.C.

1988-03-01

The Gus-s^a allele of the mouse β-glucuronidase gene exhibits a high degree of inducibility by androgens due to its linkage with the Gus-r^a regulatory locus. The authors isolated Gus-s^a on a 28-kilobase pair fragment of mouse chromosome 5 and found that it contains 12 exons and 11 intervening sequences spanning 14 kilobase pairs of this genomic segment. The mRNA cap site was identified by ribonuclease protection and primer extension analyses which revealed an unusually short 5' noncoding sequence of 12 nucleotides. Proximal regulatory sequences in the 5'-flanking DNA and the complete sequence of the Gus-s^a mRNA transcript were also determined. Comparison of the amino acid sequence determined from the Gus-s^a nucleotide sequence with that of human β-glucuronidase indicated that the two human mRNA species differ due to alternate splicing of an exon homologous to exon 6 of the mouse gene.
Rodent models in Down syndrome research: impact and future opportunities

PubMed Central

2017-01-01

Down syndrome is caused by trisomy of chromosome 21. To date, a multiplicity of mouse models with Down-syndrome-related features has been developed to understand this complex human chromosomal disorder. These mouse models have been important for determining genotype-phenotype relationships and identification of dosage-sensitive genes involved in the pathophysiology of the condition, and in exploring the impact of the additional chromosome on the whole genome. Mouse models of Down syndrome have also been used to test therapeutic strategies. Here, we provide an overview of research in the last 15 years dedicated to the development and application of rodent models for Down syndrome. We also speculate on possible and probable future directions of research in this fast-moving field. As our understanding of the syndrome improves and genome engineering technologies evolve, it is necessary to coordinate efforts to make all Down syndrome models available to the community, to test therapeutics in models that replicate the whole trisomy and design new animal models to promote further discovery of potential therapeutic targets. PMID:28993310
Rodent models in Down syndrome research: impact and future opportunities.

PubMed

Herault, Yann; Delabar, Jean M; Fisher, Elizabeth M C; Tybulewicz, Victor L J; Yu, Eugene; Brault, Veronique

2017-10-01

Down syndrome is caused by trisomy of chromosome 21. To date, a multiplicity of mouse models with Down-syndrome-related features has been developed to understand this complex human chromosomal disorder. These mouse models have been important for determining genotype-phenotype relationships and identification of dosage-sensitive genes involved in the pathophysiology of the condition, and in exploring the impact of the additional chromosome on the whole genome. Mouse models of Down syndrome have also been used to test therapeutic strategies. Here, we provide an overview of research in the last 15 years dedicated to the development and application of rodent models for Down syndrome. We also speculate on possible and probable future directions of research in this fast-moving field. As our understanding of the syndrome improves and genome engineering technologies evolve, it is necessary to coordinate efforts to make all Down syndrome models available to the community, to test therapeutics in models that replicate the whole trisomy and design new animal models to promote further discovery of potential therapeutic targets. © 2017. Published by The Company of Biologists Ltd.
[The role of metabolic activation of promutagens in the genome destabilization under pheromonal stress in the house mouse (Mus musculus)].

PubMed

Zhuk, A S; Stepchenkova, E I; Dukel'skaia, A V; Daev, E V; Inge-Vechtomov, S G

2011-10-01

The hypothesis on a relationship between the high frequency of mitotic disturbances in bone marrow cells and the change in the activity of the S9 liver fraction containing promutagen-activating enzymes under olfactory stress in the house mouse Mus musculus has been tested. For this purpose, the effect of the pheromone 2,5-dimethylpyrazine on the frequency of mitotic disturbances in mouse bone marrow cells has been measured by the anaphase-telophase assay. The Ames test using Salmonella typhimurium has been employed to compare the capacities of the S9 liver fractions from stressed and intact mice for activating the promutagen 2-aminofluorene. It has been demonstrated that the increased frequency of mitotic disturbances in bone marrow cells induced by the pheromonal stressor in male house mice is accompanied by an increased promutagen-activating capacity of the S9 liver fraction. The model system used in the study allowed the genetic consequences of the exposure to the olfactory stressor to be estimated and the possible mechanisms of genome destabilization to be assumed.
Disease model curation improvements at Mouse Genome Informatics

PubMed Central

Bello, Susan M.; Richardson, Joel E.; Davis, Allan P.; Wiegers, Thomas C.; Mattingly, Carolyn J.; Dolan, Mary E.; Smith, Cynthia L.; Blake, Judith A.; Eppig, Janan T.

2012-01-01

Optimal curation of human diseases requires an ontology or structured vocabulary that contains terms familiar to end users, is robust enough to support multiple levels of annotation granularity, is limited to disease terms and is stable enough to avoid extensive reannotation following updates. At Mouse Genome Informatics (MGI), we currently use disease terms from Online Mendelian Inheritance in Man (OMIM) to curate mouse models of human disease. While OMIM provides highly detailed disease records that are familiar to many in the medical community, it lacks structure to support multilevel annotation. To improve disease annotation at MGI, we evaluated the merged Medical Subject Headings (MeSH) and OMIM disease vocabulary created by the Comparative Toxicogenomics Database (CTD) project. Overlaying MeSH onto OMIM provides hierarchical access to broad disease terms, a feature missing from the OMIM. We created an extended version of the vocabulary to meet the genetic disease-specific curation needs at MGI. Here we describe our evaluation of the CTD application, the extensions made by MGI and discuss the strengths and weaknesses of this approach. Database URL: http://www.informatics.jax.org/ PMID:22434831
The Mosaic Ancestry of the Drosophila Genetic Reference Panel and the D. melanogaster Reference Genome Reveals a Network of Epistatic Fitness Interactions.

PubMed

Pool, John E

2015-12-01

North American populations of Drosophila melanogaster derive from both European and African source populations, but despite their importance for genetic research, patterns of ancestry along their genomes are largely undocumented. Here, I infer geographic ancestry along genomes of the Drosophila Genetic Reference Panel (DGRP) and the D. melanogaster reference genome, which may have implications for reference alignment, association mapping, and population genomic studies in Drosophila. Overall, the proportion of African ancestry was estimated to be 20% for the DGRP and 9% for the reference genome. Combining my estimate of admixture timing with historical records, I provide the first estimate of natural generation time for this species (approximately 15 generations per year). Ancestry levels were found to vary strikingly across the genome, with less African introgression on the X chromosome, in regions of high recombination, and at genes involved in specific processes (e.g., circadian rhythm). An important role for natural selection during the admixture process was further supported by evidence that many unlinked pairs of loci showed a deficiency of Africa-Europe allele combinations between them. Numerous epistatic fitness interactions may therefore exist between African and European genotypes, leading to ongoing selection against incompatible variants. By focusing on hubs in this network of fitness interactions, I identified a set of interacting loci that include genes with roles in sensation and neuropeptide/hormone reception. These findings suggest that admixed D. melanogaster samples could become an important study system for the genetics of early-stage isolation between populations. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Linkage maps of the Atlantic salmon (Salmo salar) genome derived from RAD sequencing

PubMed Central

2014-01-01

Background: Genetic linkage maps are useful tools for mapping quantitative trait loci (QTL) influencing variation in traits of interest in a population. Genotyping-by-sequencing approaches such as Restriction-site Associated DNA sequencing (RAD-Seq) now enable the rapid discovery and genotyping of genome-wide SNP markers suitable for the development of dense SNP linkage maps, including in non-model organisms such as Atlantic salmon (Salmo salar). This paper describes the development and characterisation of a high density SNP linkage map based on SbfI RAD-Seq SNP markers from two Atlantic salmon reference families. Results: Approximately 6,000 SNPs were assigned to 29 linkage groups, utilising markers from known genomic locations as anchors. Linkage maps were then constructed for the four mapping parents separately. Overall map lengths were comparable between male and female parents, but the distribution of the SNPs showed sex-specific patterns with a greater degree of clustering of sire-segregating SNPs to single chromosome regions. The maps were integrated with the Atlantic salmon draft reference genome contigs, allowing the unique assignment of ~4,000 contigs to a linkage group. 112 genome contigs mapped to two or more linkage groups, highlighting regions of putative homeology within the salmon genome. A comparative genomics analysis with the stickleback reference genome identified putative genes closely linked to approximately half of the ordered SNPs and demonstrated blocks of orthology between the Atlantic salmon and stickleback genomes. A subset of 47 RAD-Seq SNPs were successfully validated using a high-throughput genotyping assay, with a correspondence of 97% between the two assays. Conclusions: This Atlantic salmon RAD-Seq linkage map is a resource for salmonid genomics research as genotyping-by-sequencing becomes increasingly common. This is aided by the integration of the SbfI RAD-Seq SNPs with existing reference maps and the draft reference genome, as well as the identification of putative genes proximal to the SNPs. Differences in the distribution of recombination events between the sexes are evident, and regions of homeology have been identified which are reflective of the recent salmonid whole genome duplication. PMID:24571138
A Perfect Match Genomic Landscape Provides a Unified Framework for the Precise Detection of Variation in Natural and Synthetic Haploid Genomes

PubMed Central

Palacios-Flores, Kim; García-Sotelo, Jair; Castillo, Alejandra; Uribe, Carina; Aguilar, Luis; Morales, Lucía; Gómez-Romero, Laura; Reyes, José; Garciarubio, Alejandro; Boege, Margareta; Dávila, Guillermo

2018-01-01

We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository. PMID:29367403
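The perfect-match idea in this abstract can be illustrated with a toy example: count, for every position of the reference, how many times the k-mer starting there occurs exactly in the query reads, then look for stretches where the count falls to zero. The sketch below is a simplified illustration under that assumption only. It is not the authors' PMGL pipeline, and the function names, sequences, and parameter values are all hypothetical.

```python
from collections import Counter


def perfect_match_landscape(reference, reads, k):
    """For each reference position, count exact occurrences in the reads
    of the k-mer that starts at that position."""
    read_kmers = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            read_kmers[read[i:i + k]] += 1
    # Counter returns 0 for k-mers that were never seen in the reads.
    return [read_kmers[reference[i:i + k]]
            for i in range(len(reference) - k + 1)]


def zero_runs(landscape):
    """Return half-open (start, end) intervals where the landscape is zero."""
    runs, start = [], None
    for i, depth in enumerate(landscape):
        if depth == 0 and start is None:
            start = i
        elif depth != 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(landscape)))
    return runs


# Hypothetical toy data: the query differs from the reference by one
# substitution at index 7 (G -> C).
reference = "ACGTACGGTCAGTTCA"
query = "ACGTACGCTCAGTTCA"
k = 4
reads = [query[i:i + 8] for i in range(len(query) - 8 + 1)]  # error-free 8-mers

landscape = perfect_match_landscape(reference, reads, k)
print(zero_runs(landscape))  # prints [(4, 8)]
```

In this toy case the zero run covers the k reference k-mers that overlap the substituted base, so the last position of the run (index 7) is the variant site. Locating the signature and then characterizing the variant are separate steps, which mirrors the decoupling the abstract describes.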
Challenges imposed by minor reference alleles on the identification and reporting of clinical variants from exome data.

PubMed

Koko, Mahmoud; Abdallah, Mohammed O E; Amin, Mutaz; Ibrahim, Muntaser

2018-01-15

The conventional variant calling of pathogenic alleles in exome and genome sequencing requires the presence of the non-pathogenic alleles as genome references. This hinders the correct identification of variants with minor and/or pathogenic reference alleles warranting additional approaches for variant calling. More than 26,000 Exome Aggregation Consortium (ExAC) variants have a minor reference allele including variants with known ClinVar disease alleles. For instance, in a number of variants related to clotting disorders, the phenotype-associated allele is a human genome reference allele (rs6025, rs6003, rs1799983, and rs2227564 using the assembly hg19). We highlighted how the current variant calling standards miss homozygous reference disease variants in these sites and provided a bioinformatic panel that can be used to screen these variants using commonly available variant callers. We present exome sequencing results from an individual with venous thrombosis to emphasize how pathogenic alleles in clinically relevant variants escape variant calling while non-pathogenic alleles are detected. This article highlights the importance of specialized variant calling strategies in clinical variants with minor reference alleles especially in the context of personal genomes and exomes. We provide here a simple strategy to screen potential disease-causing variants when present in homozygous reference state.
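The screening idea in this abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' bioinformatic panel: the rsIDs are the four named in the abstract, while the function name, the genotype dictionary, and the sample genotypes are hypothetical.

```python
# Hypothetical illustration: flag sites where the reference allele itself is the
# phenotype-associated allele and the sample is homozygous reference.

# rsIDs named in the abstract (hg19); the genotype data below are invented.
REFERENCE_RISK_SITES = {"rs6025", "rs6003", "rs1799983", "rs2227564"}


def homozygous_reference_hits(genotypes, risk_sites=REFERENCE_RISK_SITES):
    """Return the sorted rsIDs in risk_sites whose genotype is homozygous reference.

    genotypes maps rsID -> VCF-style genotype string such as "0/0" or "0|1".
    Sites absent from the mapping are skipped, because a missing record may mean
    either "no coverage" or "matches reference" and cannot be resolved here.
    """
    hits = []
    for rsid in risk_sites:
        gt = genotypes.get(rsid)
        if gt is None:
            continue
        alleles = gt.replace("|", "/").split("/")
        if all(allele == "0" for allele in alleles):
            hits.append(rsid)
    return sorted(hits)


sample = {"rs6025": "0/0", "rs6003": "0/1", "rs1799983": "0|0"}
print(homozygous_reference_hits(sample))  # ['rs1799983', 'rs6025']
```

The sketch assumes the caller emits genotype records for every site, including homozygous-reference ones; a caller that reports only non-reference sites would leave the dictionary empty at exactly the positions of interest.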
HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads

PubMed Central

Li, Pinghao; Jiang, Xiaoqian; Wang, Shuang; Kim, Jihoon; Xiong, Hongkai; Ohno-Machado, Lucila

2014-01-01

Background and objective: Short-read sequencing is becoming the standard of practice for the study of structural variants associated with disease. However, with the growth of sequence data largely surpassing reasonable storage capability, the biomedical community is challenged with the management, transfer, archiving, and storage of sequence data. Methods: We developed Hierarchical mUlti-reference Genome cOmpression (HUGO), a novel compression algorithm for aligned reads in the sorted Sequence Alignment/Map (SAM) format. We first aligned short reads against a reference genome and stored exactly mapped reads for compression. For the inexact mapped or unmapped reads, we realigned them against different reference genomes using an adaptive scheme by gradually shortening the read length. Regarding the base quality value, we offer lossy and lossless compression mechanisms. The lossy compression mechanism for the base quality values uses k-means clustering, where a user can adjust the balance between decompression quality and compression rate. The lossless compression can be produced by setting k (the number of clusters) to the number of different quality values. Results: The proposed method produced a compression ratio in the range 0.5–0.65, which corresponds to 35–50% storage savings based on experimental datasets. The proposed approach achieved 15% more storage savings over CRAM and comparable compression ratio with Samcomp (CRAM and Samcomp are two of the state-of-the-art genome compression algorithms). The software is freely available at https://sourceforge.net/projects/hierachicaldnac/ under the General Public License (GPL). Limitation: Our method requires having different reference genomes and prolongs the execution time for additional alignments. Conclusions: The proposed multi-reference-based compression algorithm for aligned reads outperforms existing single-reference based algorithms. PMID:24368726
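The quality-value step described above can be illustrated with a generic one-dimensional k-means (Lloyd's algorithm). This is a sketch under stated assumptions, not HUGO's code: the function names, the deterministic initialisation, and the example quality scores are all hypothetical. It shows why setting k to the number of distinct quality values reproduces the input exactly.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar values into k groups and return the k centroids."""
    distinct = sorted(set(values))
    n = len(distinct)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of distinct values")
    # Deterministic start (an assumption): spread centroids over the distinct values.
    if k == 1:
        centroids = [float(distinct[n // 2])]
    else:
        centroids = [float(distinct[i * (n - 1) // (k - 1)]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        centroids = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def quantize(values, centroids):
    """Replace every value by its nearest centroid (the lossy step)."""
    return [min(centroids, key=lambda c: abs(v - c)) for v in values]


quals = [40, 40, 38, 37, 20, 12, 11, 40, 39, 2]  # invented Phred-like scores

lossy = quantize(quals, kmeans_1d(quals, 3))
lossless = quantize(quals, kmeans_1d(quals, len(set(quals))))

print(len(set(lossy)) <= 3)  # at most three distinct symbols remain
print(lossless == quals)     # True: one centroid per distinct value
```

A smaller k leaves fewer distinct symbols to encode at the cost of quality-value fidelity, which is the trade-off the abstract says the user can adjust.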
Ultrafast Comparison of Personal Genomes via Precomputed Genome Fingerprints.

PubMed

Glusman, Gustavo; Mauldin, Denise E; Hood, Leroy E; Robinson, Max

2017-01-01

We present an ultrafast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into "genome fingerprints" via locality sensitive hashing. The resulting genome fingerprints can be meaningfully compared even when the input data were obtained using different sequencing technologies, processed using different pipelines, represented in different data formats and relative to different reference versions. Furthermore, genome fingerprints are robust to up to 30% missing data. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. For example, we could compute all-against-all pairwise comparisons among the 2504 genomes in the 1000 Genomes data set in 67 s at high quality (21 μs per comparison, on a single processor), and achieved a lower quality approximation in just 11 s. Efficient computation enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative sequenced genomes in a set, population reconstruction, and many others. The original genome representation cannot be reconstructed from its fingerprint, effectively decoupling genome comparison from genome interpretation; the method thus has significant implications for privacy-preserving genome analytics.
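The timing figures quoted in this abstract can be checked with a short calculation. The sketch assumes that "all-against-all" means every unordered pair of the 2504 genomes is compared once; the abstract does not say so explicitly, though the result agrees with the quoted 21 μs.

```python
from math import comb

genomes = 2504
pairs = comb(genomes, 2)  # unordered pairs, assuming each pair is compared once


def microseconds_per_comparison(total_seconds, n_pairs=pairs):
    """Average time per pairwise comparison, in microseconds."""
    return total_seconds / n_pairs * 1e6


print(pairs)                                     # 3133756
print(round(microseconds_per_comparison(67), 1)) # 21.4 (high-quality run)
print(round(microseconds_per_comparison(11), 1)) # 3.5 (lower-quality run)
```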
Identification and validation of suitable reference genes for RT-qPCR analysis in mouse testis development.

PubMed

Gong, Zu-Kang; Wang, Shuang-Jie; Huang, Yong-Qi; Zhao, Rui-Qiang; Zhu, Qi-Fang; Lin, Wen-Zhen

2014-12-01

RT-qPCR is a commonly used method for evaluating gene expression; however, its accuracy and reliability are dependent upon the choice of appropriate reference gene(s), and there is limited information available on suitable reference gene(s) that can be used in mouse testis at different stages. In this study, using the RT-qPCR method, we investigated the expression variations of six reference genes representing different functional classes (Actb, Gapdh, Ppia, Tbp, Rps29, Hprt1) in mice testis during embryonic and postnatal development. The expression stabilities of putative reference genes were evaluated using five algorithms: geNorm, NormFinder, Bestkeeper, the comparative delta C(t) method and integrated tool RefFinder. Analysis of the results showed that Ppia, Gapdh and Actb were identified as the most stable genes and the geometric mean of Ppia, Gapdh and Actb constitutes an appropriate normalization factor for gene expression studies. The mRNA expression of AT1 as a test gene of interest varied depending upon which of the reference gene(s) was used as an internal control(s). This study suggested that Ppia, Gapdh and Actb are suitable reference genes among the six genes used for RT-qPCR normalization and provide crucial information for transcriptional analyses in future studies of gene expression in the developing mouse testis.
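The normalization factor mentioned above, the geometric mean of the three stable reference genes, can be sketched as follows. The relative quantities are invented for illustration, and the step that converts raw Ct values into relative quantities is not described in the abstract and is left out.

```python
from statistics import geometric_mean


def normalization_factor(reference_quantities):
    """Geometric mean of the reference genes' relative quantities in one sample."""
    return geometric_mean(reference_quantities.values())


def normalized_expression(target_quantity, reference_quantities):
    """Target-gene relative quantity divided by the normalization factor."""
    return target_quantity / normalization_factor(reference_quantities)


# Invented relative quantities for a single sample.
references = {"Ppia": 1.0, "Gapdh": 2.0, "Actb": 4.0}

nf = normalization_factor(references)         # cube root of 1 * 2 * 4, about 2.0
print(round(nf, 6))
print(round(normalized_expression(3.0, references), 6))  # about 1.5
```

Using a geometric rather than arithmetic mean keeps one highly abundant reference gene from dominating the factor.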

Human Contamination in Public Genome Assemblies.

PubMed

Kryukov, Kirill; Imanishi, Tadashi

2016-01-01

Contamination in genome assembly can lead to wrong or confusing results when such a genome is used as a reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination has received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that with high confidence originate as contamination from human DNA. The majority of contaminating human sequences were present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes should be revised to remove contaminated sequence, and that new assemblies should be thoroughly checked for presence of human DNA before submitting them to public databases.
The importance of information on relatives for the prediction of genomic breeding values and the implications for the makeup of reference data sets in livestock breeding schemes.

PubMed

Clark, Samuel A; Hickey, John M; Daetwyler, Hans D; van der Werf, Julius H J

2012-02-09

The theory of genomic selection is based on the prediction of the effects of genetic markers in linkage disequilibrium with quantitative trait loci. However, genomic selection also relies on relationships between individuals to accurately predict genetic value. This study aimed to examine the importance of information on relatives versus that of unrelated or more distantly related individuals on the estimation of genomic breeding values. Simulated and real data were used to examine the effects of various degrees of relationship on the accuracy of genomic selection. Genomic Best Linear Unbiased Prediction (gBLUP) was compared to two pedigree based BLUP methods, one with a shallow one generation pedigree and the other with a deep ten generation pedigree. The accuracy of estimated breeding values for different groups of selection candidates that had varying degrees of relationships to a reference data set of 1750 animals was investigated. The gBLUP method predicted breeding values more accurately than BLUP. The most accurate breeding values were estimated using gBLUP for closely related animals. Similarly, the pedigree based BLUP methods were also accurate for closely related animals, however when the pedigree based BLUP methods were used to predict unrelated animals, the accuracy was close to zero. In contrast, gBLUP breeding values, for animals that had no pedigree relationship with animals in the reference data set, allowed substantial accuracy. An animal's relationship to the reference data set is an important factor for the accuracy of genomic predictions. Animals that share a close relationship to the reference data set had the highest accuracy from genomic predictions. However a baseline accuracy that is driven by the reference data set size and the overall population effective population size enables gBLUP to estimate a breeding value for unrelated animals within a population (breed), using information previously ignored by pedigree based BLUP methods.
Leishmania naiffi and Leishmania guyanensis reference genomes highlight genome structure and gene evolution in the Viannia subgenus

PubMed Central

Coughlan, Simone; Taylor, Ali Shirley; Feane, Eoghan; Sanders, Mandy; Schonian, Gabriele; Cotton, James A.

2018-01-01

The unicellular protozoan parasite Leishmania causes the neglected tropical disease leishmaniasis, affecting 12 million people in 98 countries. In South America, where the Viannia subgenus predominates, so far only L. (Viannia) braziliensis and L. (V.) panamensis have been sequenced, assembled and annotated as reference genomes. Addressing this deficit in molecular information can inform species typing, epidemiological monitoring and clinical treatment. Here, L. (V.) naiffi and L. (V.) guyanensis genomic DNA was sequenced to assemble these two genomes as draft references from short sequence reads. The methods used were tested using short sequence reads for L. braziliensis M2904 against its published reference as a comparison. This assembly and annotation pipeline identified 70 additional genes not annotated on the original M2904 reference. Phylogenetic and evolutionary comparisons of L. guyanensis and L. naiffi with 10 other Viannia genomes revealed four traits common to all Viannia: aneuploidy, 22 orthologous groups of genes absent in other Leishmania subgenera, elevated TATE transposon copies and a high NADH-dependent fumarate reductase gene copy number. Within the Viannia, there were limited structural changes in genome architecture specific to individual species: a 45 Kb amplification on chromosome 34 was present in all bar L. lainsoni, L. naiffi had a higher copy number of the virulence factor leishmanolysin, and laboratory isolate L. shawi M8408 had a possible minichromosome derived from the 3’ end of chromosome 34. This combination of genome assembly, phylogenetics and comparative analysis across an extended panel of diverse Viannia has uncovered new insights into the origin and evolution of this subgenus and can help improve diagnostics for leishmaniasis surveillance. PMID:29765675
Improved maize reference genome with single-molecule technologies

USDA-ARS's Scientific Manuscript database

Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate elucidation of biological processes and support translation of research findings into improved and sustainable agricultural technolog...
Genotype Imputation with Thousands of Genomes

PubMed Central

Howie, Bryan; Marchini, Jonathan; Stephens, Matthew

2011-01-01

Genotype imputation is a statistical technique that is often used to increase the power and resolution of genetic association studies. Imputation methods work by using haplotype patterns in a reference panel to predict unobserved genotypes in a study dataset, and a number of approaches have been proposed for choosing subsets of reference haplotypes that will maximize accuracy in a given study population. These panel selection strategies become harder to apply and interpret as sequencing efforts like the 1000 Genomes Project produce larger and more diverse reference sets, which led us to develop an alternative framework. Our approach is built around a new approximation that uses local sequence similarity to choose a custom reference panel for each study haplotype in each region of the genome. This approximation makes it computationally efficient to use all available reference haplotypes, which allows us to bypass the panel selection step and to improve accuracy at low-frequency variants by capturing unexpected allele sharing among populations. Using data from HapMap 3, we show that our framework produces accurate results in a wide range of human populations. We also use data from the Malaria Genetic Epidemiology Network (MalariaGEN) to provide recommendations for imputation-based studies in Africa. We demonstrate that our approximation improves efficiency in large, sequence-based reference panels, and we discuss general computational strategies for modern reference datasets. Genome-wide association studies will soon be able to harness the power of thousands of reference genomes, and our work provides a practical way for investigators to use this rich information. New methodology from this study is implemented in the IMPUTE2 software package. PMID:22384356
Novel mouse model recapitulates genome and transcriptome alterations in human colorectal carcinomas.

PubMed

McNeil, Nicole E; Padilla-Nash, Hesed M; Buishand, Floryne O; Hue, Yue; Ried, Thomas

2017-03-01

Human colorectal carcinomas are defined by a nonrandom distribution of genomic imbalances that are characteristic for this disease. Often, these imbalances affect entire chromosomes. Understanding the role of these aneuploidies for carcinogenesis is of utmost importance. Currently, established transgenic mice do not recapitulate the pathognomonic genome aberration profile of human colorectal carcinomas. We have developed a novel model based on the spontaneous transformation of murine colon epithelial cells. During this process, cells progress through stages of pre-immortalization, immortalization and, finally, transformation, and result in tumors when injected into immunocompromised mice. We analyzed our model for genome and transcriptome alterations using ArrayCGH, spectral karyotyping (SKY), and array based gene expression profiling. ArrayCGH revealed a recurrent pattern of genomic imbalances. These results were confirmed by SKY. Comparing these imbalances with orthologous maps of human chromosomes revealed a remarkable overlap. We observed focal deletions of the tumor suppressor genes Trp53 and Cdkn2a/p16. High-level focal genomic amplification included the locus harboring the oncogene Mdm2, which was confirmed by FISH in the form of double minute chromosomes. Array-based global gene expression revealed distinct differences between the sequential steps of spontaneous transformation. Gene expression changes showed significant similarities with human colorectal carcinomas. Pathways most prominently affected included genes involved in chromosomal instability and in epithelial to mesenchymal transition. Our novel mouse model therefore recapitulates the most prominent genome and transcriptome alterations in human colorectal cancer, and might serve as a valuable tool for understanding the dynamic process of tumorigenesis, and for preclinical drug testing. © 2016 Wiley Periodicals, Inc.
Exploring Other Genomes: Bacteria.

ERIC Educational Resources Information Center

Flannery, Maura C.

2001-01-01

Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeruginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)
Genome size of Alexandrium catenella and Gracilariopsis lemaneiformis estimated by flow cytometry

NASA Astrophysics Data System (ADS)

Du, Qingwei; Sui, Zhenghong; Chang, Lianpeng; Wei, Huihui; Liu, Yuan; Mi, Ping; Shang, Erlei; Zeeshan, Niaz; Que, Zhou

2016-08-01

The flow cytometry (FCM) technique has been widely applied to estimating the genome size of various higher plants. However, there are few reports about its application in algae. In this study, an optimized FCM procedure was used to estimate the genome size of two eukaryotic algae. For analyzing Alexandrium catenella, an important red tide species, the whole cell instead of the isolated nucleus was studied, and chicken erythrocytes were used as an internal reference. The genome size of A. catenella was estimated to be 56.48 ± 4.14 Gb (1C), approximately nineteen times larger than that of the human genome. For analyzing Gracilariopsis lemaneiformis, an economically important red alga, the purified nucleus was employed, and Arabidopsis thaliana and Chondrus crispus were used as internal references, respectively. The genome size of Gp. lemaneiformis was 97.35 ± 2.58 Mb (1C) and 112.73 ± 14.00 Mb (1C), respectively, depending on the internal reference used. The results of this research will promote related studies on the genomics and evolution of these two species.
The ecoresponsive genome of Daphnia pulex

DOE Office of Scientific and Technical Information (OSTI.GOV)

Colbourne, John K.; Pfrender, Michael E.; Gilbert, Donald

2011-02-04

This document provides supporting material related to the sequencing of the ecoresponsive genome of Daphnia pulex. This material includes information on materials and methods and supporting text, as well as supplemental figures, tables, and references. The coverage of materials and methods addresses genome sequence, assembly, and mapping to chromosomes, gene inventory, attributes of a compact genome, the origin and preservation of Daphnia pulex genes, implications of Daphnia's genome structure, evolutionary diversification of duplicated genes, functional significance of expanded gene families, and ecoresponsive genes. Supporting text covers chromosome studies, gene homology among Daphnia genomes, micro-RNA and transposable elements and the 46 Daphnia pulex opsins. 36 figures, 50 tables, 183 references.
Immunologic applications of conditional gene modification technology in the mouse.

PubMed

Sharma, Suveena; Zhu, Jinfang

2014-04-02

Since the success of homologous recombination in altering mouse genome and the discovery of Cre-loxP system, the combination of these two breakthroughs has created important applications for studying the immune system in the mouse. Here, we briefly summarize the general principles of this technology and its applications in studying immune cell development and responses; such implications include conditional gene knockout and inducible and/or tissue-specific gene over-expression, as well as lineage fate mapping. We then discuss the pros and cons of a few commonly used Cre-expressing mouse lines for studying lymphocyte development and functions. We also raise several general issues, such as efficiency of gene deletion, leaky activity of Cre, and Cre toxicity, all of which may have profound impacts on data interpretation. Finally, we selectively list some useful links to the Web sites as valuable mouse resources. Copyright © 2014 John Wiley & Sons, Inc.
Extraordinary Sequence Divergence at Tsga8, an X-linked Gene Involved in Mouse Spermiogenesis

PubMed Central

Good, Jeffrey M.; Vanderpool, Dan; Smith, Kimberly L.; Nachman, Michael W.

2011-01-01

The X chromosome plays an important role in both adaptive evolution and speciation. We used a molecular evolutionary screen of X-linked genes potentially involved in reproductive isolation in mice to identify putative targets of recurrent positive selection. We then sequenced five very rapidly evolving genes within and between several closely related species of mice in the genus Mus. All five genes were involved in male reproduction and four of the genes showed evidence of recurrent positive selection. The most remarkable evolutionary patterns were found at Testis-specific gene a8 (Tsga8), a spermatogenesis-specific gene expressed during postmeiotic chromatin condensation and nuclear transformation. Tsga8 was characterized by extremely high levels of insertion–deletion variation of an alanine-rich repetitive motif in natural populations of Mus domesticus and M. musculus, differing in length from the reference mouse genome by up to 89 amino acids (27% of the total protein length). This population-level variation was coupled with striking divergence in protein sequence and length between closely related mouse species. Although no clear orthologs had previously been described for Tsga8 in other mammalian species, we have identified a highly divergent hypothetical gene on the rat X chromosome that shares clear orthology with the 5′ and 3′ ends of Tsga8. Further inspection of this ortholog verified that it is expressed in rat testis and shares remarkable similarity with mouse Tsga8 across several general features of the protein sequence despite no conservation of nucleotide sequence across over 60% of the rat-coding domain. Overall, Tsga8 appears to be one of the most rapidly evolving genes to have been described in rodents. We discuss the potential evolutionary causes and functional implications of this extraordinary divergence and the possible contribution of Tsga8 and the other four genes we examined to reproductive isolation in mice. PMID:21186189
Regulatory RNA Key Player in p53-Mediated Apoptosis in Embryonic Stem Cells | Center for Cancer Research

Cancer.gov

Embryonic stem cells (ESCs) must maintain the integrity of their genomes or risk passing potentially deleterious mutations on to numerous tissues. Thus, ESCs have a unique genome surveillance system and easily undergo apoptosis or differentiation when DNA damage is detected. The protein p53 is known to promote differentiation in mouse ESCs (mESCs), but its role in DNA
Concentration- and time-dependent genomic changes in the mouse urinary bladder following exposure to arsenate in drinking water for up to twelve weeks

EPA Science Inventory

Inorganic arsenic (Asi) is a known human bladder carcinogen. The objective of this study was to examine the concentration dependence of the genomic response to Asi in the urinary bladders of mice. C57BL/6J mice were exposed for 1 or 12 weeks to arsenate in drinking water at concen...
ChIP-seq Accurately Predicts Tissue-Specific Activity of Enhancers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Visel, Axel; Blow, Matthew J.; Li, Zirong

2009-02-01

A major yet unresolved quest in decoding the human genome is the identification of the regulatory sequences that control the spatial and temporal expression of genes. Distant-acting transcriptional enhancers are particularly challenging to uncover since they are scattered amongst the vast non-coding portion of the genome. Evolutionary sequence constraint can facilitate the discovery of enhancers, but fails to predict when and where they are active in vivo. Here, we performed chromatin immunoprecipitation with the enhancer-associated protein p300, followed by massively-parallel sequencing, to map several thousand in vivo binding sites of p300 in mouse embryonic forebrain, midbrain, and limb tissue. We tested 86 of these sequences in a transgenic mouse assay, which in nearly all cases revealed reproducible enhancer activity in those tissues predicted by p300 binding. Our results indicate that in vivo mapping of p300 binding is a highly accurate means for identifying enhancers and their associated activities and suggest that such datasets will be useful to study the role of tissue-specific enhancers in human biology and disease on a genome-wide scale.
Genome editing reveals a role for OCT4 in human embryogenesis.

PubMed

Fogarty, Norah M E; McCarthy, Afshan; Snijders, Kirsten E; Powell, Benjamin E; Kubikova, Nada; Blakeley, Paul; Lea, Rebecca; Elder, Kay; Wamaitha, Sissy E; Kim, Daesik; Maciulyte, Valdone; Kleinjung, Jens; Kim, Jin-Soo; Wells, Dagan; Vallier, Ludovic; Bertero, Alessandro; Turner, James M A; Niakan, Kathy K

2017-10-05

Despite their fundamental biological and clinical importance, the molecular mechanisms that regulate the first cell fate decisions in the human embryo are not well understood. Here we use CRISPR-Cas9-mediated genome editing to investigate the function of the pluripotency transcription factor OCT4 during human embryogenesis. We identified an efficient OCT4-targeting guide RNA using an inducible human embryonic stem cell-based system and microinjection of mouse zygotes. Using these refined methods, we efficiently and specifically targeted the gene encoding OCT4 (POU5F1) in diploid human zygotes and found that blastocyst development was compromised. Transcriptomics analysis revealed that, in POU5F1-null cells, gene expression was downregulated not only for extra-embryonic trophectoderm genes, such as CDX2, but also for regulators of the pluripotent epiblast, including NANOG. By contrast, Pou5f1-null mouse embryos maintained the expression of orthologous genes, and blastocyst development was established, but maintenance was compromised. We conclude that CRISPR-Cas9-mediated genome editing is a powerful method for investigating gene function in the context of human development.
Whole Genome Sequence of Two Wild-Derived Mus musculus domesticus Inbred Strains, LEWES/EiJ and ZALENDE/EiJ, with Different Diploid Numbers

PubMed Central

Morgan, Andrew P.; Didion, John P.; Doran, Anthony G.; Holt, James M.; McMillan, Leonard; Keane, Thomas M.; de Villena, Fernando Pardo-Manuel

2016-01-01

Wild-derived mouse inbred strains are becoming increasingly popular for complex traits analysis, evolutionary studies, and systems genetics. Here, we report the whole-genome sequencing of two wild-derived mouse inbred strains, LEWES/EiJ and ZALENDE/EiJ, of Mus musculus domesticus origin. These two inbred strains were selected based on their geographic origin, karyotype, and use in ongoing research. We generated 14× and 18× coverage sequence, respectively, and discovered over 1.1 million novel variants, most of which are private to one of these strains. This report expands the number of wild-derived inbred genomes in the Mus genus from six to eight. The sequence variation can be accessed via an online query tool; variant calls (VCF format) and alignments (BAM format) are available for download from a dedicated ftp site. Finally, the sequencing data have also been stored in a lossless, compressed, and indexed format using the multi-string Burrows-Wheeler transform. All data can be used without restriction. PMID:27765810
Computational analyses of mammalian lactate dehydrogenases: human, mouse, opossum and platypus LDHs.

PubMed

Holmes, Roger S; Goldberg, Erwin

2009-10-01

Computational methods were used to predict the amino acid sequences and gene locations for mammalian lactate dehydrogenase (LDH) genes and proteins using genome sequence databanks. Human LDHA, LDHC and LDH6A genes were located in tandem on chromosome 11, while LDH6B and LDH6C genes were on chromosomes 15 and 12, respectively. Opossum LDHC and LDH6B genes were located in tandem with the opossum LDHA gene on chromosome 5 and contained 7 (LDHA and LDHC) or 8 (LDH6B) exons. An amino acid sequence prediction for the opossum LDH6B subunit gave an extended N-terminal sequence, similar to the human and mouse LDH6B sequences, which may support the export of this enzyme into mitochondria. The platypus genome contained at least 3 LDH genes encoding LDHA, LDHB and LDH6B subunits. Phylogenetic studies and sequence analyses indicated that LDHA, LDHB and LDH6B genes are present in all mammalian genomes examined, including a monotreme species (platypus), whereas the LDHC gene may have arisen more recently in marsupial mammals.
Computational analyses of mammalian lactate dehydrogenases: human, mouse, opossum and platypus LDHs

PubMed Central

Holmes, Roger S; Goldberg, Erwin

2009-01-01

Computational methods were used to predict the amino acid sequences and gene locations for mammalian lactate dehydrogenase (LDH) genes and proteins using genome sequence databanks. Human LDHA, LDHC and LDH6A genes were located in tandem on chromosome 11, while LDH6B and LDH6C genes were on chromosomes 15 and 12, respectively. Opossum LDHC and LDH6B genes were located in tandem with the opossum LDHA gene on chromosome 5 and contained 7 (LDHA and LDHC) or 8 (LDH6B) exons. An amino acid sequence prediction for the opossum LDH6B subunit gave an extended N-terminal sequence, similar to the human and mouse LDH6B sequences, which may support the export of this enzyme into mitochondria. The platypus genome contained at least 3 LDH genes encoding LDHA, LDHB and LDH6B subunits. Phylogenetic studies and sequence analyses indicated that LDHA, LDHB and LDH6B genes are present in all mammalian genomes examined, including a monotreme species (platypus), whereas the LDHC gene may have arisen more recently in marsupial mammals. PMID:19679512
Mapping cis-Regulatory Domains in the Human Genome Using Multi-Species Conservation of Synteny

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahituv, Nadav; Prabhakar, Shyam; Poulin, Francis

2005-06-13

Our inability to associate distant regulatory elements with the genes that they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBS), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes that they regulate. A total of 2,116 and 1,942 CBS >200 kb were assembled for HMC and HMF respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBS we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide a genome-wide data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.
The genomic ancestry, landscape genetics and invasion history of introduced mice in New Zealand

PubMed Central

Russell, James C.; King, Carolyn M.

2018-01-01

The house mouse (Mus musculus) provides a fascinating system for studying both the genomic basis of reproductive isolation, and the patterns of human-mediated dispersal. New Zealand has a complex history of mouse invasions, and the living descendants of these invaders have genetic ancestry from all three subspecies, although most are primarily descended from M. m. domesticus. We used the GigaMUGA genotyping array (approximately 135 000 loci) to describe the genomic ancestry of 161 mice, sampled from 34 locations from across New Zealand (and one Australian city—Sydney). Of these, two populations, one in the south of the South Island, and one on Chatham Island, showed complete mitochondrial lineage capture, featuring two different lineages of M. m. castaneus mitochondrial DNA but with only M. m. domesticus nuclear ancestry detectable. Mice in the northern and southern parts of the North Island had small traces (approx. 2–3%) of M. m. castaneus nuclear ancestry, and mice in the upper South Island had approximately 7–8% M. m. musculus nuclear ancestry including some Y-chromosomal ancestry—though no detectable M. m. musculus mitochondrial ancestry. This is the most thorough genomic study of introduced populations of house mice yet conducted, and will have relevance to studies of the isolation mechanisms separating subspecies of mice. PMID:29410804

Colocalization of somatic and meiotic double strand breaks near the Myc oncogene on mouse chromosome 15.

PubMed

Ng, Siemon H; Maas, Sarah A; Petkov, Petko M; Mills, Kevin D; Paigen, Kenneth

2009-10-01

Both somatic and meiotic recombinations involve the repair of DNA double strand breaks (DSBs) that occur at preferred locations in the genome. Improper repair of DSBs during either mitosis or meiosis can lead to mutations, chromosomal aberrations such as translocations, cancer, and/or cell death. Currently, no model exists that explains the locations of either spontaneous somatic DSBs or programmed meiotic DSBs or relates them to each other. One common class of tumorigenic translocations arising from DSBs is chromosomal rearrangements near the Myc oncogene. Myc translocations have been associated with Burkitt lymphoma in humans, plasmacytoma in mice, and immunocytoma in rats. Comparing the locations of somatic and meiotic DSBs near the mouse Myc oncogene, we demonstrated that the placement of these DSBs is not random and that both events clustered in the same short discrete region of the genome. Our work shows that both somatic and meiotic DSBs tend to occur in proximity to each other within the Myc region, suggesting that they share common originating features. It is likely that some regions of the genome are more susceptible to both somatic and meiotic DSBs, and the locations of meiotic hotspots may be an indicator of genomic regions more susceptible to DNA damage. (c) 2009 Wiley-Liss, Inc.
Engineered chromosome-based genetic mapping establishes a 3.7 Mb critical genomic region for Down syndrome-associated heart defects in mice.

PubMed

Liu, Chunhong; Morishima, Masae; Jiang, Xiaoling; Yu, Tao; Meng, Kai; Ray, Debjit; Pao, Annie; Ye, Ping; Parmacek, Michael S; Yu, Y Eugene

2014-06-01

Trisomy 21 (Down syndrome, DS) is the most common human genetic anomaly associated with heart defects. Based on evolutionary conservation, DS-associated heart defects have been modeled in mice. By generating and analyzing mouse mutants carrying different genomic rearrangements in human chromosome 21 (Hsa21) syntenic regions, we found that triplication of the Tiam1-Kcnj6 region on mouse chromosome 16 (Mmu16) resulted in DS-related cardiovascular abnormalities. In this study, we developed two tandem duplications spanning the Tiam1-Kcnj6 genomic region on Mmu16 using recombinase-mediated genome engineering, Dp(16)3Yey and Dp(16)4Yey, spanning the 2.1 Mb Tiam1-Il10rb and 3.7 Mb Ifnar1-Kcnj6 regions, respectively. We found that Dp(16)4Yey/+, but not Dp(16)3Yey/+, led to heart defects, suggesting that triplication of the Ifnar1-Kcnj6 region is sufficient to cause DS-associated heart defects. Our transcriptional analysis of Dp(16)4Yey/+ embryos showed that the Hsa21 gene orthologs located within the duplicated interval were expressed at elevated levels, reflecting the consequences of the gene dosage alterations. Therefore, we have identified a 3.7 Mb genomic region, the smallest critical genomic region, for DS-associated heart defects, and our results should set the stage for the final step to establish the identities of the causal gene(s), whose elevated expression(s) directly underlie this major DS phenotype.
The Release 6 reference sequence of the Drosophila melanogaster genome

DOE PAGES

Hoskins, Roger A.; Carlson, Joseph W.; Wan, Kenneth H.; ...

2015-01-14

Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. In conclusion, further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads.
Assessing the Mechanisms of MDS and Its Transformation to Leukemia in a Novel Humanized Mouse

DTIC Science & Technology

2016-05-01

achievements N/A References N/A References: 1. Rongvaux, A., et al., Development and function of human innate immune cells in a...in cancer survivors. MDS is inherently difficult to study. MDS stem cells cannot be grown in culture and in vivo models are thus the gold standard...However, MDS stem cells are diseased and fail to efficiently engraft in current immunodeficient mouse models. We have optimized engraftment of
[Parental genome imprinting].

PubMed

Babinet, C

1993-01-01

In recent years, genetic and experimental embryology methods have uncovered a very important feature of mammalian embryonic development: the female and male genomic complements are differentially imprinted, such that contributions from both a maternally and a paternally derived genome are absolutely necessary for the embryo to complete normal development. Differential genomic imprinting therefore seems to add a new and essential kind of information to that already contained in the genomic sequences. The differential imprint should be imposed on the genetic material during gametogenesis and persist throughout somatic development after fertilization. It should then be erased in the germ cell line and established again in the sperm and egg genomes. The recent discovery of several imprinted mouse genes should make it possible to address the molecular mechanisms of imprinting.
The Reference Genome Sequence of Saccharomyces cerevisiae: Then and Now

PubMed Central

Engel, Stacia R.; Dietrich, Fred S.; Fisk, Dianna G.; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C.; Dwight, Selina S.; Hitz, Benjamin C.; Karra, Kalpana; Nash, Robert S.; Weng, Shuai; Wong, Edith D.; Lloyd, Paul; Skrzypek, Marek S.; Miyasato, Stuart R.; Simison, Matt; Cherry, J. Michael

2014-01-01

The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called “S288C 2010,” was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639
Fast lossless compression via cascading Bloom filters

PubMed Central

2014-01-01

Background: Data from large Next Generation Sequencing (NGS) experiments present challenges both in terms of costs associated with storage and in time required for file transfer. It is sometimes possible to store only a summary relevant to particular applications, but generally it is desirable to keep all information needed to revisit experimental results in the future. Thus, the need for efficient lossless compression methods for NGS reads arises. It has been shown that NGS-specific compression schemes can improve results over generic compression methods, such as the Lempel-Ziv algorithm, Burrows-Wheeler transform, or Arithmetic Coding. When a reference genome is available, effective compression can be achieved by first aligning the reads to the reference genome, and then encoding each read using the alignment position combined with the differences in the read relative to the reference. These reference-based methods have been shown to compress better than reference-free schemes, but the alignment step they require demands several hours of CPU time on a typical dataset, whereas reference-free methods can usually compress in minutes. Results: We present a new approach that achieves highly efficient compression by using a reference genome, but completely circumvents the need for alignment, affording a great reduction in the time needed to compress. In contrast to reference-based methods that first align reads to the genome, we hash all reads into Bloom filters to encode, and decode by querying the same Bloom filters using read-length subsequences of the reference genome. Further compression is achieved by using a cascade of such filters. Conclusions: Our method, called BARCODE, runs an order of magnitude faster than reference-based methods, while compressing an order of magnitude better than reference-free methods, over a broad range of sequencing coverage. In high coverage (50-100 fold), compared to the best tested compressors, BARCODE saves 80-90% of the running time while only increasing space slightly. PMID:25252952
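The reference-based encoding described in the abstract above (alignment position plus the differences relative to the reference) can be illustrated with a minimal sketch. It assumes the alignment position is already known and handles substitutions only; the function names `encode_read` and `decode_read` are hypothetical and are not taken from any published tool.

```python
def encode_read(read, reference, pos):
    """Encode a read as (position, length, substitutions).

    Assumes the read aligns at `pos` without insertions or deletions;
    real reference-based compressors also have to handle indels.
    """
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    # Keep only the offsets where the read disagrees with the reference.
    diffs = [(i, b) for i, (a, b) in enumerate(zip(window, read)) if a != b]
    return pos, len(read), diffs


def decode_read(reference, record):
    """Rebuild the original read from the reference and an encoded record."""
    pos, length, diffs = record
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


reference = "ACGTACGTAC"
record = encode_read("GTACCT", reference, 2)   # one mismatch at offset 4
restored = decode_read(reference, record)
```

A read that matches the reference exactly is stored as a position and a length with an empty difference list, which is why this style of encoding pays off when reads are close to the reference.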
Fast lossless compression via cascading Bloom filters.

PubMed

Rozov, Roye; Shamir, Ron; Halperin, Eran

2014-01-01

Data from large Next Generation Sequencing (NGS) experiments present challenges both in terms of costs associated with storage and in time required for file transfer. It is sometimes possible to store only a summary relevant to particular applications, but generally it is desirable to keep all information needed to revisit experimental results in the future. Thus, the need for efficient lossless compression methods for NGS reads arises. It has been shown that NGS-specific compression schemes can improve results over generic compression methods, such as the Lempel-Ziv algorithm, Burrows-Wheeler transform, or Arithmetic Coding. When a reference genome is available, effective compression can be achieved by first aligning the reads to the reference genome, and then encoding each read using the alignment position combined with the differences in the read relative to the reference. These reference-based methods have been shown to compress better than reference-free schemes, but the alignment step they require demands several hours of CPU time on a typical dataset, whereas reference-free methods can usually compress in minutes. We present a new approach that achieves highly efficient compression by using a reference genome, but completely circumvents the need for alignment, affording a great reduction in the time needed to compress. In contrast to reference-based methods that first align reads to the genome, we hash all reads into Bloom filters to encode, and decode by querying the same Bloom filters using read-length subsequences of the reference genome. Further compression is achieved by using a cascade of such filters. Our method, called BARCODE, runs an order of magnitude faster than reference-based methods, while compressing an order of magnitude better than reference-free methods, over a broad range of sequencing coverage. In high coverage (50-100 fold), compared to the best tested compressors, BARCODE saves 80-90% of the running time while only increasing space slightly.
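The encode-by-hashing, decode-by-querying idea in the abstract above can be sketched with a single Bloom filter. This is an illustration under stated assumptions, not the BARCODE implementation: the `BloomFilter` class, the SHA-256-based hashing, and the default sizes are all hypothetical choices, and the sketch omits the cascade of filters as well as any handling of false positives, read multiplicity, or reads that do not occur in the reference.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative sizes, not tuned)."""

    def __init__(self, n_bits, n_hashes):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting one hash function.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def encode(reads, n_bits=1 << 16, n_hashes=4):
    """Encoding step: hash every read into the filter."""
    bf = BloomFilter(n_bits, n_hashes)
    for read in reads:
        bf.add(read)
    return bf


def decode(bf, reference, read_len):
    """Decoding step: query every read-length window of the reference."""
    windows = (reference[i:i + read_len]
               for i in range(len(reference) - read_len + 1))
    return {w for w in windows if w in bf}


reference = "ACGTTGCAAGCTTAGC"
reads = ["GTTGCA", "AGCTTA"]
bf = encode(reads)
decoded = decode(bf, reference, read_len=6)
```

A Bloom filter never yields false negatives, so every stored read that occurs in the reference is recovered. The decoded set can also contain windows that were never stored (false positives), which is one reason a single filter is not sufficient for lossless compression on its own.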
Genome-wide mapping in a house mouse hybrid zone reveals hybrid sterility loci and Dobzhansky-Muller interactions.

PubMed

Turner, Leslie M; Harr, Bettina

2014-12-09

Mapping hybrid defects in contact zones between incipient species can identify genomic regions contributing to reproductive isolation and reveal genetic mechanisms of speciation. The house mouse features a rare combination of sophisticated genetic tools and natural hybrid zones between subspecies. Male hybrids often show reduced fertility, a common reproductive barrier between incipient species. Laboratory crosses have identified sterility loci, but each encompasses hundreds of genes. We map genetic determinants of testis weight and testis gene expression using offspring of mice captured in a hybrid zone between M. musculus musculus and M. m. domesticus. Many generations of admixture enable high-resolution mapping of loci contributing to these sterility-related phenotypes. We identify complex interactions among sterility loci, suggesting multiple, non-independent genetic incompatibilities contribute to barriers to gene flow in the hybrid zone.
Genomic analysis of sleep deprivation reveals translational regulation in the hippocampus.

PubMed

Vecsey, Christopher G; Peixoto, Lucia; Choi, Jennifer H K; Wimmer, Mathieu; Jaganath, Devan; Hernandez, Pepe J; Blackwell, Jennifer; Meda, Karuna; Park, Alan J; Hannenhalli, Sridhar; Abel, Ted

2012-10-17

Sleep deprivation is a common problem of considerable health and economic impact in today's society. Sleep loss is associated with deleterious effects on cognitive functions such as memory and has a high comorbidity with many neurodegenerative and neuropsychiatric disorders. Therefore, it is crucial to understand the molecular basis of the effect of sleep deprivation in the brain. In this study, we combined genome-wide and traditional molecular biological approaches to determine the cellular and molecular impacts of sleep deprivation in the mouse hippocampus, a brain area crucial for many forms of memory. Microarray analysis examining the effects of 5 h of sleep deprivation on gene expression in the mouse hippocampus found 533 genes with altered expression. Bioinformatic analysis revealed that a prominent effect of sleep deprivation was to downregulate translation, potentially mediated through components of the insulin signaling pathway such as the mammalian target of rapamycin (mTOR), a key regulator of protein synthesis. Consistent with this analysis, sleep deprivation reduced levels of total and phosphorylated mTOR, and levels returned to baseline after 2.5 h of recovery sleep. Our findings represent the first genome-wide analysis of the effects of sleep deprivation on the mouse hippocampus, and they suggest that the detrimental effects of sleep deprivation may be mediated by reductions in protein synthesis via downregulation of mTOR. Because protein synthesis and mTOR activation are required for long-term memory formation, our study improves our understanding of the molecular mechanisms underlying the memory impairments induced by sleep deprivation.
Cloning of ES cells and mice by nuclear transfer.

PubMed

Wakayama, Sayaka; Kishigami, Satoshi; Wakayama, Teruhiko

2009-01-01

We have been able to develop a stable nuclear transfer (NT) method in the mouse, in which donor nuclei are directly injected into the oocyte using a piezo-actuated micromanipulator. Although the piezo unit is a complex tool, once mastered it is of great help not only in NT experiments, but also in almost all other forms of micromanipulation. Using this technique, embryonic stem (ntES) cell lines established from somatic cell nuclei can be generated relatively easily from a variety of mouse genotypes and cell types. Such ntES cells can be used not only for experimental models of human therapeutic cloning but also as a means of preserving mouse genomes instead of preserving germ cells. Here, we describe our most recent protocols for mouse cloning.
Complex Genetics of Behavior: BXDs in the Automated Home-Cage.

PubMed

Loos, Maarten; Verhage, Matthijs; Spijker, Sabine; Smit, August B

2017-01-01

This chapter describes a use case for the genetic dissection and automated analysis of complex behavioral traits using the genetically diverse panel of BXD mouse recombinant inbred strains. Strains of the BXD resource differ widely in terms of gene and protein expression in the brain, as well as in their behavioral repertoire. A large mouse resource opens the possibility of gene-finding studies for distinct behavioral phenotypes; however, such a resource poses a challenge for behavioral phenotyping. To address the specifics of large-scale screening, we describe (1) how to assess mouse behavior systematically across a large genetic cohort, (2) how to dissect automation-derived longitudinal mouse behavior into quantitative parameters, and (3) how to map these quantitative traits to the genome, deriving loci underlying aspects of behavior.
Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

PubMed

Kisand, Veljo; Lettieri, Teresa

2013-04-01

De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (<450 bps), which are presumed to aid in the analysis of uncharacterized genomes. The array of tools for processing NGS data is increasingly free and open source, and these tools are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not a high priority and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize unknown bacteria with modest effort.
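The "~30 fold" figure in the abstract above refers to sequencing depth, which is conventionally estimated as total sequenced bases divided by genome size. The sketch below shows that arithmetic; the read count, read length, and genome size are illustrative values chosen to give a round number and are not taken from the study.

```python
def mean_coverage(n_reads, mean_read_len, genome_size):
    """Expected fold coverage: total sequenced bases / genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * mean_read_len / genome_size


# Hypothetical run: 225,000 reads of 400 bp over a 3 Mb genome.
depth = mean_coverage(n_reads=225_000, mean_read_len=400, genome_size=3_000_000)
```

This is an average only. As the abstract notes, individual regions can still have low or missing coverage even when the mean depth is high.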
The Importance of Biological Databases in Biological Discovery.

PubMed

Baxevanis, Andreas D; Bateman, Alex

2015-06-19

Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.
Uracil-DNA Glycosylase in Base Excision Repair and Adaptive Immunity

PubMed Central

Doseth, Berit; Visnes, Torkild; Wallenius, Anders; Ericsson, Ida; Sarno, Antonio; Pettersen, Henrik Sahlin; Flatberg, Arnar; Catterall, Tara; Slupphaug, Geir; Krokan, Hans E.; Kavli, Bodil

2011-01-01

Genomic uracil is a DNA lesion but also an essential key intermediate in adaptive immunity. In B cells, activation-induced cytidine deaminase deaminates cytosine to uracil (U:G mispairs) in Ig genes to initiate antibody maturation. Uracil-DNA glycosylases (UDGs) such as uracil N-glycosylase (UNG), single strand-selective monofunctional uracil-DNA glycosylase 1 (SMUG1), and thymine-DNA glycosylase remove uracil from DNA. Gene-targeted mouse models are extensively used to investigate the role of these enzymes in DNA repair and Ig diversification. However, possible species differences in uracil processing in humans and mice are not yet established. To address this, we analyzed UDG activities and quantities in human and mouse cell lines and in splenic B cells from Ung+/+ and Ung−/− backcrossed mice. Interestingly, human cells displayed ∼15-fold higher total uracil excision capacity due to higher levels of UNG. In contrast, SMUG1 activity was ∼8-fold higher in mouse cells, constituting ∼50% of the total U:G excision activity compared with less than 1% in human cells. In activated B cells, both UNG and SMUG1 activities were at levels comparable with those measured for mouse cell lines. Moreover, SMUG1 activity per cell was not down-regulated after activation. We therefore suggest that SMUG1 may work as a weak backup activity for UNG2 during class switch recombination in Ung−/− mice. Our results reveal significant species differences in genomic uracil processing. These findings should be taken into account when mouse models are used in studies of uracil DNA repair and adaptive immunity. PMID:21454529
Genomic Organization of the Murine Miller–Dieker/Lissencephaly Region: Conservation of Linkage with the Human Region

PubMed Central

Hirotsune, Shinji; Pack, Svetlana D.; Chong, Samuel S.; Robbins, Christiane M.; Pavan, William J.; Ledbetter, David H.; Wynshaw-Boris, Anthony

1997-01-01

Several human syndromes are associated with haploinsufficiency of chromosomal regions secondary to microdeletions. Isolated lissencephaly sequence (ILS), a human developmental disease characterized by a smooth cerebral surface (classical lissencephaly) and microscopic evidence of incomplete neuronal migration, is often associated with small deletions or translocations at chromosome 17p13.3. Miller–Dieker syndrome (MDS) is associated with larger deletions of 17p13.3 and consists of classical lissencephaly with additional phenotypes including facial abnormalities. We have isolated the murine homologs of three genes located inside and outside the MDS region: Lis1, Mnt/Rox, and 14-3-3ε. These genes are all located on mouse chromosome 11B2, as determined by metaphase FISH, and the relative order and approximate gene distance was determined by interphase FISH analysis. The transcriptional orientation and intergenic distance of Lis1 and Mnt/Rox were ascertained by fragmentation analysis of a mouse yeast artificial chromosome containing both genes. To determine the distance and orientation of 14-3-3ε with respect to Lis1 and Mnt/Rox, we introduced a super-rare cutter site (VDE) that is unique in the mouse genome into 14-3-3ε by gene targeting. Using the introduced VDE site, the orientation of this gene was determined by pulsed field gel electrophoresis and Southern blot analysis. Our results demonstrate that the MDS region is conserved between human and mouse. This conservation of linkage suggests that the mouse can be used to model microdeletions that occur in ILS and MDS. PMID:9199935
TALEN mediated targeted editing of GM2/GD2-synthase gene modulates anchorage independent growth by reducing anoikis resistance in mouse tumor cells

PubMed Central

Mahata, Barun; Banerjee, Avisek; Kundu, Manjari; Bandyopadhyay, Uday; Biswas, Kaushik

2015-01-01

Complex ganglioside expression is highly deregulated in several tumors and depends on specific ganglioside synthase genes. Here, we designed and constructed a pair of highly specific transcription activator-like effector nucleases (TALENs) to disrupt a particular genomic locus of mouse GM2-synthase, a region conserved in the coding sequence of all four transcript variants of mouse GM2-synthase. The designed TALENs work effectively in different mouse cell lines, with a TALEN-induced mutation rate of over 45%. A clonal selection strategy was used to generate a stable GM2-synthase knockout cell line. We have also demonstrated non-homologous end joining (NHEJ) mediated integration of a neomycin cassette into the TALEN-targeted GM2-synthase locus. Functionally, clonally selected GM2-synthase knockout clones show reduced anchorage-independent growth (AIG), reduced tumor growth, and higher cellular adhesion compared with wild-type Renca-v cells. Insight into the mechanism shows that the reduced AIG is due to a loss of anoikis resistance, as both knockout clones show increased sensitivity to detachment-induced apoptosis. Therefore, TALEN-mediated precise genome editing at the GM2-synthase locus not only helps us understand the function of the GM2-synthase gene and complex gangliosides in tumorigenicity but also shows the potential of TALENs in translational cancer research and therapeutics. PMID:25762467
Genomic organization, expression, and chromosome localization of a third aurora-related kinase gene, Aie1.

PubMed

Hu, H M; Chuang, C K; Lee, M J; Tseng, T C; Tang, T K

2000-11-01

We previously reported two novel testis-specific serine/threonine kinases, Aie1 (mouse) and AIE2 (human), that share high amino acid identities with the kinase domains of fly aurora and yeast Ipl1. Here, we report the entire intron-exon organization of the Aie1 gene and analyze the expression patterns of Aie1 mRNA during testis development. The mouse Aie1 gene spans approximately 14 kb and contains seven exons. The sequences of the exon-intron boundaries of the Aie1 gene conform to the consensus sequences (GT/AG) of the splicing donor and acceptor sites of most eukaryotic genes. Comparative genomic sequencing revealed that the gene structure is highly conserved between mouse Aie1 and human AIE2. However, much less homology was found in the sequence outside the kinase-coding domains. The Aie1 locus was mapped to mouse chromosome 7A2-A3 by fluorescent in situ hybridization. Northern blot analysis indicates that Aie1 mRNA likely is expressed at a low level on day 14 and reaches its plateau on day 21 in the developing postnatal testis. RNA in situ hybridization indicated that the expression of the Aie1 transcript was restricted to meiotically active germ cells, with the highest levels detected in spermatocytes at the late pachytene stage. These findings suggest that Aie1 plays a role in spermatogenesis.
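The GT/AG rule mentioned in the abstract above (introns begin with a GT donor dinucleotide and end with an AG acceptor dinucleotide) is straightforward to check once exon coordinates are known. The sketch below is illustrative only: the helper names, the half-open coordinate convention, and the toy sequence are assumptions, and nothing here is derived from the actual Aie1 sequence.

```python
def introns_from_exons(gene_seq, exons):
    """Return intron sequences lying between consecutive exons.

    `exons` is a sorted list of (start, end) pairs using 0-based,
    half-open coordinates on `gene_seq` (an assumed convention).
    """
    return [gene_seq[end_prev:start_next]
            for (_, end_prev), (start_next, _) in zip(exons, exons[1:])]


def has_canonical_splice_sites(intron):
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Toy gene: exon 1, one intron, exon 2 (hypothetical sequence).
gene = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
exons = [(0, 6), (18, 24)]
introns = introns_from_exons(gene, exons)
flags = [has_canonical_splice_sites(i) for i in introns]
```

A gene with seven exons, as reported for Aie1, would yield six introns under this scheme, each of which could be checked in the same way.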

Re-annotation, improved large-scale assembly and establishment of a catalogue of noncoding loci for the genome of the model brown alga Ectocarpus.

PubMed

Cormier, Alexandre; Avia, Komlan; Sterck, Lieven; Derrien, Thomas; Wucher, Valentin; Andres, Gwendoline; Monsoor, Misharl; Godfroy, Olivier; Lipinska, Agnieszka; Perrineau, Marie-Mathilde; Van De Peer, Yves; Hitte, Christophe; Corre, Erwan; Coelho, Susana M; Cock, J Mark

2017-04-01

The genome of the filamentous brown alga Ectocarpus was the first to be completely sequenced from within the brown algal group and has served as a key reference genome both for this lineage and for the stramenopiles. We present a complete structural and functional reannotation of the Ectocarpus genome. The large-scale assembly of the Ectocarpus genome was significantly improved and genome-wide gene re-annotation using extensive RNA-seq data improved the structure of 11 108 existing protein-coding genes and added 2030 new loci. A genome-wide analysis of splicing isoforms identified an average of 1.6 transcripts per locus. A large number of previously undescribed noncoding genes were identified and annotated, including 717 loci that produce long noncoding RNAs. Conservation of lncRNAs between Ectocarpus and another brown alga, the kelp Saccharina japonica, suggests that at least a proportion of these loci serve a function. Finally, a large collection of single nucleotide polymorphism-based markers was developed for genetic analyses. These resources are available through an updated and improved genome database. This study significantly improves the utility of the Ectocarpus genome as a high-quality reference for the study of many important aspects of brown algal biology and as a reference for genomic analyses across the stramenopiles. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
A Perfect Match Genomic Landscape Provides a Unified Framework for the Precise Detection of Variation in Natural and Synthetic Haploid Genomes.

PubMed

Palacios-Flores, Kim; García-Sotelo, Jair; Castillo, Alejandra; Uribe, Carina; Aguilar, Luis; Morales, Lucía; Gómez-Romero, Laura; Reyes, José; Garciarubio, Alejandro; Boege, Margareta; Dávila, Guillermo

2018-04-01

We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository. Copyright © 2018 by the Genetics Society of America.
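
As a toy illustration of the perfect-match idea described above, and not the authors' PMGL pipeline, the sketch below counts, for every position of a reference, how many read k-mers match the reference k-mer starting there exactly. The function name, the k-mer length and the sequences are assumptions made for this example; positions whose count drops to zero mark a candidate signature of variation.

```python
from collections import Counter


def perfect_match_landscape(reference: str, reads: list[str], k: int) -> list[int]:
    """For each reference position, count read k-mers identical to the reference k-mer."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    # A Counter returns 0 for k-mers that never occur in the reads.
    return [counts[reference[i:i + k]] for i in range(len(reference) - k + 1)]


# Hypothetical haploid reference and a query carrying one substitution at index 9.
reference = "ATGCGTACCTGAAGTCCA"
query = reference[:9] + "A" + reference[10:]
landscape = perfect_match_landscape(reference, [query], k=5)
print(landscape)
```

Every reference k-mer that overlaps the substituted base loses its perfect match, so the landscape shows a run of zeros of length k around the variant. Only the location is revealed at this stage; as the abstract notes, the nature of the variant is resolved afterwards by targeted alignment.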
An Evaluation Framework for Lossy Compression of Genome Sequencing Quality Values.

PubMed

Alberti, Claudio; Daniels, Noah; Hernaez, Mikel; Voges, Jan; Goldfeder, Rachel L; Hernandez-Lopez, Ana A; Mattavelli, Marco; Berger, Bonnie

2016-01-01

This paper provides the specification and an initial validation of an evaluation framework for the comparison of lossy compressors of genome sequencing quality values. The goal is to define reference data, test sets, tools and metrics that shall be used to evaluate the impact of lossy compression of quality values on human genome variant calling. The functionality of the framework is validated referring to two state-of-the-art genomic compressors. This work has been spurred by the current activity within the ISO/IEC SC29/WG11 technical committee (a.k.a. MPEG), which is investigating the possibility of starting a standardization activity for genomic information representation.
Genome sequences of a mouse-avirulent and a mouse-virulent strain of Ross River virus.

PubMed

Faragher, S G; Meek, A D; Rice, C M; Dalgarno, L

1988-04-01

The nucleotide sequence of the genomic RNA of a mouse-avirulent strain of Ross River virus, RRV NB5092 (isolated in 1969), has been determined and the corresponding sequence for the prototype mouse-virulent strain, RRV T48 (isolated in 1959), has been completed. The RRV NB5092 genome is approximately 11,674 nucleotides in length, compared with 11,853 nucleotides for RRV T48. RRV NB5092 and RRV T48 have the same genome organization. For both viruses an untranslated region of 80 nucleotides at the 5' end of the genome is followed by a 7440-nucleotide open reading frame which is interrupted after 5586 nucleotides by a single opal termination codon. By homology with other alphaviruses, the 5586-nucleotide open reading frame encodes the nonstructural proteins nsP1, nsP2, and nsP3; a fourth nonstructural protein, nsP4, is produced by read-through of the opal codon. The RRV nonstructural proteins show strong homology with the corresponding proteins of Sindbis virus and Semliki Forest virus in terms of size, net charge, and hydropathy characteristics. However, homology is not uniform between or within the proteins; nsP1, nsP2, and nsP4 contain extended domains which are highly conserved between alphaviruses, while the C-terminal region of nsP3 shows little conservation in sequence or length between alphaviruses. An untranslated "junction" region of 44 nucleotides (for RRV NB5092) or 47 nucleotides (for RRV T48) separates the nonstructural and structural protein coding regions. The structural proteins (capsid-E3-E2-6K-E1) are translated from an open reading frame of 3762 nucleotides which is followed by a 3'-untranslated region of approximately 348 nucleotides (for RRV NB5092) or 524 nucleotides (for RRV T48). Excluding deletions and insertions, the genomes of RRV NB5092 and RRV T48 differ at 284 nucleotides, representing a sequence divergence of 2.38%. Sequence deletions or insertions were found only in the noncoding regions and include a 173-nucleotide deletion in the 3'-untranslated region of RRV NB5092, compared with RRV T48. In the coding regions, most of the nucleotide differences are silent; there are 36 amino acid differences in the nonstructural proteins and 12 in the structural proteins. The distribution of amino acid differences between the two RRV strains correlates with the location of domains which are poorly conserved in sequence between alphaviruses. The possible role of amino acid differences in envelope glycoproteins E1 and E2 in determining the different antigenic and biological properties of RRV NB5092 and RRV T48 is discussed.
High-resolution analysis of alterations in medullary thyroid carcinoma genomes.

PubMed

Flicker, Karin; Ulz, Peter; Höger, Harald; Zeitlhofer, Petra; Haas, Oskar A; Behmel, Annemarie; Buchinger, Wolfgang; Scheuba, Christian; Niederle, Bruno; Pfragner, Roswitha; Speicher, Michael R

2012-07-15

Hereditary and sporadic medullary thyroid carcinoma (MTC) are closely associated with RET proto-oncogene mutations. However, the role of additional changes in the tumor genomes remains unclear. Our objective was the identification of chromosomal regions involved in MTC tumorigenesis and to assess their significance by using MTC-derived cell lines. We used array-CGH (comparative genomic hybridization) to map chromosomal imbalances in 52 primary tumors and ten metastases. Eleven tumors (11/52, 21%) were hereditary and 41 (41/52, 79%) were sporadic. Among the latter, 15 tumors (15/41, 37%) harbored RET mutations. Furthermore, we characterized five MTC cell lines in detail and evaluated the tumorigenicity by severe combined immunodeficiency (SCID)-mouse experiments. Most MTCs had only few copy number changes, and losses of chromosomes 1p, 4q, 19p and 22q were observed most frequently. The number of chromosomal aberrations increased in metastases. Twenty-three percent (12/52) of the primary tumors did not even show any chromosomal gains and losses. We injected three cell lines (two of these were without chromosomal changes and pathogenic RET mutations) into immune deficient SCID mice, and in each case, we observed rapid tumor growth at the injection sites. Our data suggest that MTCs--in contrast to most other tumor entities--do not acquire a multitude of genomic imbalances. SCID mouse experiments performed with chromosomally normal cell lines and without RET mutations suggest that presently unknown submicroscopic genomic changes are sufficient in MTC tumorigenesis. Copyright © 2011 UICC.
Genetic recombination is directed away from functional genomic elements in mice.

PubMed

Brick, Kevin; Smagulova, Fatima; Khil, Pavel; Camerini-Otero, R Daniel; Petukhova, Galina V

2012-05-13

Genetic recombination occurs during meiosis, the key developmental programme of gametogenesis. Recombination in mammals has been recently linked to the activity of a histone H3 methyltransferase, PR domain containing 9 (PRDM9), the product of the only known speciation-associated gene in mammals. PRDM9 is thought to determine the preferred recombination sites--recombination hotspots--through sequence-specific binding of its highly polymorphic multi-Zn-finger domain. Nevertheless, Prdm9 knockout mice are proficient at initiating recombination. Here we map and analyse the genome-wide distribution of recombination initiation sites in Prdm9 knockout mice and in two mouse strains with different Prdm9 alleles and their F(1) hybrid. We show that PRDM9 determines the positions of practically all hotspots in the mouse genome, with the exception of the pseudo-autosomal region (PAR)--the only area of the genome that undergoes recombination in 100% of cells. Surprisingly, hotspots are still observed in Prdm9 knockout mice, and as in wild type, these hotspots are found at H3 lysine 4 (H3K4) trimethylation marks. However, in the absence of PRDM9, most recombination is initiated at promoters and at other sites of PRDM9-independent H3K4 trimethylation. Such sites are rarely targeted in wild-type mice, indicating an unexpected role of the PRDM9 protein in sequestering the recombination machinery away from gene-promoter regions and other functional genomic elements.
G-Anchor: a novel approach for whole-genome comparative mapping utilizing evolutionary conserved DNA sequences.

PubMed

Lenis, Vasileios Panagiotis E; Swain, Martin; Larkin, Denis M

2018-05-01

Cross-species whole-genome sequence alignment is a critical first step for genome comparative analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intense process, requiring expensive high-performance computing systems due to the need to explore extensive local alignments. With hundreds of sequenced animal genomes available from multiple projects, there is an increasing demand for genome comparative analyses. Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources. G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionarily conserved DNA sequences and that are not highly repetitive, polyploid, or excessively fragmented. G-Anchor is not a substitute for whole-genome aligning software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available and a ready-to-use tool for the pairwise comparison of two genomes.
Z-DNA-induced super-transport of energy within genomes

NASA Astrophysics Data System (ADS)

Kulish, Vladimir V.; Heng, Li; Dröge, Peter

2007-10-01

Spontaneous transitions of genomic DNA segments from right-handed B-DNA into the left-handed, high-energy Z conformation are unstable within topologically relaxed DNA molecules, such as mammalian chromosomes. Here we show, from direct application of the principles of statistical physics with a promoter region in the mouse genome as a representative example, that the life span for this alternate DNA conformation may be much smaller than the characteristic time of thermal fluctuations that cause the B-to-Z transition. Surprisingly, such a short existence of Z-DNA is important because it can be responsible for super-transport of energy within a genome. This type of energy transport can be utilized by a cell to communicate information about the state of particular chromatin domains within chromosomes or as a buffer against genome instability.
Ultrafast Comparison of Personal Genomes via Precomputed Genome Fingerprints

PubMed Central

Glusman, Gustavo; Mauldin, Denise E.; Hood, Leroy E.; Robinson, Max

2017-01-01

We present an ultrafast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into “genome fingerprints” via locality sensitive hashing. The resulting genome fingerprints can be meaningfully compared even when the input data were obtained using different sequencing technologies, processed using different pipelines, represented in different data formats and relative to different reference versions. Furthermore, genome fingerprints are robust to up to 30% missing data. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. For example, we could compute all-against-all pairwise comparisons among the 2504 genomes in the 1000 Genomes data set in 67 s at high quality (21 μs per comparison, on a single processor), and achieved a lower quality approximation in just 11 s. Efficient computation enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative sequenced genomes in a set, population reconstruction, and many others. The original genome representation cannot be reconstructed from its fingerprint, effectively decoupling genome comparison from genome interpretation; the method thus has significant implications for privacy-preserving genome analytics. PMID:29018478
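
The timing quoted above can be checked with a short calculation. The sketch assumes that "all-against-all" means every unordered pair of distinct genomes is compared once; the variable names are illustrative.

```python
n_genomes = 2504       # genomes in the 1000 Genomes data set, as stated in the abstract
total_seconds = 67.0   # reported wall time for the high-quality comparison

# Number of unordered pairs of distinct genomes: n * (n - 1) / 2.
n_pairs = n_genomes * (n_genomes - 1) // 2

# Average time per comparison in microseconds.
microseconds_per_pair = total_seconds / n_pairs * 1e6

print(n_pairs, round(microseconds_per_pair, 1))
```

Under that assumption the average time per comparison agrees with the roughly 21 μs figure given in the abstract.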
Reliable Detection of Herpes Simplex Virus Sequence Variation by High-Throughput Resequencing.

PubMed

Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M

2017-08-16

High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated due to the existence of large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified in the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15% representing variants divergent from the HSV1 reference possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome where de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.
Data and animal management software for large-scale phenotype screening.

PubMed

Ching, Keith A; Cooke, Michael P; Tarantino, Lisa M; Lapp, Hilmar

2006-04-01

The mouse N-ethyl-N-nitrosourea (ENU) mutagenesis program at the Genomics Institute of the Novartis Research Foundation (GNF) uses MouseTRACS to analyze phenotype screens and manage animal husbandry. MouseTRACS is a Web-based laboratory informatics system that electronically records and organizes mouse colony operations, prints cage cards, tracks inventory, manages requests, and reports Institutional Animal Care and Use Committee (IACUC) protocol usage. For efficient phenotype screening, MouseTRACS identifies mutants, visualizes data, and maps mutations. It displays and integrates phenotype and genotype data using likelihood odds ratio (LOD) plots of genetic linkage between genotype and phenotype. More detailed mapping intervals show individual single nucleotide polymorphism (SNP) markers in the context of phenotype. In addition, dynamically generated pedigree diagrams and inventory reports linked to screening results summarize the inheritance pattern and the degree of penetrance. MouseTRACS displays screening data in tables and uses standard charts such as box plots, histograms, scatter plots, and customized charts looking at clustered mice or cross pedigree comparisons. In summary, MouseTRACS enables the efficient screening, analysis, and management of thousands of animals to find mutant mice and identify novel gene functions. MouseTRACS is available under an open source license at http://www.mousetracs.sourceforge.net.
Trivial role for NSMCE2 during in vitro proliferation and differentiation of male germline stem cells.

PubMed

Zheng, Yi; Jongejan, Aldo; Mulder, Callista L; Mastenbroek, Sebastiaan; Repping, Sjoerd; Wang, Yinghua; Li, Jinsong; Hamer, Geert

2017-09-01

Spermatogenesis, starting with spermatogonial differentiation, is characterized by ongoing and dramatic alterations in composition and function of chromatin. Failure to maintain proper chromatin dynamics during spermatogenesis may lead to mutations, chromosomal aberrations or aneuploidies. When transmitted to the offspring, these can cause infertility or congenital malformations. The structural maintenance of chromosomes (SMC) 5/6 protein complex has recently been described to function in chromatin modeling and genomic integrity maintenance during spermatogonial differentiation and meiosis. Among the subunits of the SMC5/6 complex, non-SMC element 2 (NSMCE2) is an important small ubiquitin-related modifier (SUMO) ligase. NSMCE2 has been reported to be essential for mouse development, prevention of cancer and aging in adult mice and topological stress relief in human somatic cells. By using in vitro cultured primary mouse spermatogonial stem cells (SSCs), referred to as male germline stem (GS) cells, we investigated the function of NSMCE2 during spermatogonial proliferation and differentiation. We first optimized a protocol to generate genetically modified GS cell lines using CRISPR-Cas9 and generated an Nsmce2 -/- GS cell line. Using this Nsmce2 -/- GS cell line, we found that NSMCE2 was dispensable for proliferation, differentiation and topological stress relief in mouse GS cells. Moreover, RNA sequencing analysis demonstrated that the transcriptome was only minimally affected by the absence of NSMCE2. Only differential expression of Sgsm1 appeared highly significant, but with SGSM1 protein levels being unaffected without NSMCE2. Hence, despite the essential roles of NSMCE2 in somatic cells, chromatin integrity maintenance seems differentially regulated in the germline. © 2017 Society for Reproduction and Fertility.
Computational neuroanatomy: mapping cell-type densities in the mouse brain, simulations from the Allen Brain Atlas

NASA Astrophysics Data System (ADS)

Grange, Pascal

2015-09-01

The Allen Brain Atlas of the adult mouse (ABA) consists of digitized expression profiles of thousands of genes in the mouse brain, co-registered to a common three-dimensional template (the Allen Reference Atlas). This brain-wide, genome-wide data set has triggered a renaissance in neuroanatomy. Its voxelized version (with cubic voxels of side 200 microns) is available for desktop computation in MATLAB. On the other hand, brain cells exhibit a great phenotypic diversity (in terms of size, shape and electrophysiological activity), which has inspired the names of some well-studied cell types, such as granule cells and medium spiny neurons. However, no exhaustive taxonomy of brain cells is available. A genetic classification of brain cells is being undertaken, and some cell types have been characterized by their transcriptome profiles. However, given a cell type characterized by its transcriptome, it is not clear where else in the brain similar cells can be found. The ABA can be used to solve this region-specificity problem in a data-driven way: rewriting the brain-wide expression profiles of all genes in the atlas as a sum of cell-type-specific transcriptome profiles is equivalent to solving a quadratic optimization problem at each voxel in the brain. However, the estimated brain-wide densities of 64 cell types published recently were based on one series of co-registered coronal in situ hybridization (ISH) images per gene, whereas the online ABA contains several image series per gene, including sagittal ones. In the presented work, we simulate the variability of cell-type densities in a Monte Carlo way by repeatedly drawing a random image series for each gene and solving the optimization problem. This yields error bars on the region-specificity of cell types.
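
One way to write the per-voxel quadratic problem described above is sketched here. The symbols are assumptions introduced for illustration: E(v, g) is the atlas expression of gene g at voxel v, C_t(g) is the transcriptome profile of cell type t, ρ_t is the unknown density of cell type t at that voxel, and the non-negativity constraint is assumed because densities cannot be negative.

```latex
\hat{\rho}(v) \;=\; \operatorname*{arg\,min}_{\rho \,\in\, \mathbb{R}_{\ge 0}^{T}}
\; \sum_{g=1}^{G} \Bigl( E(v,g) \;-\; \sum_{t=1}^{T} \rho_t \, C_t(g) \Bigr)^{2}
```

In the Monte Carlo scheme the abstract describes, E(v, g) would be redrawn for each gene from its available image series and the problem solved again, so that the spread of the resulting densities gives the error bars.
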
Microarray-based comparison of three amplification methods for nanogram amounts of total RNA

PubMed Central

Singh, Ruchira; Maganti, Rajanikanth J.; Jabba, Sairam V.; Wang, Martin; Deng, Glenn; Heath, Joe Don; Kurn, Nurith; Wangemann, Philine

2007-01-01

Gene expression profiling using microarrays requires microgram amounts of RNA, which limits its direct application for the study of nanogram RNA samples obtained using microdissection, laser capture microscopy, or needle biopsy. A novel system based on Ribo-SPIA technology (RS, Ovation-Biotin amplification and labeling system) was recently introduced. The utility of the RS system, an optimized prototype system for picogram RNA samples (pRS), and two T7-based systems involving one or two rounds of amplification (OneRA, Standard Protocol, or TwoRA, Small Sample Protocol, version II) were evaluated in the present study. Mouse kidney (MK) and mouse universal reference (MUR) RNA samples, 0.3 ng to 10 μg, were analyzed using high-density Affymetrix Mouse Genome 430 2.0 GeneChip arrays. Call concordance between replicates, correlations of signal intensity, signal intensity ratios, and minimal fold increase necessary for significance were determined. All systems amplified partially overlapping sets of genes with similar signal intensity correlations. pRS amplified the highest number of genes from 10-ng RNA samples. We detected 24 of 26 genes verified by RT-PCR in samples prepared using pRS. TwoRA yielded somewhat higher call concordances than did RS and pRS (91.8% vs. 89.3% and 88.1%, respectively). Although all target preparation methods were suitable, pRS amplified the highest number of targets and was found to be suitable for amplification of as little as 0.3 ng of total RNA. In addition, RS and pRS were faster and simpler to use than the T7-based methods and resulted in the generation of cDNA, which is more stable than cRNA. PMID:15613496
Osteoarthritis year in review 2017: genetics and epigenetics.

PubMed

Peffers, M J; Balaskas, P; Smagul, A

2018-03-01

The purpose of this review is to describe highlights from original research publications related to osteoarthritis (OA), epigenetics and genomics with the intention of recognising significant advances. To identify relevant papers a Pubmed literature search was conducted for articles published between April 2016 and April 2017 using the search terms 'osteoarthritis' together with 'genetics', 'genomics', 'epigenetics', 'microRNA', 'lncRNA', 'DNA methylation' and 'histone modification'. The search term OA generated almost 4000 references. Publications using the combination of descriptors OA and genetics provided the most references (82 references). However this was reduced compared to the same period in the previous year; 8.1-2.1% (expressed as a percentage of the total publications combining the terms OA and genetics). Publications combining the terms OA with genomics (29 references), epigenetics (16 references), long non-coding RNA (lncRNA) (11 references; including the identification of novel lncRNAs in OA), DNA methylation (21 references), histone modification (3 references) and microRNA (miR) (79 references) were reviewed. Potential OA therapeutics such as histone deacetylase (HDAC) inhibitors have been identified. A number of non-coding RNAs may also provide targets for future treatments. There continues to be a year on year increase in publications researching miRs in OA (expressed as a percentage of the total publications), with a doubling over the last 4 years. An overview on the last year's progress within the fields of epigenetics and genomics with respect to OA will be given. Copyright © 2017 Osteoarthritis Research Society International. All rights reserved.
Complex multi-enhancer contacts captured by genome architecture mapping.

PubMed

Beagrie, Robert A; Scialdone, Antonio; Schueler, Markus; Kraemer, Dorothee C A; Chotalia, Mita; Xie, Sheila Q; Barbieri, Mariano; de Santiago, Inês; Lavitas, Liron-Mark; Branco, Miguel R; Fraser, James; Dostie, Josée; Game, Laurence; Dillon, Niall; Edwards, Paul A W; Nicodemi, Mario; Pombo, Ana

2017-03-23

The organization of the genome in the nucleus and the interactions of genes with their regulatory elements are key features of transcriptional control and their disruption can cause disease. Here we report a genome-wide method, genome architecture mapping (GAM), for measuring chromatin contacts and other features of three-dimensional chromatin topology on the basis of sequencing DNA from a large collection of thin nuclear sections. We apply GAM to mouse embryonic stem cells and identify enrichment for specific interactions between active genes and enhancers across very large genomic distances using a mathematical model termed SLICE (statistical inference of co-segregation). GAM also reveals an abundance of three-way contacts across the genome, especially between regions that are highly transcribed or contain super-enhancers, providing a level of insight into genome architecture that, owing to the technical limitations of current technologies, has previously remained unattainable. Furthermore, GAM highlights a role for gene-expression-specific contacts in organizing the genome in mammalian nuclei.
An Exact Algorithm to Compute the Double-Cut-and-Join Distance for Genomes with Duplicate Genes.

PubMed

Shao, Mingfu; Lin, Yu; Moret, Bernard M E

2015-05-01

Computing the edit distance between two genomes is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ) model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be computed in linear time for genomes without duplicate genes, while the problem becomes NP-hard in the presence of duplicate genes. In this article, we propose an integer linear programming (ILP) formulation to compute the DCJ distance between two genomes with duplicate genes. We also provide an efficient preprocessing approach to simplify the ILP formulation while preserving optimality. Comparison on simulated genomes demonstrates that our method outperforms MSOAR in computing the edit distance, especially when the genomes contain long duplicated segments. We also apply our method to assign orthologous gene pairs among human, mouse, and rat genomes, where once again our method outperforms MSOAR.
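
For the easy case mentioned above (no duplicate genes), the DCJ distance between two genomes made only of circular chromosomes follows the known formula d = n - c, where n is the number of genes and c the number of cycles in the adjacency graph. The sketch below implements that restricted case only; it is not the ILP formulation of the paper. The signed-integer gene encoding and the function names are assumptions made for this example.

```python
def adjacencies(genome):
    """Map each gene extremity to the extremity it touches in circular chromosomes.

    A genome is a list of chromosomes; a chromosome is a list of signed gene ids.
    Extremities are (gene, 't') for the tail and (gene, 'h') for the head.
    """
    adj = {}
    for chrom in genome:
        for i, a in enumerate(chrom):
            b = chrom[(i + 1) % len(chrom)]  # next gene, wrapping around the circle
            right_of_a = (abs(a), "h") if a > 0 else (abs(a), "t")
            left_of_b = (abs(b), "t") if b > 0 else (abs(b), "h")
            adj[right_of_a] = left_of_b
            adj[left_of_b] = right_of_a
    return adj


def dcj_distance(genome_a, genome_b):
    """DCJ distance for circular genomes with identical, duplicate-free gene content."""
    adj_a, adj_b = adjacencies(genome_a), adjacencies(genome_b)
    if set(adj_a) != set(adj_b):
        raise ValueError("genomes must contain the same genes, each exactly once")
    seen, cycles = set(), 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        while x not in seen:      # walk one cycle, alternating A-edges and B-edges
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]
    return len(adj_a) // 2 - cycles  # n genes minus c cycles


print(dcj_distance([[1, 2, 3]], [[1, -2, 3]]))        # one inversion
print(dcj_distance([[1, 2, 3, 4]], [[1, 2], [3, 4]]))  # one fission
```

Each extremity is visited once, so the running time is linear in the number of genes. With linear chromosomes the formula gains a term for odd paths, and with duplicate genes the problem becomes NP-hard, which is where the ILP approach of the paper applies.
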
Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

PubMed Central

2011-01-01

Background: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results: An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, with a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 was sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion: An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at (http://avena.pw.usda.gov/wheatD/agsnp.shtml). PMID:21266061
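
The SNP counts quoted in the abstract above can be cross-checked with a few lines. The dictionary keys are labels chosen for this example; the numbers are the three category counts and the overall total stated in the abstract.

```python
# Putative SNP counts per category, as stated in the abstract.
snp_counts = {
    "gene sequences": 195_631,
    "uncharacterized single-copy regions": 155_580,
    "repeat junctions": 145_907,
}
reported_total = 497_118  # total number of SNPs quoted in the conclusion

computed_total = sum(snp_counts.values())
print(computed_total, computed_total == reported_total)
```

The three categories add up to the reported total, so the figures in the results and conclusion are consistent with each other.
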
Protein domain analysis of genomic sequence data reveals regulation of LRR related domains in plant transpiration in Ficus.

PubMed

Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K

2014-01-01

Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.
Selection and evaluation of novel reference genes for quantitative reverse transcription PCR (qRT-PCR) based on genome and transcriptome data in Brassica napus L.

PubMed

Yang, Hongli; Liu, Jing; Huang, Shunmou; Guo, Tingting; Deng, Linbin; Hua, Wei

2014-03-15

Selection of reference genes in Brassica napus, a tetraploid (4×) species, is a very difficult task without information on genome and transcriptome. To date, only a few traditional reference genes, which show significant expression differentiation under different conditions, have been used in B. napus. In the present study, based on genome and transcriptome data of the rapeseed Zhongshuang-11 cultivar, 14 candidate reference genes were screened for investigation in different tissues, cultivars, and treated conditions of B. napus. These genes were as follows: ELF5, ENTH, F-BOX7, F-BOX2, FYPP1, GDI1, GYF, MCP2d, OTP80, PPR, SPOC, Unknown1, Unknown2 and UBA. Among them, excluding GYF and FYPP1, the other 12 genes were identified to perform better than traditional reference genes ACTIN7 and GAPDH. To further validate the accuracy of the newly developed reference genes in normalization, expression levels of BnCAT1 (B. napus catalase 1) in different rapeseed tissues and seedlings under stress conditions were normalized by the three most stable reference genes PPR, GDI1, and ENTH and little difference existed in normalization results. To the best of our knowledge, this is the first time B. napus reference genes have been provided with the help of complete genome and transcriptome information. The new reference genes provided in this study are more accurate than previously reported reference genes in quantifying expression levels of B. napus genes. Crown Copyright © 2014. Published by Elsevier B.V. All rights reserved.

Construction of a Llama Bacterial Artificial Chromosome Library with Approximately 9-Fold Genome Equivalent Coverage

PubMed Central

Airmet, K. W.; Hinckley, J. D.; Tree, L. T.; Moss, M.; Blumell, S.; Ulicny, K.; Gustafson, A. K.; Weed, M.; Theodosis, R.; Lehnardt, M.; Genho, J.; Stevens, M. R.; Kooyman, D. L.

2012-01-01

The llama is an important agricultural livestock in much of South America. The llama is increasing in popularity in the United States as a companion animal. Little work has been done to improve llama production using modern technology. A paucity of information is available regarding the llama genome. We report the construction of a llama bacterial artificial chromosome (BAC) library of about 196,224 clones in the vector pECBAC1. Using flow cytometry and bovine, human, mouse, and chicken as controls, we determined the llama genome size to be 2.4 × 10^9 bp. The average insert size of the library is 137.8 kb corresponding to approximately 9-fold genome coverage. Further studies are needed to characterize the library and llama genome in more detail. We anticipate that this new library will help facilitate future genomic studies in the llama. PMID:22811594
Preparation of rAAV9 to Overexpress or Knockdown Genes in Mouse Hearts

PubMed Central

Ding, Jian; Lin, Zhi-Qiang; Jiang, Jian-Ming; Seidman, Christine E.; Seidman, Jonathan G.; Pu, William T.; Wang, Da-Zhi

2016-01-01

Controlling the expression or activity of specific genes through the myocardial delivery of genetic materials in murine models permits the investigation of gene functions. Their therapeutic potential in the heart can also be determined. There are limited approaches for in vivo molecular intervention in the mouse heart. Recombinant adeno-associated virus (rAAV)-based genome engineering has been utilized as an essential tool for in vivo cardiac gene manipulation. The specific advantages of this technology include high efficiency, high specificity, low genomic integration rate, minimal immunogenicity, and minimal pathogenicity. Here, a detailed procedure to construct, package, and purify the rAAV9 vectors is described. Subcutaneous injection of rAAV9 into neonatal pups results in robust expression or efficient knockdown of the gene(s) of interest in the mouse heart, but not in the liver and other tissues. Using the cardiac-specific TnnT2 promoter, high expression of GFP gene in the heart was obtained. Additionally, target mRNA was inhibited in the heart when a rAAV9-U6-shRNA was utilized. Working knowledge of rAAV9 technology may be useful for cardiovascular investigations. PMID:28060283
Genomic structure, promoter identification, and chromosomal mapping of a mouse nuclear orphan receptor expressed in embryos and adult testes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, C.H.; Wei, Li-Na; Copeland, N.G.

We have isolated and characterized overlapping genomic clones containing the complete transcribed region of a newly isolated mouse cDNA encoding an orphan receptor expressed specifically in midgestation embryos and adult testis. This gene spans a distance of more than 50 kb and is organized into 13 exons. The transcription initiation site is located at the 158th nucleotide upstream from the translation initiation codon. All the exon/intron junction sequences follow the GT/AG rule. Based upon Northern blot analysis and the size of the transcribed region of the gene, its transcript was determined to be approximately 2.5 kb. Within approximately 500 bp upstream from the transcription initiation site, several immune response regulatory elements were identified but no TATA box was located. This gene was mapped to the distal region of mouse chromosome 10 and its locus has been designated Tr2-11. Immunohistochemical studies show that the Tr2-11 protein is present mainly in advanced germ cell populations of mature testes and that Tr2-11 gene expression is dramatically decreased in vitamin A-depleted animals. 23 refs., 7 figs.
Preparation of rAAV9 to Overexpress or Knockdown Genes in Mouse Hearts.

PubMed

Ding, Jian; Lin, Zhi-Qiang; Jiang, Jian-Ming; Seidman, Christine E; Seidman, Jonathan G; Pu, William T; Wang, Da-Zhi

2016-12-17

Controlling the expression or activity of specific genes through the myocardial delivery of genetic materials in murine models permits the investigation of gene functions. Their therapeutic potential in the heart can also be determined. There are limited approaches for in vivo molecular intervention in the mouse heart. Recombinant adeno-associated virus (rAAV)-based genome engineering has been utilized as an essential tool for in vivo cardiac gene manipulation. The specific advantages of this technology include high efficiency, high specificity, low genomic integration rate, minimal immunogenicity, and minimal pathogenicity. Here, a detailed procedure to construct, package, and purify the rAAV9 vectors is described. Subcutaneous injection of rAAV9 into neonatal pups results in robust expression or efficient knockdown of the gene(s) of interest in the mouse heart, but not in the liver and other tissues. Using the cardiac-specific TnnT2 promoter, high expression of GFP gene in the heart was obtained. Additionally, target mRNA was inhibited in the heart when a rAAV9-U6-shRNA was utilized. Working knowledge of rAAV9 technology may be useful for cardiovascular investigations.
Identification of Candidate B-Lymphoma Genes by Cross-Species Gene Expression Profiling

PubMed Central

Tompkins, Van S.; Han, Seong-Su; Olivier, Alicia; Syrbu, Sergei; Bair, Thomas; Button, Anna; Jacobus, Laura; Wang, Zebin; Lifton, Samuel; Raychaudhuri, Pradip; Morse, Herbert C.; Weiner, George; Link, Brian; Smith, Brian J.; Janz, Siegfried

2013-01-01

Comparative genome-wide expression profiling of malignant tumor counterparts across the human-mouse species barrier has a successful track record as a gene discovery tool in liver, breast, lung, prostate and other cancers, but has been largely neglected in studies on neoplasms of mature B-lymphocytes such as diffuse large B cell lymphoma (DLBCL) and Burkitt lymphoma (BL). We used global gene expression profiles of DLBCL-like tumors that arose spontaneously in Myc-transgenic C57BL/6 mice as a phylogenetically conserved filter for analyzing the human DLBCL transcriptome. The human and mouse lymphomas were found to have 60 concordantly deregulated genes in common, including 8 genes that Cox hazard regression analysis associated with overall survival in a published landmark dataset of DLBCL. Genetic network analysis of the 60 genes followed by biological validation studies indicate FOXM1 as a candidate DLBCL and BL gene, supporting a number of studies contending that FOXM1 is a therapeutic target in mature B cell tumors. Our findings demonstrate the value of the “mouse filter” for genomic studies of human B-lineage neoplasms for which a vast knowledge base already exists. PMID:24130802
A reporter model to visualize imprinting stability at the Dlk1 locus during mouse development and in pluripotent cells.

PubMed

Swanzey, Emily; Stadtfeld, Matthias

2016-11-15

Genomic imprinting results in the monoallelic expression of genes that encode important regulators of growth and proliferation. Dysregulation of imprinted genes, such as those within the Dlk1-Dio3 locus, is associated with developmental syndromes and specific diseases. Our ability to interrogate causes of imprinting instability has been hindered by the absence of suitable model systems. Here, we describe a Dlk1 knock-in reporter mouse that enables single-cell visualization of allele-specific expression and prospective isolation of cells, simultaneously. We show that this 'imprinting reporter mouse' can be used to detect tissue-specific Dlk1 expression patterns in developing embryos. We also apply this system to pluripotent cell culture and demonstrate that it faithfully indicates DNA methylation changes induced upon cellular reprogramming. Finally, the reporter system reveals the role of elevated oxygen levels in eroding imprinted Dlk1 expression during prolonged culture and in vitro differentiation. The possibility to study allele-specific expression in different contexts makes our reporter system a useful tool to dissect the regulation of genomic imprinting in normal development and disease. © 2016. Published by The Company of Biologists Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Saccone, Scott F; Chesler, Elissa J; Bierut, Laura J

Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.
Combining cow and bull reference populations to increase accuracy of genomic prediction and genome-wide association studies.

PubMed

Calus, M P L; de Haas, Y; Veerkamp, R F

2013-10-01

Genomic selection holds the promise to be particularly beneficial for traits that are difficult or expensive to measure, such that access to phenotypes on large daughter groups of bulls is limited. Instead, cow reference populations can be generated, potentially supplemented with existing information from the same or (highly) correlated traits available on bull reference populations. The objective of this study, therefore, was to develop a model to perform genomic predictions and genome-wide association studies based on a combined cow and bull reference data set, with the accuracy of the phenotypes differing between the cow and bull genomic selection reference populations. The developed bivariate Bayesian stochastic search variable selection model allowed for an unbalanced design by imputing residuals in the residual updating scheme for all missing records. The performance of this model is demonstrated on a real data example, where the analyzed trait, being milk fat or protein yield, was either measured only on a cow or a bull reference population, or recorded on both. Our results were that the developed bivariate Bayesian stochastic search variable selection model was able to analyze 2 traits, even though animals had measurements on only 1 of 2 traits. The Bayesian stochastic search variable selection model yielded consistently higher accuracy for fat yield compared with a model without variable selection, both for the univariate and bivariate analyses, whereas the accuracy of both models was very similar for protein yield. The bivariate model identified several additional quantitative trait loci peaks compared with the single-trait models on either trait. In addition, the bivariate models showed a marginal increase in accuracy of genomic predictions for the cow traits (0.01-0.05), although a greater increase in accuracy is expected as the size of the bull population increases. 
Our results emphasize that the chosen value of priors in Bayesian genomic prediction models are especially important in small data sets. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Ensembl Genomes 2016: more genomes, more complexity.

PubMed

Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

2016-01-04

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ensembl Genomes 2016: more genomes, more complexity

PubMed Central

Kersey, Paul Julian; Allen, James E.; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J.; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J.; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K.; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D.; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M.; Howe, Kevin L.; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M.

2016-01-01

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. PMID:26578574
Choice of Reference Sequence and Assembler for Alignment of Listeria monocytogenes Short-Read Sequence Data Greatly Influences Rates of Error in SNP Analyses

PubMed Central

Pightling, Arthur W.; Petronella, Nicholas; Pagotto, Franco

2014-01-01

The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices researchers make regarding the generation and analysis of WGS data can greatly influence the accuracy of short-read sequence alignments and, therefore, the efficacy of such experiments. We studied the effects of some of these choices, including: i) depth of sequencing coverage, ii) choice of reference-guided short-read sequence assembler, iii) choice of reference genome, and iv) whether to perform read-quality filtering and trimming, on our ability to detect true SNPs and on the frequencies of errors. We performed benchmarking experiments, during which we assembled simulated and real Listeria monocytogenes strain 08-5578 short-read sequence datasets of varying quality with four commonly used assemblers (BWA, MOSAIK, Novoalign, and SMALT), using reference genomes of varying genetic distances, and with or without read pre-processing (i.e., quality filtering and trimming). We found that assemblies of at least 50-fold coverage provided the most accurate results. In addition, MOSAIK yielded the fewest errors when reads were aligned to a nearly identical reference genome, while using SMALT to align reads against a reference sequence that is ∼0.82% distant from 08-5578 at the nucleotide level resulted in the detection of the greatest numbers of true SNPs and the fewest errors. Finally, we show that whether read pre-processing improves SNP detection depends upon the choice of reference sequence and assembler. 
In total, this study demonstrates that researchers should test a variety of conditions to achieve optimal results. PMID:25144537
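The error categories discussed in the preceding abstract (true SNPs detected, false positive SNPs, missed sites) can be illustrated with a minimal Python sketch that scores a set of called SNPs against a known truth set. The function name, the (position, base) representation, and the toy data are all assumptions made for illustration; they are not taken from the study or from any of the assemblers it names.

```python
def snp_call_metrics(true_snps, called_snps):
    """Score called SNPs against a truth set.

    Both arguments are iterables of (position, alternate_base) tuples.
    A call counts as correct only if position and base both match.
    """
    truth = set(true_snps)
    calls = set(called_snps)

    true_positives = len(truth & calls)    # real SNPs that were called
    false_positives = len(calls - truth)   # calls with no matching real SNP
    false_negatives = len(truth - calls)   # real SNPs that were missed

    # Guard against empty sets to avoid division by zero.
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(calls) if calls else 0.0

    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "sensitivity": sensitivity,
        "precision": precision,
    }


# Hypothetical toy data: four simulated SNPs and four calls from one pipeline run.
truth = [(100, "A"), (250, "T"), (400, "G"), (900, "C")]
calls = [(100, "A"), (250, "T"), (400, "C"), (1200, "G")]

metrics = snp_call_metrics(truth, calls)
print(metrics)
```

Running the same scoring over each combination of coverage depth, assembler, reference genome, and pre-processing setting is one simple way to tabulate the kind of comparison the abstract describes.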
The open reading frames in the 3' long terminal repeats of several mouse mammary tumor virus integrants encode V beta 3-specific superantigens

PubMed Central

1992-01-01

Mice expressing the minor lymphocyte stimulation antigens, Mls-1a, -2a, or -3a, singly on the B10.BR background have been generated. Mls phenotypes correlate with the integration of mouse mammary tumor viruses (MTV) in the mouse genome. The open reading frames within the 3' long terminal repeats of the integrated MTVs 1, 3, 6, and 13 encode V beta 3-specific superantigens. Sequence data for these viral superantigens is presented, indicating that it is the COOH-terminal portion of the viral superantigen that interacts with the T cell receptor V beta element. PMID:1309854
Genome-wide mapping in a house mouse hybrid zone reveals hybrid sterility loci and Dobzhansky-Muller interactions

PubMed Central

Turner, Leslie M; Harr, Bettina

2014-01-01

Mapping hybrid defects in contact zones between incipient species can identify genomic regions contributing to reproductive isolation and reveal genetic mechanisms of speciation. The house mouse features a rare combination of sophisticated genetic tools and natural hybrid zones between subspecies. Male hybrids often show reduced fertility, a common reproductive barrier between incipient species. Laboratory crosses have identified sterility loci, but each encompasses hundreds of genes. We map genetic determinants of testis weight and testis gene expression using offspring of mice captured in a hybrid zone between M. musculus musculus and M. m. domesticus. Many generations of admixture enables high-resolution mapping of loci contributing to these sterility-related phenotypes. We identify complex interactions among sterility loci, suggesting multiple, non-independent genetic incompatibilities contribute to barriers to gene flow in the hybrid zone. DOI: http://dx.doi.org/10.7554/eLife.02504.001 PMID:25487987
A genome-scale map of expression for a mouse brain section obtained using voxelation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chin, Mark H.; Geng, Alex B.; Khan, Arshad H.

Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological diseases. We have reconstructed 2-dimensional images of gene expression for 20,000 genes in a coronal slice of the mouse brain at the level of the striatum by using microarrays in combination with voxelation at a resolution of 1 mm³. Good reliability of the microarray results was confirmed using multiple replicates, subsequent quantitative RT-PCR voxelation, mass spectrometry voxelation and publicly available in situ hybridization data. Known and novel genes were identified with expression patterns localized to defined substructures within the brain. In addition, genes with unexpected patterns were identified and cluster analysis identified a set of genes with a gradient of dorsal/ventral expression not restricted to known anatomical boundaries. The genome-scale maps of gene expression obtained using voxelation will be a valuable tool for the neuroscience community.
GenExp: an interactive web-based genomic DAS client with client-side data rendering.

PubMed

Gel Moreno, Bernat; Messeguer Peypoch, Xavier

2011-01-01

The Distributed Annotation System (DAS) offers a standard protocol for sharing and integrating annotations on biological sequences. There are more than 1000 DAS sources available and the number is steadily increasing. Clients are an essential part of the DAS system and integrate data from several independent sources in order to create a useful representation to the user. While web-based DAS clients exist, most of them do not have direct interaction capabilities such as dragging and zooming with the mouse. Here we present GenExp, a web-based and fully interactive visual DAS client. GenExp is a genome-oriented DAS client capable of creating informative representations of genomic data zooming out from base level to complete chromosomes. It proposes a novel approach to genomic data rendering and uses the latest HTML5 web technologies to create the data representation inside the client browser. Thanks to client-side rendering, most position changes do not need a network request to the server, and so responses to zooming and panning are almost immediate. In GenExp it is possible to explore the genome intuitively, moving it with the mouse just like geographical map applications. Additionally, in GenExp it is possible to have more than one data viewer at the same time and to save the current state of the application to revisit it later on. GenExp is a new interactive web-based client for DAS and addresses some of the shortcomings of the existing clients. It uses client-side data rendering techniques resulting in easier genome browsing and exploration. GenExp is open source under the GPL license and it is freely available at http://gralggen.lsi.upc.edu/recerca/genexp.
Neil2-null Mice Accumulate Oxidized DNA Bases in the Transcriptionally Active Sequences of the Genome and Are Susceptible to Innate Inflammation

PubMed Central

Chakraborty, Anirban; Wakamiya, Maki; Venkova-Canova, Tatiana; Pandita, Raj K.; Aguilera-Aguirre, Leopoldo; Sarker, Altaf H.; Singh, Dharmendra Kumar; Hosoki, Koa; Wood, Thomas G.; Sharma, Gulshan; Cardenas, Victor; Sarkar, Partha S.; Sur, Sanjiv; Pandita, Tej K.; Boldogh, Istvan; Hazra, Tapas K.

2015-01-01

Why mammalian cells possess multiple DNA glycosylases (DGs) with overlapping substrate ranges for repairing oxidatively damaged bases via the base excision repair (BER) pathway is a long-standing question. To determine the biological role of these DGs, null animal models have been generated. Here, we report the generation and characterization of mice lacking Neil2 (Nei-like 2). As in mice deficient in each of the other four oxidized base-specific DGs (OGG1, NTH1, NEIL1, and NEIL3), Neil2-null mice show no overt phenotype. However, middle-aged to old Neil2-null mice show the accumulation of oxidative genomic damage, mostly in the transcribed regions. Immuno-pulldown analysis from wild-type (WT) mouse tissue showed the association of NEIL2 with RNA polymerase II, along with Cockayne syndrome group B protein, TFIIH, and other BER proteins. Chromatin immunoprecipitation analysis from mouse tissue showed co-occupancy of NEIL2 and RNA polymerase II only on the transcribed genes, consistent with our earlier in vitro findings on NEIL2's role in transcription-coupled BER. This study provides the first in vivo evidence of genomic region-specific repair in mammals. Furthermore, telomere loss and genomic instability were observed at a higher frequency in embryonic fibroblasts from Neil2-null mice than from the WT. Moreover, Neil2-null mice are much more responsive to inflammatory agents than WT mice. Taken together, our results underscore the importance of NEIL2 in protecting mammals from the development of various pathologies that are linked to genomic instability and/or inflammation. NEIL2 is thus likely to play an important role in long term genomic maintenance, particularly in long-lived mammals such as humans. PMID:26245904
MutSpec: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse cancer genomes.

PubMed

Ardin, Maude; Cahais, Vincent; Castells, Xavier; Bouaoun, Liacine; Byrnes, Graham; Herceg, Zdenko; Zavadil, Jiri; Olivier, Magali

2016-04-18

The nature of somatic mutations observed in human tumors at single gene or genome-wide levels can reveal information on past carcinogenic exposures and mutational processes contributing to tumor development. While large amounts of sequencing data are being generated, the associated analysis and interpretation of mutation patterns that may reveal clues about the natural history of cancer present complex and challenging tasks that require advanced bioinformatics skills. To make such analyses accessible to a wider community of researchers with no programming expertise, we have developed within the web-based user-friendly platform Galaxy a first-of-its-kind package called MutSpec. MutSpec includes a set of tools that perform variant annotation and use advanced statistics for the identification of mutation signatures present in cancer genomes and for comparing the obtained signatures with those published in the COSMIC database and other sources. MutSpec offers an accessible framework for building reproducible analysis pipelines, integrating existing methods and scripts developed in-house with publicly available R packages. MutSpec may be used to analyse data from whole-exome, whole-genome or targeted sequencing experiments performed on human or mouse genomes. Results are provided in various formats including rich graphical outputs. An example is presented to illustrate the package functionalities, the straightforward workflow analysis and the richness of the statistics and publication-grade graphics produced by the tool. MutSpec offers an easy-to-use graphical interface embedded in the popular Galaxy platform that can be used by researchers with limited programming or bioinformatics expertise to analyse mutation signatures present in cancer genomes. MutSpec can thus effectively assist in the discovery of complex mutational processes resulting from exogenous and endogenous carcinogenic insults.
GenExp: An Interactive Web-Based Genomic DAS Client with Client-Side Data Rendering

PubMed Central

Gel Moreno, Bernat; Messeguer Peypoch, Xavier

2011-01-01

Background The Distributed Annotation System (DAS) offers a standard protocol for sharing and integrating annotations on biological sequences. There are more than 1000 DAS sources available and the number is steadily increasing. Clients are an essential part of the DAS system and integrate data from several independent sources in order to create a useful representation to the user. While web-based DAS clients exist, most of them do not have direct interaction capabilities such as dragging and zooming with the mouse. Results Here we present GenExp, a web-based and fully interactive visual DAS client. GenExp is a genome-oriented DAS client capable of creating informative representations of genomic data zooming out from base level to complete chromosomes. It proposes a novel approach to genomic data rendering and uses the latest HTML5 web technologies to create the data representation inside the client browser. Thanks to client-side rendering, most position changes do not need a network request to the server, and so responses to zooming and panning are almost immediate. In GenExp it is possible to explore the genome intuitively, moving it with the mouse just like geographical map applications. Additionally, in GenExp it is possible to have more than one data viewer at the same time and to save the current state of the application to revisit it later on. Conclusions GenExp is a new interactive web-based client for DAS and addresses some of the shortcomings of the existing clients. It uses client-side data rendering techniques resulting in easier genome browsing and exploration. GenExp is open source under the GPL license and it is freely available at http://gralggen.lsi.upc.edu/recerca/genexp. PMID:21750706
Retrotransposon expression and incorporation of cloned human and mouse retroelements in human spermatozoa.

PubMed

Lazaros, Leandros; Kitsou, Chrysoula; Kostoulas, Charilaos; Bellou, Sofia; Hatzi, Elissavet; Ladias, Paris; Stefos, Theodoros; Markoula, Sofia; Galani, Vasiliki; Vartholomatos, Georgios; Tzavaras, Theodore; Georgiou, Ioannis

2017-03-01

To investigate the expression of long interspersed element (LINE) 1, human endogenous retrovirus (HERV) K10, and short interspersed element-VNTR-Alu element (SVA) retrotransposons in ejaculated human spermatozoa by means of reverse-transcription (RT) polymerase chain reaction (PCR) analysis as well as the potential incorporation of cloned human and mouse active retroelements in human sperm cell genome. Laboratory study. University research laboratories and academic hospital. Normozoospermic and oligozoospermic white men. RT-PCR analysis was performed to confirm the retrotransposon expression in human spermatozoa. Exogenous retroelements were tagged with a plasmid containing a green fluorescence (EGFP) retrotransposition cassette, and the de novo retrotransposition events were tested with the use of PCR, fluorescence-activated cell sorting analysis, and confocal microscopy. Retroelement expression in human spermatozoa, incorporation of cloned human and mouse active retroelements in human sperm genome, and de novo retrotransposition events in human spermatozoa. RT-PCR products of expressed human LINE-1, HERV-K10, and SVA retrotransposons were observed in ejaculated human sperm samples. The incubation of human spermatozoa with either retrotransposition-active human LINE-1 and HERV-K10 or mouse reverse transcriptase-deficient VL30 retrotransposons tagged with an EGFP-based retrotransposition cassette led to EGFP-positive spermatozoa; 16.67% of the samples were positive for retrotransposition. The respective retrotransposition frequencies for the LINE-1, HERV-K10, and VL30 retrotransposons in the positive samples were 0.34 ± 0.13%, 0.37 ± 0.17%, and 0.30 ± 0.14% per sample of 10,000 spermatozoa.
Our results show that: 1) LINE-1, HERV-K10, and SVA retrotransposons are transcriptionally expressed in human spermatozoa; 2) cloned active retroelements of human and mammalian origin can be incorporated in human sperm genome; 3) active reverse transcriptases exist in human spermatozoa; and 4) de novo retrotransposition events occur in human spermatozoa. Copyright © 2017 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
Development of forensic-quality full mtGenome haplotypes: success rates with low template specimens.

PubMed

Just, Rebecca S; Scheible, Melissa K; Fast, Spence A; Sturk-Andreaggi, Kimberly; Higginbotham, Jennifer L; Lyons, Elizabeth A; Bush, Jocelyn M; Peck, Michelle A; Ring, Joseph D; Diegoli, Toni M; Röck, Alexander W; Huber, Gabriela E; Nagl, Simone; Strobl, Christina; Zimmermann, Bettina; Parson, Walther; Irwin, Jodi A

2014-05-01

Forensic mitochondrial DNA (mtDNA) testing requires appropriate, high quality reference population data for estimating the rarity of questioned haplotypes and, in turn, the strength of the mtDNA evidence. Available reference databases (SWGDAM, EMPOP) currently include information from the mtDNA control region; however, novel methods that quickly and easily recover mtDNA coding region data are becoming increasingly available. Though these assays promise to both facilitate the acquisition of mitochondrial genome (mtGenome) data and maximize the general utility of mtDNA testing in forensics, the appropriate reference data and database tools required for their routine application in forensic casework are lacking. To address this deficiency, we have undertaken an effort to: (1) increase the large-scale availability of high-quality entire mtGenome reference population data, and (2) improve the information technology infrastructure required to access/search mtGenome data and employ them in forensic casework. Here, we describe the application of a data generation and analysis workflow to the development of more than 400 complete, forensic-quality mtGenomes from low DNA quantity blood serum specimens as part of a U.S. National Institute of Justice funded reference population databasing initiative. We discuss the minor modifications made to a published mtGenome Sanger sequencing protocol to maintain a high rate of throughput while minimizing manual reprocessing with these low template samples. The successful use of this semi-automated strategy on forensic-like samples provides practical insight into the feasibility of producing complete mtGenome data in a routine casework environment, and demonstrates that large (>2kb) mtDNA fragments can regularly be recovered from high quality but very low DNA quantity specimens. 
Further, the detailed empirical data we provide on the amplification success rates across a range of DNA input quantities will be useful moving forward as PCR-based strategies for mtDNA enrichment are considered for targeted next-generation sequencing workflows. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.

Novel Insights into Tree Biology and Genome Evolution as Revealed Through Genomics.

PubMed

Neale, David B; Martínez-García, Pedro J; De La Torre, Amanda R; Montanari, Sara; Wei, Xiao-Xin

2017-04-28

Reference genome sequences are the key to the discovery of genes and gene families that determine traits of interest. Recent progress in sequencing technologies has enabled a rapid increase in genome sequencing of tree species, allowing the dissection of complex characters of economic importance, such as fruit and wood quality and resistance to biotic and abiotic stresses. Although the number of reference genome sequences for trees lags behind those for other plant species, it is not too early to gain insight into the unique features that distinguish trees from nontree plants. Our review of the published data suggests that, although many gene families are conserved among herbaceous and tree species, some gene families, such as those involved in resistance to biotic and abiotic stresses and in the synthesis and transport of sugars, are often expanded in tree genomes. As the genomes of more tree species are sequenced, comparative genomics will further elucidate the complexity of tree genomes and how this relates to traits unique to trees.
The Arab genome: Health and wealth.

PubMed

Zayed, Hatem

2016-11-05

The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to the prevalent endogamous and consanguineous marriage culture and the long history of admixture among different ethnic subcultures descended from the Asian, European, and African continents. Human genome sequencing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying disease predictions and diagnosis. Despite the importance of the Arab genome for better understanding the dynamics of the human genome, discovering rare genetic variations, and studying early human migration out of Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project. In this review, I demonstrate the significance of sequencing the Arab genome and setting an Arab genome reference(s) for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare variants, and identifying a meaningful genotype-phenotype correlation for complex diseases. Copyright © 2016. Published by Elsevier B.V.
Draft Genome Sequence of the Serratia rubidaea CIP 103234T Reference Strain, a Human-Opportunistic Pathogen.

PubMed

Bonnin, Rémy A; Girlich, Delphine; Imanci, Dilek; Dortet, Laurent; Naas, Thierry

2015-11-19

We provide here the first genome sequence of a Serratia rubidaea isolate, a human-opportunistic pathogen. This reference sequence will permit a comparison of this species with others of the Serratia genus. Copyright © 2015 Bonnin et al.
Effect of cow reference group on validation reliability of genomic evaluation.

PubMed

Koivula, M; Strandén, I; Aamand, G P; Mäntysaari, E A

2016-06-01

We studied the effect of including genomic data for cows in the reference population of single-step evaluations. Deregressed individual cow genetic evaluations (DRP) from milk production evaluations of Nordic Red Dairy cattle were used to estimate the single-step breeding values. Validation reliability and bias of the evaluations were calculated with four data sets including different amounts of DRP record information from genotyped cows in the reference population. The gain in reliability ranged from 2 to 4 percentage units for the production traits, depending on the DRP data used and the amount of genomic data. Moreover, inclusion of genotyped bull dams and their genotyped daughters seemed to create some bias in the single-step evaluation. Still, genotyping cows and their inclusion in the reference population is advantageous and should be encouraged.
Saccharomyces cerevisiae: gene annotation and genome variability, state of the art through comparative genomics.

PubMed

Louis, Ed

2011-01-01

In the early days of the yeast genome sequencing project, gene annotation was in its infancy and suffered the problem of many false positive annotations as well as missed genes. The lack of other sequences for comparison also prevented the annotation of conserved, functional sequences that were not coding. We are now in an era of comparative genomics where many closely related as well as more distantly related genomes are available for direct sequence and synteny comparisons allowing for more probable predictions of genes and other functional sequences due to conservation. We also have a plethora of functional genomics data which helps inform gene annotation for previously uncharacterised open reading frames (ORFs)/genes. For Saccharomyces cerevisiae this has resulted in a continuous updating of the gene and functional sequence annotations in the reference genome helping it retain its position as the best characterized eukaryotic organism's genome. A single reference genome for a species does not accurately describe the species and this is quite clear in the case of S. cerevisiae where the reference strain is not ideal for brewing or baking due to missing genes. Recent surveys of numerous isolates, from a variety of sources, using a variety of technologies have revealed a great deal of variation amongst isolates with genome sequence surveys providing information on novel genes, undetectable by other means. We now have a better understanding of the extant variation in S. cerevisiae as a species as well as some idea of how much we are missing from this understanding. As with gene annotation, comparative genomics enhances the discovery and description of genome variation and is providing us with the tools for understanding genome evolution, adaptation and selection, and underlying genetics of complex traits.
Technical note: Equivalent genomic models with a residual polygenic effect.

PubMed

Liu, Z; Goddard, M E; Hayes, B J; Reinhardt, F; Reents, R

2016-03-01

Routine genomic evaluations in animal breeding are usually based on either a BLUP with genomic relationship matrix (GBLUP) or single nucleotide polymorphism (SNP) BLUP model. For a multi-step genomic evaluation, these 2 alternative genomic models were proven to give equivalent predictions for genomic reference animals. The model equivalence was verified also for young genotyped animals without phenotypes. Due to incomplete linkage disequilibrium of SNP markers to genes or causal mutations responsible for genetic inheritance of quantitative traits, SNP markers cannot explain all the genetic variance. A residual polygenic effect is normally fitted in the genomic model to account for the incomplete linkage disequilibrium. In this study, we start by showing the proof that the multi-step GBLUP and SNP BLUP models are equivalent for the reference animals, when they have a residual polygenic effect included. Second, the equivalence of both multi-step genomic models with a residual polygenic effect was also verified for young genotyped animals without phenotypes. Additionally, we derived formulas to convert genomic estimated breeding values of the GBLUP model to its components, direct genomic values and residual polygenic effect. Third, we made a proof that the equivalence of these 2 genomic models with a residual polygenic effect holds also for single-step genomic evaluation. Both the single-step GBLUP and SNP BLUP models lead to equal prediction for genotyped animals with phenotypes (e.g., reference animals), as well as for (young) genotyped animals without phenotypes. Finally, these 2 single-step genomic models with a residual polygenic effect were proven to be equivalent for estimation of SNP effects, too. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
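The conversion of a genomic estimated breeding value into its direct genomic and residual polygenic components can be sketched as follows. This is a standard conditional-expectation argument written under stated assumptions, not a reproduction of the authors' derivation: the symbols w (polygenic fraction of the genetic variance), G (genomic relationship matrix), A (pedigree relationship matrix) and G_w are notation introduced here, and g and p are assumed independent.

```latex
% Sketch only: w, G, A and G_w are assumed notation, not taken from the abstract.
\begin{align}
\mathbf{u} &= \mathbf{g} + \mathbf{p}, \qquad
\operatorname{Var}(\mathbf{g}) = (1-w)\,\sigma_u^2\,\mathbf{G}, \qquad
\operatorname{Var}(\mathbf{p}) = w\,\sigma_u^2\,\mathbf{A}, \\
\operatorname{Var}(\mathbf{u}) &= \sigma_u^2\,\mathbf{G}_w, \qquad
\mathbf{G}_w = (1-w)\,\mathbf{G} + w\,\mathbf{A}, \\
\hat{\mathbf{g}} &= (1-w)\,\mathbf{G}\,\mathbf{G}_w^{-1}\,\hat{\mathbf{u}}, \qquad
\hat{\mathbf{p}} = w\,\mathbf{A}\,\mathbf{G}_w^{-1}\,\hat{\mathbf{u}}, \qquad
\hat{\mathbf{g}} + \hat{\mathbf{p}} = \hat{\mathbf{u}}.
\end{align}
```

The last identity follows because the two coefficient matrices sum to G_w G_w^{-1}, the identity matrix, so the components always add back to the genomic estimated breeding value.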
COMPARISON OF COMPARATIVE GENOMIC HYBRIDIZATIONS TECHNOLOGIES ACROSS MICROARRAY PLATFORMS

EPA Science Inventory

Comparative Genomic Hybridization (CGH) measures DNA copy number differences between a reference genome and a test genome. The DNA samples are differentially labeled and hybridized to an immobilized substrate. In early CGH experiments, the DNA targets were hybridized to metaphase...
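A copy number difference between the test and reference genomes is usually summarized per probe as a log2 ratio of the two signal intensities. The following minimal sketch assumes already normalized, strictly positive intensities; the function names and the 0.3 calling threshold are illustrative assumptions, not values from the record above.

```python
import math


def log2_ratios(test, reference):
    """Per-probe log2(test / reference) for two equally long intensity lists."""
    if len(test) != len(reference):
        raise ValueError("test and reference must cover the same probes")
    return [math.log2(t / r) for t, r in zip(test, reference)]


def call_copy_number(ratio, threshold=0.3):
    """Classify one log2 ratio; the threshold is an illustrative assumption."""
    if ratio >= threshold:
        return "gain"
    if ratio <= -threshold:
        return "loss"
    return "neutral"


test_signal = [200.0, 100.0, 50.0]
reference_signal = [100.0, 100.0, 100.0]
ratios = log2_ratios(test_signal, reference_signal)
calls = [call_copy_number(r) for r in ratios]
```

A ratio of 0 means equal signal in both samples, positive values suggest more copies in the test genome, and negative values suggest fewer.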
Genetic characterization and improved genotyping of the dysferlin-deficient mouse strain Dysf (tm1Kcam).

PubMed

Wiktorowicz, Tatiana; Kinter, Jochen; Kobuke, Kazuhiro; Campbell, Kevin P; Sinnreich, Michael

2015-01-01

Mouse models of dysferlinopathies are valuable tools with which to investigate the pathomechanisms underlying these diseases and to test novel therapeutic strategies. One such mouse model is the Dysf (tm1Kcam) strain, which was generated using a targeting vector to replace a 12-kb region of the dysferlin gene and which features a progressive muscular dystrophy. A prerequisite for successful animal studies using genetic mouse models is an accurate genotyping protocol. Unfortunately, the lack of robustness of currently available genotyping protocols for the Dysf (tm1Kcam) mouse has prevented efficient colony management. Initial attempts to improve the genotyping protocol based on the published genomic structure failed. These difficulties led us to analyze the targeted locus of the dysferlin gene of the Dysf (tm1Kcam) mouse in greater detail. In this study we resequenced and analyzed the targeted locus of the Dysf (tm1Kcam) mouse and developed a novel PCR protocol for genotyping. We found that instead of a deletion, the dysferlin locus in the Dysf (tm1Kcam) mouse carries a targeted insertion. This genetic characterization enabled us to establish a reliable method for genotyping of the Dysf (tm1Kcam) mouse, and thus has made efficient colony management possible. Our work will make the Dysf (tm1Kcam) mouse model more attractive for animal studies of dysferlinopathies.
Histological and reference system for the analysis of mouse intervertebral disc.

PubMed

Tam, Vivian; Chan, Wilson C W; Leung, Victor Y L; Cheah, Kathryn S E; Cheung, Kenneth M C; Sakai, Daisuke; McCann, Matthew R; Bedore, Jake; Séguin, Cheryle A; Chan, Danny

2018-01-01

A new scoring system based on histo-morphology of mouse intervertebral disc (IVD) was established to assess changes in different mouse models of IVD degeneration and repair. IVDs from mouse strains of different ages, transgenic mice, or models of artificially induced IVD degeneration were assessed. Morphological features consistently observed in normal, and early/later stages of degeneration were categorized into a scoring system focused on nucleus pulposus (NP) and annulus fibrosus (AF) changes. "Normal NP" exhibited a highly cellularized cell mass that decreased with natural ageing and in disc degeneration. "Normal AF" consisted of distinct concentric lamellar structures, which were disrupted in severe degeneration. NP/AF clefts indicated more severe changes. Consistent scores were obtained between experienced and new users. Altogether, our scoring system effectively differentiated IVD changes in various strains of wild-type and genetically modified mice and in induced models of IVD degeneration, and is applicable from the post-natal stage to the aged mouse. This scoring tool and reference resource addresses a pressing need in the field for studying IVD changes and cross-study comparisons in mice, and facilitates a means to normalize mouse IVD assessment between different laboratories. © 2017 Orthopaedic Research Society. Published by Wiley Periodicals, Inc. J Orthop Res 36:233-243, 2018.
Genome-wide alteration of 5-hydroxymethylcytosine in a mouse model of fragile X-associated tremor/ataxia syndrome.

PubMed

Yao, Bing; Lin, Li; Street, R Craig; Zalewski, Zachary A; Galloway, Jocelyn N; Wu, Hao; Nelson, David L; Jin, Peng

2014-02-15

Fragile X-associated tremor/ataxia syndrome (FXTAS) is a late-onset neurodegenerative disorder in which patients carry premutation alleles of 55-200 CGG repeats in the FMR1 gene. To date, whether alterations in epigenetic regulation modulate FXTAS has gone unexplored. 5-Hydroxymethylcytosine (5hmC) converted from 5-methylcytosine (5mC) by the ten-eleven translocation (TET) family of proteins has been found recently to play key roles in neuronal functions. Here, we undertook genome-wide profiling of cerebellar 5hmC in a FXTAS mouse model (rCGG mice) and found that rCGG mice at 16 weeks showed overall reduced 5hmC levels genome-wide compared with age-matched wild-type littermates. However, we also observed gain-of-5hmC regions in repetitive elements, as well as in cerebellum-specific enhancers, but not in general enhancers. Genomic annotation and motif prediction of wild-type- and rCGG-specific differential 5-hydroxymethylated regions (DhMRs) revealed their high correlation with genes and transcription factors that are important in neuronal developmental and functional pathways. DhMR-associated genes partially overlapped with genes that were differentially associated with ribosomes in CGG mice identified by bacTRAP ribosomal profiling. Taken together, our data strongly indicate a functional role for 5hmC-mediated epigenetic modulation in the etiology of FXTAS, possibly through the regulation of transcription.
A non-inheritable maternal Cas9-based multiple-gene editing system in mice.

PubMed

Sakurai, Takayuki; Kamiyoshi, Akiko; Kawate, Hisaka; Mori, Chie; Watanabe, Satoshi; Tanaka, Megumu; Uetake, Ryuichi; Sato, Masahiro; Shindo, Takayuki

2016-01-28

The CRISPR/Cas9 system is capable of editing multiple genes through one-step zygote injection. The preexisting method is largely based on the co-injection of Cas9 DNA (or mRNA) and guide RNAs (gRNAs); however, it is unclear how many genes can be simultaneously edited by this method, and a reliable means to generate transgenic (Tg) animals with multiple gene editing has yet to be developed. Here, we employed non-inheritable maternal Cas9 (maCas9) protein derived from Tg mice with systemic Cas9 overexpression (Cas9 mice). The maCas9 protein in zygotes derived from mating or in vitro fertilization of Tg/+ oocytes and +/+ sperm could successfully edit the target genome. The efficiency of such maCas9-based genome editing was comparable to that of zygote microinjection-based genome editing widely used at present. Furthermore, we demonstrated a novel approach to create "Cas9 transgene-free" gene-modified mice using non-Tg (+/+) zygotes carrying maCas9. The maCas9 protein in mouse zygotes edited nine target loci simultaneously after injection with nine different gRNAs alone. Cas9 mouse-derived zygotes have the potential to facilitate the creation of genetically modified animals carrying the Cas9 transgene, enabling repeatable genome engineering and the production of Cas9 transgene-free mice.
Quantitative LC-MS Provides No Evidence for m6dA or m4dC in the Genome of Mouse Embryonic Stem Cells and Tissues.

PubMed

Schiffers, Sarah; Ebert, Charlotte; Rahimoff, René; Kosmatchev, Olesea; Steinbacher, Jessica; Bohne, Alexandra-Viola; Spada, Fabio; Michalakis, Stylianos; Nickelsen, Jörg; Müller, Markus; Carell, Thomas

2017-09-04

Until recently, it was believed that the genomes of higher organisms contain, in addition to the four canonical DNA bases, only 5-methyl-dC (m5dC) as a modified base to control epigenetic processes. In recent years, this view has changed dramatically with the discovery of 5-hydroxymethyl-dC (hmdC), 5-formyl-dC (fdC), and 5-carboxy-dC (cadC) in DNA from stem cells and brain tissue. N6-methyldeoxyadenosine (m6dA) is the most recent base reported to be present in the genome of various eukaryotic organisms. This base, together with N4-methyldeoxycytidine (m4dC), was first reported to be a component of bacterial genomes. In this work, we investigated the levels and distribution of these potentially epigenetically relevant DNA bases by using a novel ultrasensitive UHPLC-MS method. We further report quantitative data for m5dC, hmdC, fdC, and cadC, but we were unable to detect either m4dC or m6dA in DNA isolated from mouse embryonic stem cells or brain and liver tissue, which calls into question their epigenetic relevance. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Stromal-Based Signatures for the Classification of Gastric Cancer.

PubMed

Uhlik, Mark T; Liu, Jiangang; Falcon, Beverly L; Iyer, Seema; Stewart, Julie; Celikkaya, Hilal; O'Mahony, Marguerita; Sevinsky, Christopher; Lowes, Christina; Douglass, Larry; Jeffries, Cynthia; Bodenmiller, Diane; Chintharlapalli, Sudhakar; Fischl, Anthony; Gerald, Damien; Xue, Qi; Lee, Jee-Yun; Santamaria-Pang, Alberto; Al-Kofahi, Yousef; Sui, Yunxia; Desai, Keyur; Doman, Thompson; Aggarwal, Amit; Carter, Julia H; Pytowski, Bronislaw; Jaminet, Shou-Ching; Ginty, Fiona; Nasir, Aejaz; Nagy, Janice A; Dvorak, Harold F; Benjamin, Laura E

2016-05-01

Treatment of metastatic gastric cancer typically involves chemotherapy and monoclonal antibodies targeting HER2 (ERBB2) and VEGFR2 (KDR). However, reliable methods to identify patients who would benefit most from a combination of treatment modalities targeting the tumor stroma, including new immunotherapy approaches, are still lacking. Therefore, we integrated a mouse model of stromal activation and gastric cancer genomic information to identify gene expression signatures that may inform treatment strategies. We generated a mouse model in which VEGF-A is expressed via adenovirus, enabling a stromal response marked by immune infiltration and angiogenesis at the injection site, and identified distinct stromal gene expression signatures. With these data, we designed multiplexed IHC assays that were applied to human primary gastric tumors and classified each tumor to a dominant stromal phenotype representative of the vascular and immune diversity found in gastric cancer. We also refined the stromal gene signatures and explored their relation to the dominant patient phenotypes identified by recent large-scale studies of gastric cancer genomics (The Cancer Genome Atlas and Asian Cancer Research Group), revealing four distinct stromal phenotypes. Collectively, these findings suggest that a genomics-based systems approach focused on the tumor stroma can be used to discover putative predictive biomarkers of treatment response, especially to antiangiogenesis agents and immunotherapy, thus offering an opportunity to improve patient stratification. Cancer Res; 76(9); 2573-86. ©2016 American Association for Cancer Research.
Can multi-subpopulation reference sets improve the genomic predictive ability for pigs?

PubMed

Fangmann, A; Bergfelder-Drüing, S; Tholen, E; Simianer, H; Erbe, M

2015-12-01

In most countries and for most livestock species, genomic evaluations are obtained from within-breed analyses. To achieve reliable breeding values, however, a sufficient reference sample size is essential. To increase this size, the use of multibreed reference populations for small populations is considered a suitable option in other species. Over decades, the separate breeding work of different pig breeding organizations in Germany has led to stratified subpopulations in the breed German Large White. Due to this fact and the limited number of Large White animals available in each organization, there was a pressing need for ascertaining if multi-subpopulation genomic prediction is superior compared with within-subpopulation prediction in pigs. Direct genomic breeding values were estimated with genomic BLUP for the trait "number of piglets born alive" using genotype data (Illumina Porcine 60K SNP BeadChip) from 2,053 German Large White animals from five different commercial pig breeding companies. To assess the prediction accuracy of within- and multi-subpopulation reference sets, a random 5-fold cross-validation with 20 replications was performed. The five subpopulations considered were only slightly differentiated from each other. However, the prediction accuracy of the multi-subpopulations approach was not better than that of the within-subpopulation evaluation, for which the predictive ability was already high. Reference sets composed of closely related multi-subpopulation sets performed better than sets of distantly related subpopulations but not better than the within-subpopulation approach. Despite the low differentiation of the five subpopulations, the genetic connectedness between these different subpopulations seems to be too small to improve the prediction accuracy by applying multi-subpopulation reference sets. Consequently, resources should be used for enlarging the reference population within subpopulation, for example, by adding genotyped females.
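The validation design described above, a random 5-fold cross-validation repeated 20 times, can be sketched as an index generator. This is a generic illustration using only the sample size, fold count and replicate count given in the abstract; the function names and the seed are assumptions, and the prediction model itself (genomic BLUP) is not implemented here.

```python
import random


def kfold_indices(n, k, rng):
    """Randomly partition indices 0..n-1 into k folds of near-equal size."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def replicated_cv(n, k=5, replicates=20, seed=0):
    """Yield (replicate, fold, training, validation) for every split."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    for rep in range(replicates):
        for fold, validation in enumerate(kfold_indices(n, k, rng)):
            held_out = set(validation)
            training = [i for i in range(n) if i not in held_out]
            yield rep, fold, training, validation


# 2,053 animals, 5 folds, 20 replicates gives 100 training/validation splits.
splits = list(replicated_cv(2053, k=5, replicates=20, seed=1))
```

In each split the model would be fitted on `training` and its predictions correlated with the observed values in `validation`; averaging over the 100 splits gives the reported predictive ability.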
Global methylation screening in the Arabidopsis thaliana and Mus musculus genome: applications of virtual image restriction landmark genomic scanning (Vi-RLGS)

PubMed Central

Matsuyama, Tomoki; Kimura, Makoto T.; Koike, Kuniaki; Abe, Tomoko; Nakano, Takeshi; Asami, Tadao; Ebisuzaki, Toshikazu; Held, William A.; Yoshida, Shigeo; Nagase, Hiroki

2003-01-01

Understanding the role of ‘epigenetic’ changes such as DNA methylation and chromatin remodeling has now become critical in understanding many biological processes. In order to delineate the global methylation pattern in a given genomic DNA, computer software has been developed to create a virtual image of restriction landmark genomic scanning (Vi-RLGS). When using a methylation-sensitive enzyme such as NotI as the restriction landmark, the comparison between real and in silico RLGS profiles of the genome provides a methylation map of genomic NotI sites. A methylation map of the Arabidopsis genome was created that could be confirmed by a methylation-sensitive PCR assay. The method has also been applied to the mouse genome. Although a complete methylation map has not yet been produced, a region of methylation difference between two tissues has been tested and confirmed by bisulfite sequencing. Vi-RLGS in conjunction with real RLGS will make it possible to develop a more complete map of genomic sites that are methylated or demethylated as a consequence of normal or abnormal development. PMID:12888509
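The in silico half of such a comparison starts by locating every landmark site in the genome sequence. The sketch below finds NotI recognition sites (GCGGCCGC, cut after the first GC) in a linear sequence and lists the resulting fragment lengths. It is a simplified illustration and not the Vi-RLGS software: the function names are hypothetical, and the additional enzyme digests and two-dimensional spot positions of real RLGS are not modeled.

```python
NOTI_SITE = "GCGGCCGC"   # NotI recognition sequence
NOTI_CUT_OFFSET = 2      # NotI cuts GC^GGCCGC on the top strand


def find_sites(seq, site=NOTI_SITE):
    """0-based start positions of every (possibly overlapping) site in `seq`."""
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions


def fragment_lengths(seq, site=NOTI_SITE, cut_offset=NOTI_CUT_OFFSET):
    """Fragment lengths after complete digestion of a linear molecule."""
    cuts = [p + cut_offset for p in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


toy = "AAAA" + NOTI_SITE + "TTTTTT" + NOTI_SITE + "CC"
sites = find_sites(toy)
fragments = fragment_lengths(toy)
```

A predicted fragment that is absent from the real profile would point to a NotI site that was not cut, which is how a methylation-sensitive landmark enzyme reveals methylated sites.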
Developmental stage related patterns of codon usage and genomic GC content: searching for evolutionary fingerprints with models of stem cell differentiation

PubMed Central

2007-01-01

Background: The usage of synonymous codons shows considerable variation among mammalian genes. How and why this usage is non-random are fundamental biological questions and remain controversial. It is also important to explore whether mammalian genes that are selectively expressed at different developmental stages bear different molecular features. Results: In two models of mouse stem cell differentiation, we established correlations between codon usage and the patterns of gene expression. We found that the optimal codons exhibited variation (AT- or GC-ending codons) in different cell types within the developmental hierarchy. We also found that genes that were enriched (developmental-pivotal genes) or specifically expressed (developmental-specific genes) at different developmental stages had different patterns of codon usage and local genomic GC (GCg) content. Moreover, at the same developmental stage, developmental-specific genes generally used more GC-ending codons and had higher GCg content compared with developmental-pivotal genes. Further analyses suggest that the model of translational selection might be consistent with the developmental stage-related patterns of codon usage, especially for the AT-ending optimal codons. In addition, our data show that after human-mouse divergence, the influence of selective constraints is still detectable. Conclusion: Our findings suggest that developmental stage-related patterns of gene expression are correlated with codon usage (GC3) and GCg content in stem cell hierarchies. Moreover, this paper provides evidence for the influence of natural selection at synonymous sites in the mouse genome and novel clues for linking the molecular features of genes to their patterns of expression during mammalian ontogenesis. PMID:17349061
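The two sequence features discussed above, GC3 and GC content, are straightforward to compute. The sketch below uses the simplest definitions: GC3 as the fraction of G or C at third codon positions over all codons of an in-frame coding sequence, and GC content as the fraction of G or C in any sequence. The function names are hypothetical, and the study may have applied refinements (for example excluding stop, Met or Trp codons) that are not reproduced here.

```python
def gc_content(seq):
    """Fraction of G or C bases in `seq`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame CDS."""
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    return gc_content(cds[2::3])  # every third base, starting at codon position 3


# Codons ATG, GCC, AAA, TTC have third bases G, C, A, C.
example_gc3 = gc3("ATGGCCAAATTC")
example_gc = gc_content("ATGC")
```

Local genomic GC content would be obtained by calling `gc_content` on a window of genomic sequence around the gene; the window size is a choice the abstract does not specify.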
A Reference Genome for US Rice

USDA-ARS's Scientific Manuscript database

The development of reference genomes for rice has served as means for understanding the allelic diversity and genetic structure of a cereal grain that feeds half of the world. It has long been understood that Oryza sativa diverged into two major sub-populations Indica and Japonica, over 400 K years ...
Potential benefits from using a new reference map in genomic prediction

USDA-ARS's Scientific Manuscript database

Many genomic studies in cattle have used the 2009 reference assembly from the University of Maryland (UMD3.1). A new USDA Agricultural Research Service-University of California, Davis (ARS-UCD) assembly based on longer DNA reads from the same cow (Dominette) should improve sequence alignment, imputa...
PR-Set7 is degraded in a conditional Cul4A transgenic mouse model of lung cancer

DOE PAGES

Wang, Yang; Xu, Zhidong; Mao, Jian -Hua; ...

2015-06-01

Background and objective. Maintenance of genomic integrity is essential to ensure normal organismal development and to prevent diseases such as cancer. PR-Set7 (also known as Set8) is a cell cycle regulated enzyme that catalyses monomethylation of histone 4 at Lys20 (H4K20me1) to promote chromosome condensation and prevent DNA damage. Recent studies show that CRL4CDT2-mediated ubiquitylation of PR-Set7 leads to its degradation during S phase and after DNA damage. This might occur to ensure appropriate changes in chromosome structure during the cell cycle or to preserve genome integrity after DNA damage. Methods. We developed a new model of lung tumor development in mice harboring a conditionally expressed allele of Cul4A. We have therefore used a mouse model to demonstrate for the first time that Cul4A is oncogenic in vivo. With this model, staining of PR-Set7 in the preneoplastic and tumor lesions in AdenoCre-induced mouse lungs was performed. We also identified higher protein levels of γ-tubulin and pericentrin by IHC. Results. The level of PR-Set7 was down-regulated in the preneoplastic and adenocarcinomatous lesions following over-expression of Cul4A. We also identified higher levels of the proteins pericentrin and γ-tubulin in Cul4A mouse lungs induced by AdenoCre. Conclusion. PR-Set7 is a direct target of Cul4A for degradation and is involved in the formation of lung tumors in the conditional Cul4A transgenic mouse model.
