Sample records for comparative genomic characterization

  1. Carnivore-specific SINEs (Can-SINEs): distribution, evolution, and genomic impact.

    PubMed

    Walters-Conte, Kathryn B; Johnson, Diana L E; Allard, Marc W; Pecon-Slattery, Jill

    2011-01-01

    Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has aided comparative genomic studies by providing the rare genomic changes and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity within Carnivora. Publication of the whole-genome sequences of domestic dog, domestic cat, and giant panda serves as a valuable resource for comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs, as well as their composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics.

  2. Carnivore-Specific SINEs (Can-SINEs): Distribution, Evolution, and Genomic Impact

    PubMed Central

    Johnson, Diana L.E.; Allard, Marc W.; Pecon-Slattery, Jill

    2011-01-01

    Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has aided comparative genomic studies by providing the rare genomic changes and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity within Carnivora. Publication of the whole-genome sequences of domestic dog, domestic cat, and giant panda serves as a valuable resource for comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs, as well as their composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics. PMID:21846743

  3. Whole Genome Amplification of Labeled Viable Single Cells Suited for Array-Comparative Genomic Hybridization.

    PubMed

    Kroneis, Thomas; El-Heliebi, Amin

    2015-01-01

    Understanding the details of a complex biological system requires breaking it down into its components. Immunostaining techniques allow identification of several distinct cell types, thereby giving insight into intercellular heterogeneity. Often staining reveals that the most remarkable cells are the rarest. To further characterize the target cells on a molecular level, single cell techniques are necessary. Here, we describe the immunostaining, micromanipulation, and whole genome amplification of single cells for the purpose of genomic characterization. First, we exemplify the preparation of cell suspensions from cultured cells as well as the isolation of peripheral mononucleated cells from blood. The target cell population is then subjected to immunostaining. After cytocentrifugation, target cells are isolated by micromanipulation and forwarded to whole genome amplification. For whole genome amplification, we use GenomePlex® technology, which allows downstream genomic analysis such as array-comparative genomic hybridization.

  4. Phytozome Comparative Plant Genomics Portal

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goodstein, David; Batra, Sajeev; Carlson, Joseph

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to (1) understand and accelerate the improvement (domestication) of bioenergy crops; (2) characterize and moderate plant response to climate change; (3) use comparative genomics to identify constrained elements and infer gene function; (4) build high-quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work; and (5) expand functional genomic resources for Plant Flagship genomes.

  5. Improving Microbial Genome Annotations in an Integrated Database Context

    PubMed Central

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Anderson, Iain; Mavromatis, Konstantinos; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2013-01-01

    Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule-based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems is available at http://img.jgi.doe.gov/. PMID:23424620
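    The rule-based validation idea in the record above can be illustrated with a minimal sketch. It is not the IMG implementation: the rule table, the pathway names, and the helper functions `predict_phenotypes` and `compare` are all hypothetical and only show the general pattern of inferring phenotypes from annotated pathways and comparing them with observed ones.

```python
from typing import Dict, Set

# Hypothetical rules (not taken from IMG): each phenotype is predicted
# only when every listed pathway is present in the genome annotation.
RULES: Dict[str, Set[str]] = {
    "lactose_fermenter": {"lactose_uptake", "beta_galactosidase"},
    "nitrogen_fixer": {"nitrogenase_complex"},
}


def predict_phenotypes(annotated: Set[str],
                       rules: Dict[str, Set[str]] = RULES) -> Set[str]:
    """Return phenotypes whose required pathways are all annotated."""
    return {name for name, required in rules.items() if required <= annotated}


def compare(predicted: Set[str], observed: Set[str]) -> Dict[str, Set[str]]:
    """Split phenotypes into supported ones and the two kinds of mismatch."""
    return {
        "supported": predicted & observed,
        "predicted_only": predicted - observed,  # annotation may be too generous
        "observed_only": observed - predicted,   # annotation may be missing genes
    }


# Toy inputs, invented for illustration.
annotated = {"lactose_uptake", "beta_galactosidase", "glycolysis"}
observed = {"lactose_fermenter", "nitrogen_fixer"}

predicted = predict_phenotypes(annotated)
report = compare(predicted, observed)
print(report)
```

    In this toy case the observed nitrogen-fixing phenotype has no supporting annotation, which is the kind of discrepancy such a system would flag for review.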

  6. Typing and comparative genome analysis of Brucella melitensis isolated from Lebanon.

    PubMed

    Abou Zaki, Natalia; Salloum, Tamara; Osman, Marwan; Rafei, Rayane; Hamze, Monzer; Tokajian, Sima

    2017-10-16

    Brucella melitensis is the main causative agent of the zoonotic disease brucellosis. This study aimed at typing and characterizing genetic variation in 33 Brucella isolates recovered from patients in Lebanon. Bruce-ladder multiplex PCR and PCR-RFLP of omp31, omp2a and omp2b were performed. Sixteen representative isolates were chosen for draft-genome sequencing and analyzed to determine variations in virulence, resistance, genomic islands, prophages and insertion sequences. Comparative whole-genome single nucleotide polymorphism analysis was also performed. The isolates were confirmed to be B. melitensis. Genome analysis revealed multiple virulence determinants and efflux pumps. Genome comparisons and single nucleotide polymorphisms divided the isolates based on geographical distribution but revealed high levels of similarity between the strains. Sequence divergence in B. melitensis was mainly due to lateral gene transfer of mobile elements. This is the first report of an in-depth genomic characterization of B. melitensis in Lebanon. © FEMS 2017.

  7. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    PubMed Central

    Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value. PMID:27200191

  8. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python).

    PubMed

    Irizarry, Kristopher J L; Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  9. Genomic structural differences between cattle and river buffalo identified through a combination of genomic and transcriptomic analysis

    USDA-ARS's Scientific Manuscript database

    Water buffalo (Bubalus bubalis L.) is an important livestock species worldwide. Like many other livestock species, water buffalo lacks the high-quality, continuous reference genome assembly required for fine-scale comparative genomics studies. In this work, we present a dataset, which characterizes g...

  10. Comparative genome analysis and characterization of the Salmonella Typhimurium strain CCRJ_26 isolated from swine carcasses using whole-genome sequencing approach.

    PubMed

    Panzenhagen, P H N; Cabral, C C; Suffys, P N; Franco, R M; Rodrigues, D P; Conte-Junior, C A

    2018-04-01

    Salmonella pathogenicity relies on virulence factors, many of which are clustered within the Salmonella pathogenicity islands. Salmonella also harbours mobile genetic elements such as virulence plasmids, prophage-like elements and antimicrobial resistance genes, which can contribute to increased pathogenicity. Here, we have genetically characterized a selected S. Typhimurium strain (CCRJ_26) from our previous study, with a multidrug-resistant profile and a high-frequency PFGE clonal profile, which apparently persists in the pork production centre of Rio de Janeiro State, Brazil. By whole-genome sequencing, we described the virulence content of the strain's genome and characterized the repertoire of bacterial plasmids, antibiotic resistance genes and prophage-like elements. We present evidence that the genome of strain CCRJ_26 possibly represents a virulence-associated phenotype which may be virulent in human infection. Whole-genome sequencing technologies are still costly and remain underexplored for applied microbiology in Brazil. Hence, this genomic description of S. Typhimurium strain CCRJ_26 will help future molecular epidemiological studies. The analysis described here reveals a quick and useful pipeline for bacterial virulence characterization using a whole-genome sequencing approach. © 2018 The Society for Applied Microbiology.

  11. Use of comparative genomics approaches to characterize interspecies differences in response to environmental chemicals: Challenges, opportunities, and research needs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burgess-Herbert, Sarah L.; Euling, Susan Y.

    A critical challenge for environmental chemical risk assessment is the characterization and reduction of uncertainties introduced when extrapolating inferences from one species to another. The purpose of this article is to explore the challenges, opportunities, and research needs surrounding the issue of how genomics data and computational and systems level approaches can be applied to inform differences in response to environmental chemical exposure across species. We propose that the data, tools, and evolutionary framework of comparative genomics be adapted to inform interspecies differences in chemical mechanisms of action. We compare and contrast existing approaches, from disciplines as varied as evolutionary biology, systems biology, mathematics, and computer science, that can be used, modified, and combined in new ways to discover and characterize interspecies differences in chemical mechanism of action which, in turn, can be explored for application to risk assessment. We consider how genetic, protein, pathway, and network information can be interrogated from an evolutionary biology perspective to effectively characterize variations in biological processes of toxicological relevance among organisms. We conclude that comparative genomics approaches show promise for characterizing interspecies differences in mechanisms of action, and further, for improving our understanding of the uncertainties inherent in extrapolating inferences across species in both ecological and human health risk assessment. To achieve long-term relevance and consistent use in environmental chemical risk assessment, improved bioinformatics tools, computational methods robust to data gaps, and quantitative approaches for conducting extrapolations across species are critically needed. Specific areas ripe for research to address these needs are recommended.

  12. Structural characterization of Brachypodium genome and its syntenic relationship with rice and wheat

    USDA-ARS's Scientific Manuscript database

    Brachypodium distachyon (Brachypodium) has been recently recognized as an emerging model system for both comparative and functional genomics in grass species. In this study, 55,221 repeat masked Brachypodium BAC end sequences (BES) were used for comparative analysis against the 12 rice pseudomolecul...

  13. Comparative population genomics of Fusarium graminearum reveals adaptive divergence among cereal head blight pathogens

    USDA-ARS's Scientific Manuscript database

    In this study we sequenced the genomes of 60 isolates of Fusarium graminearum, the major fungal pathogen responsible for Fusarium head blight (FHB) in cereal crops world-wide. To investigate adaptive evolution of FHB pathogens, we performed population-level analyses to characterize genomic structure, signatures...

  14. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background: The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results: We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions: The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  15. The mitochondrial genome of the ascalaphid owlfly Libelloides macaronius and comparative evolutionary mitochondriomics of neuropterid insects

    PubMed Central

    2011-01-01

    Background: The insect order Neuroptera encompasses more than 5,700 described species. To date, only three neuropteran mitochondrial genomes have been fully sequenced and one partly sequenced. Current knowledge on neuropteran mitochondrial genomes is limited, and new data are strongly required. In the present work, the mitochondrial genome of the ascalaphid owlfly Libelloides macaronius is described and compared with the known neuropterid mitochondrial genomes: Megaloptera, Neuroptera and Raphidioptera. These analyses are further extended to other endopterygotan orders. Results: The mitochondrial genome of L. macaronius is a circular molecule 15,890 bp long. It includes the entire set of 37 genes usually present in animal mitochondrial genomes. The gene order of this newly sequenced genome is unique among Neuroptera and differs from the ancestral type of insects in the translocation of trnC. The L. macaronius genome shows the lowest A+T content (74.50%) among known neuropterid genomes. Protein-coding genes possess the typical mitochondrial start codons, except for cox1, which has an unusual ACG. Comparisons among endopterygotan mitochondrial genomes showed that A+T content and AT/GC-skews exhibit a broad range of variation among 84 analyzed taxa. Comparative analyses showed that neuropterid mitochondrial protein-coding genes experienced complex evolutionary histories, involving features ranging from codon usage to rate of substitution, that make them potential markers for population genetics/phylogenetics studies at different taxonomic ranks. The 22 tRNAs show variable substitution patterns in Neuropterida, with higher sequence conservation in genes located on the α strand. Inferred secondary structures for neuropterid rrnS and rrnL genes largely agree with those known for other insects. For the first time, a model is provided for domain I of an insect rrnL. The control region in Neuropterida, as in other insects, is a fast-evolving genomic region, characterized by AT-rich motifs. Conclusions: The new genome shares many features with known neuropteran genomes but differs in its low A+T content. Comparative analysis of neuropterid mitochondrial genes showed that they experienced distinct evolutionary patterns. Both tRNA families and ribosomal RNAs show composite substitution pathways. The neuropterid mitochondrial genome is characterized by a complex evolutionary history. PMID:21569260
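    The A+T content and AT/GC-skew statistics mentioned in the record above can be computed as in the following sketch. It assumes the commonly used definitions AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C); the abstract does not state which formulas the authors applied. The function name `composition_stats` and the toy sequence are hypothetical.

```python
from collections import Counter
from typing import Dict


def composition_stats(seq: str) -> Dict[str, float]:
    """Compute A+T content (%) and AT/GC skews for one strand of a sequence.

    Characters other than A, C, G and T (e.g. N) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {
        "at_content_pct": 100.0 * (a + t) / total,
        # Assumed definitions; a skew is reported as 0.0 when undefined.
        "at_skew": (a - t) / (a + t) if (a + t) else 0.0,
        "gc_skew": (g - c) / (g + c) if (g + c) else 0.0,
    }


# Toy sequence invented for illustration (A=4, T=2, G=3, C=1).
stats = composition_stats("AAATTCGGGA")
print(stats)
```

    Because skew depends on which strand is read, comparisons across taxa are only meaningful when the same strand is used for every genome.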

  16. Alu repeat discovery and characterization within human genomes

    PubMed Central

    Hormozdiari, Fereydoun; Alkan, Can; Ventura, Mario; Hajirasouliha, Iman; Malig, Maika; Hach, Faraz; Yorukoglu, Deniz; Dao, Phuong; Bakhshi, Marzieh; Sahinalp, S. Cenk; Eichler, Evan E.

    2011-01-01

    Human genomes are now being rapidly sequenced, but not all forms of genetic variation are routinely characterized. In this study, we focus on Alu retrotransposition events and seek to characterize differences in the pattern of mobile insertion between individuals based on the analysis of eight human genomes sequenced using next-generation sequencing. Applying a rapid read-pair analysis algorithm, we discover 4342 Alu insertions not found in the human reference genome and show that 98% of a selected subset (63/64) experimentally validate. Of these new insertions, 89% correspond to AluY elements, suggesting that they arose by retrotransposition. Eighty percent of the Alu insertions have not been previously reported and more novel events were detected in Africans when compared with non-African samples (76% vs. 69%). Using these data, we develop an experimental and computational screen to identify ancestry informative Alu retrotransposition events among different human populations. PMID:21131385

  17. Comparative Genomics and Host Resistance against Infectious Diseases

    PubMed Central

    Qureshi, Salman T.; Skamene, Emil

    1999-01-01

    The large size and complexity of the human genome have limited the identification and functional characterization of components of the innate immune system that play a critical role in front-line defense against invading microorganisms. However, advances in genome analysis (including the development of comprehensive sets of informative genetic markers, improved physical mapping methods, and novel techniques for transcript identification) have reduced the obstacles to discovery of novel host resistance genes. Study of the genomic organization and content of widely divergent vertebrate species has shown a remarkable degree of evolutionary conservation and enables meaningful cross-species comparison and analysis of newly discovered genes. Application of comparative genomics to host resistance will rapidly expand our understanding of human immune defense by facilitating the translation of knowledge acquired through the study of model organisms. We review the rationale and resources for comparative genomic analysis and describe three examples of host resistance genes successfully identified by this approach. PMID:10081670

  18. Whole-Genome Characterization of Prunus necrotic ringspot virus Infecting Sweet Cherry in China

    PubMed Central

    2018-01-01

    Prunus necrotic ringspot virus (PNRSV) causes yield loss in most cultivated stone fruits, including sweet cherry. Using a small RNA deep-sequencing approach combined with end-genome sequence cloning, we identified the complete genomes of all three PNRSV strands from PNRSV-infected sweet cherry trees and compared them with those of two previously reported isolates. PMID:29496825

  19. Comparative genomic and phylogenetic investigation of the xenobiotic metabolizing arylamine N-acetyltransferase enzyme family

    USDA-ARS's Scientific Manuscript database

    Arylamine N-acetyltransferases (NATs) are xenobiotic metabolizing enzymes characterized in several bacteria and eukaryotic organisms. We report a comprehensive phylogenetic analysis employing an exhaustive dataset of NAT-homologous sequences recovered through inspection of 2445 genomes. We describe ...

  20. Genome-wide array-based comparative genomic hybridization (array-CGH) analysis in Aicardi Syndrome

    USDA-ARS's Scientific Manuscript database

    Aicardi syndrome is characterized by agenesis of the corpus callosum, chorioretinal lacunae, severe seizures (starting as infantile spasms), neuronal migration defects, mental retardation, costovertebral defects, and typical facial features. Because Aicardi syndrome is sporadic and affects only fem...

  1. Genome sequence and analysis of Lactobacillus helveticus

    PubMed Central

    Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continue to expand rapidly, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

  2. Genetic, genomic, and molecular tools for studying the protoploid yeast, L. waltii.

    PubMed

    Di Rienzi, Sara C; Lindstrom, Kimberly C; Lancaster, Ragina; Rolczynski, Lisa; Raghuraman, M K; Brewer, Bonita J

    2011-02-01

    Sequencing of the yeast Kluyveromyces waltii (recently renamed Lachancea waltii) provided evidence of a whole genome duplication event in the lineage leading to the well-studied Saccharomyces cerevisiae. While comparative genomic analyses of these yeasts have proven to be extremely instructive in modeling the loss or maintenance of gene duplicates, experimental tests of the ramifications following such genome alterations remain difficult. To transform L. waltii from an organism of the computational comparative genomic literature into an organism of the functional comparative genomic literature, we have developed genetic, molecular and genomic tools for working with L. waltii. In particular, we have characterized basic properties of L. waltii (growth, ploidy, molecular karyotype, mating type and the sexual cycle), developed transformation, cell cycle arrest and synchronization protocols, and have created centromeric and non-centromeric vectors as well as a genome browser for L. waltii. We hope that these tools will be used by the community to follow up on the ideas generated by sequence data and lead to a greater understanding of eukaryotic biology and genome evolution. Copyright © 2010 John Wiley & Sons, Ltd.

  3. Genetic, genomic, and molecular tools for studying the protoploid yeast, L. waltii

    PubMed Central

    Di Rienzi, Sara C.; Lindstrom, Kimberly C.; Lancaster, Ragina; Rolczynski, Lisa; Raghuraman, M. K.; Brewer, Bonita J.

    2011-01-01

    Sequencing of the yeast Kluyveromyces waltii (recently renamed Lachancea waltii) provided evidence of a whole genome duplication event in the lineage leading to the well-studied Saccharomyces cerevisiae. While comparative genomic analyses of these yeasts have proven to be extremely instructive in modeling the loss or maintenance of gene duplicates, experimental tests of the ramifications following such genome alterations remain difficult. To transform L. waltii from an organism of the computational comparative genomic literature into an organism of the functional comparative genomic literature, we have developed genetic, molecular and genomic tools for working with L. waltii. In particular, we have characterized basic properties of L. waltii (growth, ploidy, molecular karyotype, mating type and the sexual cycle), developed transformation, cell cycle arrest and synchronization protocols, and have created centromeric and non-centromeric vectors as well as a genome browser for L. waltii. We hope that these tools will be used by the community to follow up on the ideas generated by sequence data and lead to a greater understanding of eukaryotic biology and genome evolution. PMID:21246627

  4. Characterization of Deletions of the HBA and HBB Loci by Array Comparative Genomic Hybridization

    PubMed Central

    Sabath, Daniel E.; Bender, Michael A.; Sankaran, Vijay G.; Vamos, Esther; Kentsis, Alex; Yi, Hye-Son; Greisman, Harvey A.

    2017-01-01

    Thalassemia is among the most common genetic diseases worldwide. α-Thalassemia is usually caused by deletion of one or more of the duplicated HBA genes on chromosome 16. In contrast, most β-thalassemia results from point mutations that decrease or eliminate expression of the HBB gene on chromosome 11. Deletions within the HBB locus result in thalassemia or hereditary persistence of fetal Hb. Although routine diagnostic testing cannot distinguish thalassemia deletions from point mutations, deletional hereditary persistence of fetal Hb is notable for having an elevated HbF level with a normal mean corpuscular volume. A small number of deletions accounts for most α-thalassemias; in contrast, there are no predominant HBB deletions causing β-thalassemia. To facilitate the identification and characterization of deletions of the HBA and HBB globin loci, we performed array-based comparative genomic hybridization using a custom oligonucleotide microarray. We accurately mapped the breakpoints of known and previously uncharacterized HBB deletions, defining the previously uncharacterized breakpoints by PCR amplification and sequencing. The array also successfully identified the common HBA deletions --SEA and --FIL. In summary, comparative genomic hybridization can be used to characterize deletions of the HBA and HBB loci, allowing high-resolution characterization of novel deletions that are not readily detected by PCR-based methods. PMID:26612711

  5. An Exploration into Fern Genome Space.

    PubMed

    Wolf, Paul G; Sessa, Emily B; Marchant, Daniel Blaine; Li, Fay-Wei; Rothfels, Carl J; Sigel, Erin M; Gitzendanner, Matthew A; Visger, Clayton J; Banks, Jo Ann; Soltis, Douglas E; Soltis, Pamela S; Pryer, Kathleen M; Der, Joshua P

    2015-08-26

    Ferns are one of the few remaining major clades of land plants for which a complete genome sequence is lacking. Knowledge of genome space in ferns will enable broad-scale comparative analyses of land plant genes and genomes, provide insights into genome evolution across green plants, and shed light on genetic and genomic features that characterize ferns, such as their high chromosome numbers and large genome sizes. As part of an initial exploration into fern genome space, we used a whole genome shotgun sequencing approach to obtain low-density coverage (∼0.4X to 2X) for six fern species from the Polypodiales (Ceratopteris, Pteridium, Polypodium, Cystopteris), Cyatheales (Plagiogyria), and Gleicheniales (Dipteris). We explore these data to characterize the proportion of the nuclear genome represented by repetitive sequences (including DNA transposons, retrotransposons, ribosomal DNA, and simple repeats) and protein-coding genes, and to extract chloroplast and mitochondrial genome sequences. Such initial sweeps of fern genomes can provide information useful for selecting a promising candidate fern species for whole genome sequencing. We also describe variation of genomic traits across our sample and highlight some differences and similarities in repeat structure between ferns and seed plants. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  6. Characterization of canine osteosarcoma by array comparative genomic hybridization and RT-qPCR: signatures of genomic imbalance in canine osteosarcoma parallel the human counterpart.

    PubMed

    Angstadt, Andrea Y; Motsinger-Reif, Alison; Thomas, Rachael; Kisseberth, William C; Guillermo Couto, C; Duval, Dawn L; Nielsen, Dahlia M; Modiano, Jaime F; Breen, Matthew

    2011-11-01

    Osteosarcoma (OS) is the most commonly diagnosed malignant bone tumor in humans and dogs, characterized in both species by extremely complex karyotypes exhibiting high frequencies of genomic imbalance. Evaluation of genomic signatures in human OS using array comparative genomic hybridization (aCGH) has assisted in uncovering genetic mechanisms that result in disease phenotype. Previous low-resolution (10-20 Mb) aCGH analysis of canine OS identified a wide range of recurrent DNA copy number aberrations, indicating extensive genomic instability. In this study, we profiled 123 canine OS tumors by 1 Mb-resolution aCGH to generate a dataset for direct comparison with current data for human OS, concluding that several high-frequency aberrations in canine and human OS are orthologous. To ensure complete coverage of gene annotation, we identified the human refseq genes that map to these orthologous aberrant dog regions and found several candidate genes warranting evaluation for OS involvement. Specifically, subsequent FISH and qRT-PCR analysis of RUNX2, TUSC3, and PTEN indicated that expression levels correlated with genomic copy number status, showcasing RUNX2 as an OS-associated gene and TUSC3 as a possible tumor suppressor candidate. Together these data demonstrate the ability of genomic comparative oncology to identify genetic aberrations which may be important for OS progression. Large-scale screening of genomic imbalance in canine OS further validates the use of the dog as a suitable model for human cancers, supporting the idea that dysregulation discovered in canine cancers will provide an avenue for complementary study in human counterparts. Copyright © 2011 Wiley-Liss, Inc.

  7. Whole-Genome Characterization of Prunus necrotic ringspot virus Infecting Sweet Cherry in China.

    PubMed

    Wang, Jiawei; Zhai, Ying; Zhu, Dongzi; Liu, Weizhen; Pappu, Hanu R; Liu, Qingzhong

    2018-03-01

    Prunus necrotic ringspot virus (PNRSV) causes yield loss in most cultivated stone fruits, including sweet cherry. Using a small RNA deep-sequencing approach combined with end-genome sequence cloning, we identified the complete genomes of all three PNRSV strands from PNRSV-infected sweet cherry trees and compared them with those of two previously reported isolates. Copyright © 2018 Wang et al.

  8. Genome-wide comparative analysis of NBS-encoding genes between Brassica species and Arabidopsis thaliana.

    PubMed

    Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi

    2014-01-03

    Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after the split from A. thaliana. Here we present genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Through the employment of HMM search and manual curation, we identified 157, 206 and 167 NBS-encoding genes in B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated the differential expression pattern of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among 3 species suggested that orthologous genes in B. rapa species have undergone stronger negative selection than those in B. oleracea species. But for TNL type, there are no significant differences in the orthologous gene pairs between the two species. This study is the first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences. Through tandem duplication and whole genome triplication analysis in B. oleracea, B. rapa and A. thaliana genomes, our study provides insight into the evolutionary history of NBS-encoding genes after divergence of A. thaliana and the Brassica lineage. These results together with expression pattern analysis of NBS-encoding orthologous genes provide a useful resource for functional characterization of these genes and genetic improvement of relevant crops.

  9. PLAZA 3.0: an access point for plant comparative genomics

    PubMed Central

    Proost, Sebastian; Van Bel, Michiel; Vaneechoutte, Dries; Van de Peer, Yves; Inzé, Dirk; Mueller-Roeber, Bernd; Vandepoele, Klaas

    2015-01-01

    Comparative sequence analysis has significantly altered our view on the complexity of genome organization and gene functions in different kingdoms. PLAZA 3.0 is designed to make comparative genomics data for plants available through a user-friendly web interface. Structural and functional annotation, gene families, protein domains, phylogenetic trees and detailed information about genome organization can easily be queried and visualized. Compared with the first version released in 2009, which featured nine organisms, the number of integrated genomes is more than four times higher, and now covers 37 plant species. The new species provide a wider phylogenetic range as well as a more in-depth sampling of specific clades, and genomes of additional crop species are present. The functional annotation has been expanded and now comprises data from Gene Ontology, MapMan, UniProtKB/Swiss-Prot, PlnTFDB and PlantTFDB. Furthermore, we improved the algorithms to transfer functional annotation from well-characterized plant genomes to other species. The additional data and new features make PLAZA 3.0 (http://bioinformatics.psb.ugent.be/plaza/) a versatile and comprehensible resource for users wanting to explore genome information to study different aspects of plant biology, both in model and non-model organisms. PMID:25324309

  10. COGNATE: comparative gene annotation characterizer.

    PubMed

    Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

    2017-07-17

    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. 
COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on GitHub ( https://github.com/ZFMK/COGNATE ). The tool COGNATE allows comparing genome assemblies and structural elements on multiple levels (e.g., scaffold or contig sequence, gene). It clearly enhances comparability between analyses. Thus, COGNATE can provide the important standardization of both genome and gene structure parameter disclosure as well as data acquisition for future comparative analyses. With the establishment of comprehensive descriptive standards and the extensive availability of genomes, an encompassing database will become possible.

  11. Comparative genomic analysis of the multispecies probiotic-marketed product VSL#3.

    PubMed

    Douillard, François P; Mora, Diego; Eijlander, Robyn T; Wels, Michiel; de Vos, Willem M

    2018-01-01

    Several probiotic-marketed formulations available for the consumers contain live lactic acid bacteria and/or bifidobacteria. The multispecies product commercialized as VSL#3 has been used for treating various gastro-intestinal disorders. However, like many other products, the bacterial strains present in VSL#3 have only been characterized to a limited extent and their efficacy as well as their predicted mode of action remain unclear, preventing further applications or comparative studies. In this work, the genomes of all eight bacterial strains present in VSL#3 were sequenced and characterized, to advance insights into the possible mode of action of this product and also to serve as a basis for future work and trials. Phylogenetic and genomic data analysis allowed us to identify the 7 species present in the VSL#3 product as specified by the manufacturer. The 8 strains present belong to the species Streptococcus thermophilus, Lactobacillus acidophilus, Lactobacillus paracasei, Lactobacillus plantarum, Lactobacillus helveticus, Bifidobacterium breve and B. animalis subsp. lactis (two distinct strains). Comparative genomics revealed that the draft genomes of the S. thermophilus and L. helveticus strains were predicted to encode most of the defence systems such as restriction modification and CRISPR-Cas systems. Genes associated with a variety of potential probiotic functions were also identified. Thus, in the three Bifidobacterium spp., gene clusters were predicted to encode tight adherence pili, known to promote bacteria-host interaction and intestinal barrier integrity, and to impact host cell development. Various repertoires of putative signalling proteins were predicted to be encoded by the genomes of the Lactobacillus spp., i.e. surface layer proteins, LPXTG-containing proteins, or sortase-dependent pili that may interact with the intestinal mucosa and dendritic cells. 
Taken together, the individual genomic characterization of the strains present in the VSL#3 product confirmed the product specifications, determined its coding capacity, and identified potential probiotic functions.

  12. Comparative genomic analysis of the multispecies probiotic-marketed product VSL#3

    PubMed Central

    Mora, Diego; Eijlander, Robyn T.; Wels, Michiel; de Vos, Willem M.

    2018-01-01

    Several probiotic-marketed formulations available for the consumers contain live lactic acid bacteria and/or bifidobacteria. The multispecies product commercialized as VSL#3 has been used for treating various gastro-intestinal disorders. However, like many other products, the bacterial strains present in VSL#3 have only been characterized to a limited extent and their efficacy as well as their predicted mode of action remain unclear, preventing further applications or comparative studies. In this work, the genomes of all eight bacterial strains present in VSL#3 were sequenced and characterized, to advance insights into the possible mode of action of this product and also to serve as a basis for future work and trials. Phylogenetic and genomic data analysis allowed us to identify the 7 species present in the VSL#3 product as specified by the manufacturer. The 8 strains present belong to the species Streptococcus thermophilus, Lactobacillus acidophilus, Lactobacillus paracasei, Lactobacillus plantarum, Lactobacillus helveticus, Bifidobacterium breve and B. animalis subsp. lactis (two distinct strains). Comparative genomics revealed that the draft genomes of the S. thermophilus and L. helveticus strains were predicted to encode most of the defence systems such as restriction modification and CRISPR-Cas systems. Genes associated with a variety of potential probiotic functions were also identified. Thus, in the three Bifidobacterium spp., gene clusters were predicted to encode tight adherence pili, known to promote bacteria-host interaction and intestinal barrier integrity, and to impact host cell development. Various repertoires of putative signalling proteins were predicted to be encoded by the genomes of the Lactobacillus spp., i.e. surface layer proteins, LPXTG-containing proteins, or sortase-dependent pili that may interact with the intestinal mucosa and dendritic cells. 
Taken together, the individual genomic characterization of the strains present in the VSL#3 product confirmed the product specifications, determined its coding capacity, and identified potential probiotic functions. PMID:29451876

  13. Functional genomics of lactic acid bacteria: from food to health

    PubMed Central

    2014-01-01

    Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health. PMID:25186768

  14. Functional genomics of lactic acid bacteria: from food to health.

    PubMed

    Douillard, François P; de Vos, Willem M

    2014-08-29

    Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health.

  15. Genomic characterization and phylogenetic analysis of Zika virus circulating in the Americas.

    PubMed

    Ye, Qing; Liu, Zhong-Yu; Han, Jian-Feng; Jiang, Tao; Li, Xiao-Feng; Qin, Cheng-Feng

    2016-09-01

    The rapid spread and potential link with birth defects have made Zika virus (ZIKV) a global public health problem. The virus was discovered 70 years ago, yet its genomic structure and the genetic variations associated with the current explosive ZIKV epidemics are not fully understood. In this review, the genome organization, especially conserved terminal structures of the ZIKV genome, were characterized and compared with other mosquito-borne flaviviruses. It is suggested that major viral proteins of ZIKV share high structural and functional similarity with other known flaviviruses as shown by sequence comparison and prediction of functional motifs in viral proteins. Phylogenetic analysis demonstrated that all ZIKV strains circulating in the Americas form a unique clade within the Asian lineage. Furthermore, we identified a series of conserved amino acid residues that differentiate the Asian strains, including the currently circulating American strains, from the ancient African strains. Overall, our findings provide an overview of ZIKV genome characterization and evolutionary dynamics in the Americas and point out critical clues for future virological and epidemiological studies. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. MICRA: an automatic pipeline for fast characterization of microbial genomes from high-throughput sequencing data.

    PubMed

    Caboche, Ségolène; Even, Gaël; Loywick, Alexandre; Audebert, Christophe; Hot, David

    2017-12-19

    The increase in available sequence data has advanced the field of microbiology; however, making sense of these data without bioinformatics skills is still problematic. We describe MICRA, an automatic pipeline, available as a web interface, for microbial identification and characterization through reads analysis. MICRA uses iterative mapping against reference genomes to identify genes and variations. Additional modules allow prediction of antibiotic susceptibility and resistance and comparing the results of several samples. MICRA is fast, producing few false-positive annotations and variant calls compared to current methods, making it a tool of great interest for fully exploiting sequencing data.

  17. Genome-wide analysis reveals class and gene specific codon usage adaptation in avian paramyxoviruses 1

    USDA-ARS's Scientific Manuscript database

    In order to characterize the evolutionary adaptations of avian paramyxovirus 1 (APMV-1) genomes, we have compared codon usage and codon adaptation indexes among groups of Newcastle disease viruses that differ in biological, ecological, and genetic characteristics. We have used available GenBank com...

  18. GENOMIC DIVERSITY AND THE MICROENVIRONMENT AS DRIVERS OF PROGRESSION IN DCIS

    DTIC Science & Technology

    2017-10-01

    stains, including quantitative analysis, 7) Identification of upstaged DCIS cases for the radiology aim, 8) Development of image analysis methods for...goals of the project? Aim 1. Determine whether genetic diversity of DCIS is greater in DCIS with adjacent invasive disease compared to DCIS without... compared to DCIS without IDC. Since genomics is not the sole driver of tumor behavior, we will phenotypically characterize DCIS and its

  19. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    PubMed

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  20. Complete sequence and comparative analysis of the chloroplast genome of Plinia trunciflora

    PubMed Central

    Eguiluz, Maria; Yuyama, Priscila Mary; Guzman, Frank; Rodrigues, Nureyev Ferreira; Margis, Rogerio

    2017-01-01

    Plinia trunciflora is a Brazilian native fruit tree from the Myrtaceae family, also known as jaboticaba. This species has great potential for fruit production. Due to the high content of essential oils in its leaves and of anthocyanins in its fruits, there is also increasing interest from the pharmaceutical industry. Nevertheless, there are few studies focusing on its molecular biology and genetic characterization. We herein report the complete chloroplast (cp) genome of P. trunciflora using high-throughput sequencing and compare it to other previously sequenced Myrtaceae genomes. The cp genome of P. trunciflora is 159,512 bp in size, comprising inverted repeats of 26,414 bp and single-copy regions of 88,097 bp (LSC) and 18,587 bp (SSC). The genome contains 111 single-copy genes (77 protein-coding, 30 tRNA and four rRNA genes). Phylogenetic analysis using 57 cp protein-coding genes demonstrated that P. trunciflora, Eugenia uniflora and Acca sellowiana form a cluster with closer relationship to Syzygium cumini than with Eucalyptus. The complete cp sequence reported here can be used in evolutionary and population genetics studies, helping to resolve the complex taxonomy of this species and fill the gap in genetic characterization. PMID:29111566
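
    The region lengths quoted in this abstract can be cross-checked with a little arithmetic. The sketch below assumes the usual quadripartite chloroplast layout (one LSC, one SSC, and two copies of the inverted repeat) and assumes that the reported 26,414 bp is the length of a single repeat copy; the function names are illustrative and do not come from the paper.

```python
# Region lengths (bp) as quoted in the abstract above.
TOTAL_BP = 159_512
LSC_BP = 88_097   # large single-copy region
SSC_BP = 18_587   # small single-copy region
IR_BP = 26_414    # ASSUMPTION: length of ONE inverted-repeat copy


def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length assuming one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


def gene_total(protein_coding: int, trna: int, rrna: int) -> int:
    """Sum the gene classes listed in the abstract."""
    return protein_coding + trna + rrna


if __name__ == "__main__":
    # True when the quoted region lengths add up to the quoted genome size.
    print(quadripartite_total(LSC_BP, SSC_BP, IR_BP) == TOTAL_BP)
    # Gene classes quoted in the abstract: 77 protein-coding, 30 tRNA, 4 rRNA.
    print(gene_total(77, 30, 4))
```

    Under these assumptions the region lengths sum to the reported genome size, and the three gene classes sum to the reported 111 single-copy genes.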

  1. Proteolysis in hyperthermophilic microorganisms

    DOE PAGES

    Ward, Donald E.; Shockley, Keith R.; Chang, Lara S.; ...

    2002-01-01

    Proteases are found in every cell, where they recognize and break down unneeded or abnormal polypeptides or peptide-based nutrients within or outside the cell. Genome sequence data can be used to compare proteolytic enzyme inventories of different organisms as they relate to physiological needs for protein modification and hydrolysis. In this review, we exploit genome sequence data to compare hyperthermophilic microorganisms from the euryarchaeotal genus Pyrococcus, the crenarchaeote Sulfolobus solfataricus, and the bacterium Thermotoga maritima. An overview of the proteases in these organisms is given based on those proteases that have been characterized and on putative proteases that have been identified from genomic sequences, but have yet to be characterized. The analysis revealed both similarities and differences in the mechanisms utilized for proteolysis by each of these hyperthermophiles and indicated how these mechanisms relate to proteolysis in less thermophilic cells and organisms.

  2. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution.

    PubMed

    Verde, Ignazio; Abbott, Albert G; Scalabrin, Simone; Jung, Sook; Shu, Shengqiang; Marroni, Fabio; Zhebentyayeva, Tatyana; Dettori, Maria Teresa; Grimwood, Jane; Cattonaro, Federica; Zuccolo, Andrea; Rossini, Laura; Jenkins, Jerry; Vendramin, Elisa; Meisel, Lee A; Decroocq, Veronique; Sosinski, Bryon; Prochnik, Simon; Mitros, Therese; Policriti, Alberto; Cipriani, Guido; Dondini, Luca; Ficklin, Stephen; Goodstein, David M; Xuan, Pengfei; Del Fabbro, Cristian; Aramini, Valeria; Copetti, Dario; Gonzalez, Susana; Horner, David S; Falchi, Rachele; Lucas, Susan; Mica, Erica; Maldonado, Jonathan; Lazzari, Barbara; Bielenberg, Douglas; Pirona, Raul; Miculan, Mara; Barakat, Abdelali; Testolin, Raffaele; Stella, Alessandra; Tartarini, Stefano; Tonutti, Pietro; Arús, Pere; Orellana, Ariel; Wells, Christina; Main, Dorrie; Vizzotto, Giannina; Silva, Herman; Salamini, Francesco; Schmutz, Jeremy; Morgante, Michele; Rokhsar, Daniel S

    2013-05-01

    Rosaceae is the most important fruit-producing clade, and its key commercially relevant genera (Fragaria, Rosa, Rubus and Prunus) show broadly diverse growth habits, fruit types and compact diploid genomes. Peach, a diploid Prunus species, is one of the best genetically characterized deciduous trees. Here we describe the high-quality genome sequence of peach obtained from a completely homozygous genotype. We obtained a complete chromosome-scale assembly using Sanger whole-genome shotgun methods. We predicted 27,852 protein-coding genes, as well as noncoding RNAs. We investigated the path of peach domestication through whole-genome resequencing of 14 Prunus accessions. The analyses suggest major genetic bottlenecks that have substantially shaped peach genome diversity. Furthermore, comparative analyses showed that peach has not undergone recent whole-genome duplication, and even though the ancestral triplicated blocks in peach are fragmentary compared to those in grape, all seven paleosets of paralogs from the putative paleoancestor are detectable.

  3. Genome size diversity in orchids: consequences and evolution

    PubMed Central

    Leitch, I. J.; Kahandawala, I.; Suda, J.; Hanson, L.; Ingrouille, M. J.; Chase, M. W.; Fay, M. F.

    2009-01-01

    Background The amount of DNA comprising the genome of an organism (its genome size) varies a remarkable 40 000-fold across eukaryotes, yet most groups are characterized by much narrower ranges (e.g. 14-fold in gymnosperms, 3- to 4-fold in mammals). Angiosperms stand out as one of the most variable groups with genome sizes varying nearly 2000-fold. Nevertheless within angiosperms the majority of families are characterized by genomes which are small and vary little. Species with large genomes are mostly restricted to a few monocot families including Orchidaceae. Scope A survey of the literature revealed that genome size data for Orchidaceae are comparatively rare representing just 327 species. Nevertheless they reveal that Orchidaceae are currently the most variable angiosperm family with genome sizes ranging 168-fold (1C = 0·33–55·4 pg). Analysing the data provided insights into the distribution, evolution and possible consequences to the plant of this genome size diversity. Conclusions Superimposing the data onto the increasingly robust phylogenetic tree of Orchidaceae revealed how different subfamilies were characterized by distinct genome size profiles. Epidendroideae possessed the greatest range of genome sizes, although the majority of species had small genomes. In contrast, the largest genomes were found in subfamilies Cypripedioideae and Vanilloideae. Genome size evolution within this subfamily was analysed as this is the only one with reasonable representation of data. This approach highlighted striking differences in genome size and karyotype evolution between the closely related Cypripedium, Paphiopedilum and Phragmipedium. As to the consequences of genome size diversity, various studies revealed that this has both practical (e.g. application of genetic fingerprinting techniques) and biological consequences (e.g. affecting where and when an orchid may grow) and emphasizes the importance of obtaining further genome size data given the considerable phylogenetic gaps which have been highlighted by the current study. PMID:19168860
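
    The "168-fold" figure quoted in this abstract follows directly from the reported 1C range. The sketch below reproduces that ratio; the helper name fold_range is hypothetical, and the only inputs are the two 1C values (in picograms) given in the abstract.

```python
def fold_range(smallest: float, largest: float) -> float:
    """Return how many times larger the largest value is than the smallest."""
    if smallest <= 0:
        raise ValueError("smallest value must be positive")
    return largest / smallest


# 1C genome sizes (pg) quoted for Orchidaceae in the abstract above.
ORCHID_MIN_1C_PG = 0.33
ORCHID_MAX_1C_PG = 55.4

if __name__ == "__main__":
    # Rounded to the nearest whole number, as fold ranges are quoted in the abstract.
    print(round(fold_range(ORCHID_MIN_1C_PG, ORCHID_MAX_1C_PG)))
```

    The same helper could be applied to any other minimum and maximum pair; only the Orchidaceae values come from the record itself.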

  4. Improved maize reference genome with single-molecule technologies.

    PubMed

    Jiao, Yinping; Peluso, Paul; Shi, Jinghua; Liang, Tiffany; Stitzer, Michelle C; Wang, Bo; Campbell, Michael S; Stein, Joshua C; Wei, Xuehong; Chin, Chen-Shan; Guill, Katherine; Regulski, Michael; Kumari, Sunita; Olson, Andrew; Gent, Jonathan; Schneider, Kevin L; Wolfgruber, Thomas K; May, Michael R; Springer, Nathan M; Antoniou, Eric; McCombie, W Richard; Presting, Gernot G; McMullen, Michael; Ross-Ibarra, Jeffrey; Dawe, R Kelly; Hastie, Alex; Rank, David R; Ware, Doreen

    2017-06-22

    Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate the determination of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model species, using single-molecule real-time sequencing and high-resolution optical mapping. Relative to the previous reference genome, our assembly features a 52-fold increase in contig length and notable improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed more than 130,000 intact transposable elements, allowing us to identify transposable element lineage expansions that are unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by single-molecule real-time sequencing. In addition, comparative optical mapping of two other inbred maize lines revealed a prevalence of deletions in regions of low gene density and maize lineage-specific genes.

  5. How much does the amphioxus genome represent the ancestor of chordates?

    PubMed

    Louis, Alexandra; Roest Crollius, Hugues; Robinson-Rechavi, Marc

    2012-03-01

    One of the main motivations to study amphioxus is its potential for understanding the last common ancestor of chordates, which notably gave rise to the vertebrates. An important feature in this respect is the slow evolutionary rate that seems to have characterized the cephalochordate lineage, making amphioxus an interesting proxy for the chordate ancestor, as well as a key lineage to include in comparative studies. Whereas slow evolution was first noticed at the phenotypic level, it has also been described at the genomic level. Here, we examine whether the amphioxus genome is indeed a good proxy for the genome of the chordate ancestor, with a focus on protein-coding genes. We investigate genome features, such as synteny, gene duplication and gene loss, and contrast the amphioxus genome with those of other deuterostomes that are used in comparative studies, such as Ciona, Oikopleura and urchin.

  6. PLAZA 3.0: an access point for plant comparative genomics.

    PubMed

    Proost, Sebastian; Van Bel, Michiel; Vaneechoutte, Dries; Van de Peer, Yves; Inzé, Dirk; Mueller-Roeber, Bernd; Vandepoele, Klaas

    2015-01-01

    Comparative sequence analysis has significantly altered our view on the complexity of genome organization and gene functions in different kingdoms. PLAZA 3.0 is designed to make comparative genomics data for plants available through a user-friendly web interface. Structural and functional annotation, gene families, protein domains, phylogenetic trees and detailed information about genome organization can easily be queried and visualized. Compared with the first version released in 2009, which featured nine organisms, the number of integrated genomes is more than four times higher, and now covers 37 plant species. The new species provide a wider phylogenetic range as well as a more in-depth sampling of specific clades, and genomes of additional crop species are present. The functional annotation has been expanded and now comprises data from Gene Ontology, MapMan, UniProtKB/Swiss-Prot, PlnTFDB and PlantTFDB. Furthermore, we improved the algorithms to transfer functional annotation from well-characterized plant genomes to other species. The additional data and new features make PLAZA 3.0 (http://bioinformatics.psb.ugent.be/plaza/) a versatile and comprehensible resource for users wanting to explore genome information to study different aspects of plant biology, both in model and non-model organisms. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform

    PubMed Central

    Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-chen; Paterson, Andrew H.

    2017-01-01

    Mainly due to their economic importance, genomes of 10 legumes, including soybean (Glycine max), wild peanut (Arachis duranensis and Arachis ipaensis), and barrel medic (Medicago truncatula), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape (Vitis vinifera) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploidy nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. Besides, we showed significantly divergent evolutionary levels among legumes and, by performing synonymous nucleotide substitutions at synonymous sites correction, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of legume comparative genomics analysis that we performed; more information was stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). PMID:28325848

  8. Comparative genomics of pyridoxal 5′-phosphate-dependent transcription factor regulons in Bacteria

    PubMed Central

    Suvorova, Inna A.

    2016-01-01

    The MocR-subfamily transcription factors (MocR-TFs) characterized by the GntR-family DNA-binding domain and aminotransferase-like sensory domain are broadly distributed among certain lineages of Bacteria. Characterized MocR-TFs bind pyridoxal 5′-phosphate (PLP) and control transcription of genes involved in PLP, gamma aminobutyric acid (GABA) and taurine metabolism via binding specific DNA operator sites. To identify putative target genes and DNA binding motifs of MocR-TFs, we performed comparative genomics analysis of over 250 bacterial genomes. The reconstructed regulons for 825 MocR-TFs comprise structural genes from over 200 protein families involved in diverse biological processes. Using the genome context and metabolic subsystem analysis we tentatively assigned functional roles for 38 out of 86 orthologous groups of studied regulators. Most of these MocR-TF regulons are involved in PLP metabolism, as well as utilization of GABA, taurine and ectoine. The remaining studied MocR-TF regulators presumably control genes encoding enzymes involved in reduction/oxidation processes, various transporters and PLP-dependent enzymes, for example aminotransferases. Predicted DNA binding motifs of MocR-TFs are generally similar in each orthologous group and are characterized by two to four repeated sequences. Identified motifs were classified according to their structures. Motifs with direct and/or inverted repeat symmetry constitute the majority of inferred DNA motifs, suggesting preferable TF dimerization in head-to-tail or head-to-head configuration. The obtained genomic collection of in silico reconstructed MocR-TF motifs and regulons in Bacteria provides a basis for future experimental characterization of molecular mechanisms for various regulators in this family. PMID:28348826

  9. Development of a fluorescence-activated cell sorting method coupled with whole genome amplification to analyze minority and trace Dehalococcoides genomes in microbial communities.

    PubMed

    Lee, Patrick K H; Men, Yujie; Wang, Shanquan; He, Jianzhong; Alvarez-Cohen, Lisa

    2015-02-03

    Dehalococcoides mccartyi are functionally important bacteria that catalyze the reductive dechlorination of chlorinated ethenes. However, these anaerobic bacteria are fastidious to isolate, making downstream genomic characterization challenging. In order to facilitate genomic analysis, a fluorescence-activated cell sorting (FACS) method was developed in this study to separate D. mccartyi cells from a microbial community, and the DNA of the isolated cells was processed by whole genome amplification (WGA) and hybridized onto a D. mccartyi microarray for comparative genomics against four sequenced strains. First, FACS was successfully applied to a D. mccartyi isolate as positive control, and then microarray results verified that WGA from 10^6 cells or ∼1 ng of genomic DNA yielded high-quality coverage detecting nearly all genes across the genome. As expected, some inter- and intrasample variability in WGA was observed, but these biases were minimized by performing multiple parallel amplifications. Subsequent application of the FACS and WGA protocols to two enrichment cultures containing ∼10% and ∼1% D. mccartyi cells successfully enabled genomic analysis. As proof of concept, this study demonstrates that coupling FACS with WGA and microarrays is a promising tool to expedite genomic characterization of target strains in environmental communities where the relative concentrations are low.

  10. A genomic view of food-related and probiotic Enterococcus strains

    PubMed Central

    Suárez, Nadia; Hormigo, Ricardo; Fadda, Silvina; Saavedra, Lucila

    2017-01-01

    The study of enterococcal genomes has grown considerably in recent years. While special attention is paid to comparative genomic analysis among clinically relevant isolates, in this study we performed an exhaustive comparative analysis of enterococcal genomes of food origin and/or with potential to be used as probiotics. Beyond common genetic features, we especially aimed to identify those that are specific to enterococcal strains isolated from a certain food-related source as well as features present in a species-specific manner. Thus, the genome sequences of 25 Enterococcus strains, from 7 different species, were examined and compared. Their phylogenetic relationship was reconstructed based on orthologous proteins and whole genomes. Likewise, markers associated with successful colonization (bacteriocin genes and genomic islands) and genome plasticity (phages and clustered regularly interspaced short palindromic repeats) were investigated for lifestyle-specific genetic features. At the same time, a search for antibiotic resistance genes was carried out, since they are of great concern in the food industry. Finally, it was possible to locate 1617 FIGfam families as a core proteome universally present among the genera and to determine that most of the accessory genes code for hypothetical proteins, providing reasonable hints to support their functional characterization. PMID:27773878

  11. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    PubMed

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of concern about formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C · G > T · A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes.

  12. Tracing phylogenomic events leading to diversity of Haemophilus influenzae and the emergence of Brazilian Purpuric Fever (BPF)-associated clones

    PubMed Central

    Papazisi, Leka; Ratnayake, Shashikala; Remortel, Brian G.; Bock, Geoffrey R.; Liang, Wei; Saeed, Alexander I.; Liu, Jia; Fleischmann, Robert D.; Kilian, Mogens; Peterson, Scott N.

    2010-01-01

    Here we report the use of a multi-genome DNA microarray to elucidate the genomic events associated with the emergence of the clonal variants of H. influenzae biogroup aegyptius causing Brazilian Purpuric Fever (BPF), an important pediatric disease with a high mortality rate. We performed directed genome sequencing of strain HK1212 unique loci to construct a species DNA microarray. Comparative genome hybridization using this microarray enabled us to determine and compare gene complements, and infer reliable phylogenomic relationships among members of the species. The higher genomic variability observed in the genomes of BPF-related strains (clones) and their close relatives may be characterized by significant gene flux related to a subset of functional role categories. We found that the acquisition of a large number of virulence determinants featuring numerous cell membrane proteins coupled to the loss of genes involved in transport, central biosynthetic pathways and in particular, energy production pathways to be characteristics of the BPF genomic variants. PMID:20654709

  13. Comparative Genomic Analysis Indicates that Niche Adaptation of Terrestrial Flavobacteria Is Strongly Linked to Plant Glycan Metabolism

    PubMed Central

    Kolton, Max; Sela, Noa; Elad, Yigal; Cytryn, Eddie

    2013-01-01

    Flavobacteria are important members of aquatic and terrestrial bacterial communities, displaying extreme variations in lifestyle, geographical distribution and genome size. They are ubiquitous in soil, but are often strongly enriched in the rhizosphere and phyllosphere of plants. In this study, we compared the genome of a root-associated Flavobacterium that we recently isolated, physiologically characterized and sequenced, to 14 additional Flavobacterium genomes, in order to pinpoint characteristics associated with its high abundance in the rhizosphere. Interestingly, flavobacterial genomes vary in size by approximately two-fold, with terrestrial isolates having predominantly larger genomes than those from aquatic environments. Comparative functional gene analysis revealed that terrestrial and aquatic Flavobacteria generally segregated into two distinct clades. Members of the aquatic clade had a higher ratio of peptide and protein utilization genes, whereas members of the terrestrial clade were characterized by a significantly higher abundance and diversity of genes involved in metabolism of carbohydrates such as xylose, arabinose and pectin. Interestingly, genes encoding glycoside hydrolase (GH) families GH78 and GH106, responsible for rhamnogalacturonan utilization (exclusively associated with terrestrial plant hemicelluloses), were only present in terrestrial clade genomes, suggesting adaptation of the terrestrial strains to plant-related carbohydrate metabolism. The Peptidase/GH ratio of aquatic clade Flavobacteria was significantly higher than that of terrestrial strains (1.7±0.7 and 9.7±4.7, respectively), supporting the concept that this relation can be used to infer Flavobacterium lifestyles. Collectively, our research suggests that terrestrial Flavobacteria are highly adapted to plant carbohydrate metabolism, which appears to be a key to their profusion in plant environments. PMID:24086761

  14. Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System.

    PubMed

    Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

    2017-01-01

    Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from model to non-model organisms and insights can be gained into the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon, as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species.

  15. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    PubMed Central

    2011-01-01

    Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results: A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models. PMID:21542930

  16. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    PubMed

    Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

    2011-05-04

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models.

  17. Development of eSSR-Markers in Setaria italica and Their Applicability in Studying Genetic Diversity, Cross-Transferability and Comparative Mapping in Millet and Non-Millet Species

    PubMed Central

    Misra, Gopal; Gupta, Sarika; Subramanian, Alagesan; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

    2013-01-01

    Foxtail millet (Setaria italica L.) is a tractable experimental model crop for studying functional genomics of millets and bioenergy grasses. But the limited availability of genomic resources, particularly expressed sequence-based genic markers, is significantly impeding its genetic improvement. Considering this, we attempted to develop EST-derived-SSR (eSSR) markers and utilize them in germplasm characterization, cross-genera transferability and in silico comparative mapping. From 66,027 foxtail millet EST sequences 24,828 non-redundant ESTs were deduced, representing ~16 Mb, which revealed 534 (~2%) eSSRs in 495 SSR-containing ESTs at a frequency of 1/30 kb. A total of 447 primer pairs were successfully designed, of which 327 were mapped physically onto nine chromosomes. About 106 selected primer pairs representing the foxtail millet genome showed a high level of cross-genera amplification at an average of ~88% in eight millets and four non-millet species. The broad range of genetic diversity (0.02–0.65) obtained in the phylogenetic tree constructed using 40 eSSR markers demonstrated their utility in germplasm characterization and phylogenetics. Comparative mapping of physically mapped eSSR markers showed a considerable proportion of sequence-based orthology and syntenic relationship between foxtail millet chromosomes and sorghum (~68%), maize (~61%) and rice (~42%) chromosomes. Synteny analysis of eSSRs of foxtail millet, rice, maize and sorghum suggested the nested chromosome fusion frequently observed in grass genomes. Thus, for the first time we have generated large-scale eSSR markers in foxtail millet and demonstrated their utility in germplasm characterization, transferability, phylogenetics and comparative mapping studies in millets and bioenergy grass species. PMID:23805325
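
    The marker frequencies quoted in the record above can be checked with simple arithmetic. The sketch below is only an illustration, not the authors' pipeline: the variable and function names are hypothetical, and "~16 Mb" is assumed to mean 16,000 kb.

```python
# Figures quoted in the abstract; names are illustrative, not from the paper.
NONREDUNDANT_ESTS = 24_828      # non-redundant ESTs
NONREDUNDANT_SPAN_KB = 16_000   # "~16 Mb", assumed here to be 16,000 kb
ESSR_COUNT = 534                # eSSRs detected
PRIMER_PAIRS_DESIGNED = 447
PRIMER_PAIRS_MAPPED = 327


def kb_per_ssr(span_kb: float, n_ssr: int) -> float:
    """Average sequence length (kb) per detected SSR."""
    return span_kb / n_ssr


def fraction(part: int, whole: int) -> float:
    """Simple proportion, kept as a function for readability."""
    return part / whole


spacing_kb = kb_per_ssr(NONREDUNDANT_SPAN_KB, ESSR_COUNT)
essr_share = fraction(ESSR_COUNT, NONREDUNDANT_ESTS)
mapped_share = fraction(PRIMER_PAIRS_MAPPED, PRIMER_PAIRS_DESIGNED)

print(f"one eSSR per {spacing_kb:.1f} kb of EST sequence")
print(f"eSSRs relative to non-redundant ESTs: {essr_share:.1%}")
print(f"designed primer pairs physically mapped: {mapped_share:.1%}")
```

    Under these assumptions the spacing comes out close to the reported 1/30 kb and the eSSR share close to the reported ~2%; the mapped share is derived here and is not stated in the abstract.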

  18. Development of eSSR-Markers in Setaria italica and Their Applicability in Studying Genetic Diversity, Cross-Transferability and Comparative Mapping in Millet and Non-Millet Species.

    PubMed

    Kumari, Kajal; Muthamilarasan, Mehanathan; Misra, Gopal; Gupta, Sarika; Subramanian, Alagesan; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

    2013-01-01

    Foxtail millet (Setaria italica L.) is a tractable experimental model crop for studying functional genomics of millets and bioenergy grasses. But the limited availability of genomic resources, particularly expressed sequence-based genic markers, is significantly impeding its genetic improvement. Considering this, we attempted to develop EST-derived-SSR (eSSR) markers and utilize them in germplasm characterization, cross-genera transferability and in silico comparative mapping. From 66,027 foxtail millet EST sequences 24,828 non-redundant ESTs were deduced, representing ~16 Mb, which revealed 534 (~2%) eSSRs in 495 SSR-containing ESTs at a frequency of 1/30 kb. A total of 447 primer pairs were successfully designed, of which 327 were mapped physically onto nine chromosomes. About 106 selected primer pairs representing the foxtail millet genome showed a high level of cross-genera amplification at an average of ~88% in eight millets and four non-millet species. The broad range of genetic diversity (0.02-0.65) obtained in the phylogenetic tree constructed using 40 eSSR markers demonstrated their utility in germplasm characterization and phylogenetics. Comparative mapping of physically mapped eSSR markers showed a considerable proportion of sequence-based orthology and syntenic relationship between foxtail millet chromosomes and sorghum (~68%), maize (~61%) and rice (~42%) chromosomes. Synteny analysis of eSSRs of foxtail millet, rice, maize and sorghum suggested the nested chromosome fusion frequently observed in grass genomes. Thus, for the first time we have generated large-scale eSSR markers in foxtail millet and demonstrated their utility in germplasm characterization, transferability, phylogenetics and comparative mapping studies in millets and bioenergy grass species.

  19. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

  20. Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles

    PubMed Central

    Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-yu; Zhang, Xiao-mei; Song, Da-feng; Zhang, Chen

    2016-01-01

    Objective: In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. Methods: The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). Results: We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Conclusions: Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate. PMID:27487802

  1. Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles.

    PubMed

    Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-Yu; Zhang, Xiao-Mei; Song, Da-Feng; Zhang, Chen

    2016-08-01

    In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate.

  2. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

    PubMed Central

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-01-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  3. Genomic characterization of a Helicobacter pylori isolate from a patient with gastric cancer in China

    PubMed Central

    2014-01-01

    Background: Helicobacter pylori is well known for its relationship with the occurrence of several severe gastric diseases. The mechanisms of pathogenesis triggered by H. pylori are less well known. In this study, we report the genome sequence and genomic characterizations of H. pylori strain HLJ039 that was isolated from a patient with gastric cancer in the Chinese province of Heilongjiang, where there is a high incidence of gastric cancer. To investigate potential genomic features that may be involved in pathogenesis of carcinoma, the genome was compared to three previously sequenced genomes in this area. Result: We obtained 42 contigs with a total length of 1,611,192 bp and predicted 1,687 coding sequences. Compared to strains isolated from gastritis and ulcers in this area, 10 different regions were identified as being unique for HLJ039; they mainly encoded type II restriction-modification enzyme, type II m6A methylase, DNA-cytosine methyltransferase, DNA methylase, and hypothetical proteins. A unique 547-bp fragment sharing 93% identity with a hypothetical protein of Helicobacter cinaedi ATCC BAA-847 was not present in any other previous H. pylori strains. Phylogenetic analysis based on core genome single nucleotide polymorphisms shows that HLJ039 is defined as hspEAsia subgroup, which belongs to the hpEastAsia group. Conclusion: DNA methylations, variations of the genomic regions involved in restriction and modification systems, are the “hot” regions that may be related to the mechanism of H. pylori-induced gastric cancer. The genome sequence will provide useful information for the deep mining of potential mechanisms related to East Asian gastric cancer. PMID:24565107

  4. Mitochondrial pathogenic mutations are population-specific.

    PubMed

    Breen, Michael S; Kondrashov, Fyodor A

    2010-12-31

    Surveying deleterious variation in human populations is crucial for our understanding, diagnosis and potential treatment of human genetic pathologies. A number of recent genome-wide analyses focused on the prevalence of segregating deleterious alleles in the nuclear genome. However, such studies have not been conducted for the mitochondrial genome. We present a systematic survey of polymorphisms in the human mitochondrial genome, including those predicted to be deleterious and those that correspond to known pathogenic mutations. Analyzing 4458 completely sequenced mitochondrial genomes we characterize the genetic diversity of different types of single nucleotide polymorphisms (SNPs) in African (L haplotypes) and non-African (M and N haplotypes) populations. We find that the overall level of polymorphism is higher in the mitochondrial compared to the nuclear genome, although the mitochondrial genome appears to be under stronger selection as indicated by proportionally fewer nonsynonymous than synonymous substitutions. The African mitochondrial genomes show higher heterozygosity, a greater number of polymorphic sites and higher frequencies of polymorphisms for synonymous, benign and damaging polymorphism than non-African genomes. However, African genomes carry significantly fewer SNPs that have been previously characterized as pathogenic compared to non-African genomes. Finding SNPs classified as pathogenic to be the only category of polymorphisms that are more abundant in non-African genomes is best explained by a systematic ascertainment bias that favours the discovery of pathogenic polymorphisms segregating in non-African populations. This further suggests that, contrary to the common disease-common variant hypothesis, pathogenic mutations are largely population-specific and different SNPs may be associated with the same disease in different populations. Therefore, to obtain a comprehensive picture of the deleterious variability in the human population, as well as to improve the diagnostics of individuals carrying African mitochondrial haplotypes, it is necessary to survey different populations independently. This article was reviewed by Dr Mikhail Gelfand, Dr Vasily Ramensky (nominated by Dr Eugene Koonin) and Dr David Rand (nominated by Dr Laurence Hurst).

  5. Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

    PubMed Central

    Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia

    2011-01-01

    Background: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358

  6. Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

    PubMed

    Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

    2016-12-01

    In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  7. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world

    PubMed Central

    Koonin, Eugene V.; Wolf, Yuri I.

    2008-01-01

    The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often, distant organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution. PMID:18948295
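
    The doubling times quoted in this abstract describe exponential growth, which can be made concrete with a small calculation. The sketch below is only an illustration: the function name and the ten-year (120-month) horizon are assumptions chosen for the example and do not come from the review. With doubling time T, the growth factor after t months is 2^(t/T).

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Return the multiplicative increase after `months` of exponential growth."""
    return 2.0 ** (months / doubling_time_months)


# Doubling times are taken from the abstract; the 120-month horizon is illustrative.
bacteria = growth_factor(120, 20)
archaea = growth_factor(120, 34)
print(f"bacteria: x{bacteria:.1f}, archaea: x{archaea:.1f}")
```

    Under these assumptions, a decade of sequencing at the stated rates would multiply the bacterial genome count roughly 64-fold and the archaeal count between 11- and 12-fold.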

  8. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    PubMed

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single DNA molecule in real time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is an effective tool not only in the basic biological sciences but also in the medical/clinical setting.

  9. Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform.

    PubMed

    Wang, Jinpeng; Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Wang, Zhenyi; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Jin, Dianchuan; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Cheng, Rui; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-Chen; Paterson, Andrew H; Wang, Xiyin

    2017-05-01

    Mainly due to their economic importance, genomes of 10 legumes, including soybean (Glycine max), wild peanut (Arachis duranensis and Arachis ipaensis), and barrel medic (Medicago truncatula), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape (Vitis vinifera) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploidy nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. Besides, we showed significantly divergent evolutionary levels among legumes and, by performing synonymous nucleotide substitutions at synonymous sites correction, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of legume comparative genomics analysis that we performed; more information was stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). © 2017 American Society of Plant Biologists. All Rights Reserved.

  10. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    PubMed Central

    Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

    2003-01-01

    Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626

  11. Comparative scaffolding and gap filling of ancient bacterial genomes applied to two ancient Yersinia pestis genomes

    PubMed Central

    Doerr, Daniel; Chauve, Cedric

    2017-01-01

    Yersinia pestis is the causative agent of the bubonic plague, a disease responsible for several dramatic historical pandemics. Progress in ancient DNA (aDNA) sequencing rendered possible the sequencing of whole genomes of important human pathogens, including the ancient Y. pestis strains responsible for outbreaks of the bubonic plague in London in the 14th century and in Marseille in the 18th century, among others. However, aDNA sequencing data are still characterized by short reads and non-uniform coverage, so assembling ancient pathogen genomes remains challenging and often prevents a detailed study of genome rearrangements. It has recently been shown that comparative scaffolding approaches can improve the assembly of ancient Y. pestis genomes at a chromosome level. In the present work, we address the last step of genome assembly, the gap-filling stage. We describe an optimization-based method AGapEs (ancestral gap estimation) to fill in inter-contig gaps using a combination of a template obtained from related extant genomes and aDNA reads. We show how this approach can be used to refine comparative scaffolding by selecting contig adjacencies supported by a mix of unassembled aDNA reads and comparative signal. We applied our method to two Y. pestis data sets from the London and Marseille outbreaks, for which we obtained highly improved genome assemblies for both genomes, comprised of, respectively, five and six scaffolds with 95% of the assemblies supported by ancient reads. We analysed the genome evolution between both ancient genomes in terms of genome rearrangements, and observed a high level of synteny conservation between these strains. PMID:29114402

  12. Phylogenomic Insights into Mouse Evolution Using a Pseudoreference Approach

    PubMed Central

    Sarver, Brice A.J.; Keeble, Sara; Cosart, Ted; Tucker, Priscilla K.; Dean, Matthew D.

    2017-01-01

    Comparative genomic studies are now possible across a broad range of evolutionary timescales, but the generation and analysis of genomic data across many different species still present a number of challenges. The most sophisticated genotyping and down-stream analytical frameworks are still predominantly based on comparisons to high-quality reference genomes. However, established genomic resources are often limited within a given group of species, necessitating comparisons to divergent reference genomes that could restrict or bias comparisons across a phylogenetic sample. Here, we develop a scalable pseudoreference approach to iteratively incorporate sample-specific variation into a genome reference and reduce the effects of systematic mapping bias in downstream analyses. To characterize this framework, we used targeted capture to sequence whole exomes (∼54 Mbp) in 12 lineages (ten species) of mice spanning the Mus radiation. We generated whole exome pseudoreferences for all species and show that this iterative reference-based approach improved basic genomic analyses that depend on mapping accuracy while preserving the associated annotations of the mouse reference genome. We then use these pseudoreferences to resolve evolutionary relationships among these lineages while accounting for phylogenetic discordance across the genome, contributing an important resource for comparative studies in the mouse system. We also describe patterns of genomic introgression among lineages and compare our results to previous studies. Our general approach can be applied to whole or partitioned genomic data and is easily portable to any system with sufficient genomic resources, providing a useful framework for phylogenomic studies in mice and other taxa. PMID:28338821

  13. Chompy: an infestation of MITE-like repetitive elements in the crocodilian genome.

    PubMed

    Ray, David A; Hedges, Dale J; Herke, Scott W; Fowlkes, Justin D; Barnes, Erin W; LaVie, Daniel K; Goodwin, Lindsey M; Densmore, Llewellyn D; Batzer, Mark A

    2005-12-05

    Interspersed repeats are a major component of most eukaryotic genomes and have an impact on genome size and stability, but the repetitive element landscape of crocodilian genomes has not yet been fully investigated. In this report, we provide the first detailed characterization of an interspersed repeat element in any crocodilian genome. Chompy is a putative miniature inverted-repeat transposable element (MITE) family initially recovered from the genome of Alligator mississippiensis (American alligator) but also present in the genomes of Crocodylus moreletii (Morelet's crocodile) and Gavialis gangeticus (Indian gharial). The element has all of the hallmarks of MITEs including terminal inverted repeats, possible target site duplications, and a tendency to form secondary structures. We estimate the copy number in the alligator genome to be approximately 46,000 copies. As a result of their size and unique properties, Chompy elements may provide a useful source of genomic variation for crocodilian comparative genomics.

  14. Complete Genome Sequences, before and after Mammalian Cell Culture, of Zika Virus Isolated from the Serum of a Symptomatic Male Patient from Oaxaca, Mexico.

    PubMed

    Boukadida, Celia; Torres-Flores, Jesús M; Yocupicio-Monroy, Martha; Piten-Isidro, Elvira; Rivero-Arrieta, Amaranta Y; Luna-Villalobos, Yara A; Martínez-Vargas, Liliane; Alcaraz-Estrada, Sofía L; Torres, Klintsy J; Lira, Rosalia; Reyes-Terán, Gustavo; Sevilla-Reyes, Edgar E

    2017-03-23

    Zika virus (ZIKV) is an emerging arthropod-borne flavivirus associated with severe congenital malformations and neurological complications. Although the ZIKV genome is well characterized, there is limited information regarding changes after cell isolation and culture adaptation. We isolated, and passaged in Vero cells, ZIKV from the serum of a symptomatic male patient and compared the viral genomes before and after culture. Single nucleotide polymorphisms were characteristic among serum-circulating genomes, while such diversity decreased after cell culture. Copyright © 2017 Boukadida et al.

  15. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    PubMed

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA biosynthesis and disease resistance. The study demonstrated the advantages of having an integrated approach to gene prediction and developed a computational framework for combining multiple genome annotations. These results, available in the oil palm annotation database (http://palmxplore.mpob.gov.my), will provide important resources for studies on the genomes of oil palm and related crops. This article was reviewed by Alexander Kel, Igor Rogozin, and Vladimir A. Kuznetsov.
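
    The abstract defines GC3 as the fraction of cytosine and guanine in the third position of a codon and quotes a cutoff of 0.75286 for GC3-rich genes. A minimal sketch of that calculation follows. It is not the authors' pipeline: the function names and the toy sequence are hypothetical, and details such as whether the stop codon is counted are assumptions made for illustration.

```python
GC3_RICH_THRESHOLD = 0.75286  # cutoff quoted in the abstract


def gc3(cds: str) -> float:
    """Fraction of complete codons whose third base is G or C."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    third_bases = seq[2:usable:3]     # every third base, starting at codon position 3
    if not third_bases:
        raise ValueError("sequence has no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def is_gc3_rich(cds: str) -> bool:
    """Apply the abstract's threshold to a single coding sequence."""
    return gc3(cds) >= GC3_RICH_THRESHOLD


toy_cds = "ATGGCCGCGTTCTAA"  # made-up sequence, not an oil palm gene
print(gc3(toy_cds), is_gc3_rich(toy_cds))
```

    In the toy sequence, four of the five codons end in G or C, so its GC3 is 0.8 and it would count as GC3-rich under the quoted cutoff.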

  16. A genomic view of food-related and probiotic Enterococcus strains.

    PubMed

    Bonacina, Julieta; Suárez, Nadia; Hormigo, Ricardo; Fadda, Silvina; Lechner, Marcus; Saavedra, Lucila

    2017-02-01

    The study of enterococcal genomes has grown considerably in recent years. While special attention is paid to comparative genomic analysis among clinically relevant isolates, in this study we performed an exhaustive comparative analysis of enterococcal genomes of food origin and/or with potential to be used as probiotics. Beyond common genetic features, we especially aimed to identify those that are specific to enterococcal strains isolated from a certain food-related source as well as features present in a species-specific manner. Thus, the genome sequences of 25 Enterococcus strains, from 7 different species, were examined and compared. Their phylogenetic relationship was reconstructed based on orthologous proteins and whole genomes. Likewise, markers associated with a successful colonization (bacteriocin genes and genomic islands) and genome plasticity (phages and clustered regularly interspaced short palindromic repeats) were investigated for lifestyle specific genetic features. At the same time, a search for antibiotic resistance genes was carried out, since they are of major concern in the food industry. Finally, it was possible to locate 1617 FIGfam families as a core proteome universally present among the genera and to determine that most of the accessory genes code for hypothetical proteins, providing reasonable hints to support their functional characterization. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  17. Characterization and Comparative Analysis of the Complete Chloroplast Genome of the Critically Endangered Species Streptocarpus teitensis (Gesneriaceae).

    PubMed

    Kyalo, Cornelius M; Gichira, Andrew W; Li, Zhi-Zhong; Saina, Josphat K; Malombe, Itambo; Hu, Guang-Wan; Wang, Qing-Feng

    2018-01-01

    Streptocarpus teitensis (Gesneriaceae) is an endemic species listed as critically endangered in the International Union for Conservation of Nature (IUCN) red list of threatened species. However, the sequence and genome information of this species remains limited. In this article, we present the complete chloroplast genome structure of Streptocarpus teitensis and its evolution inferred through comparative studies with other related species. S. teitensis displayed a chloroplast genome size of 153,207 bp, sheltering a pair of inverted repeats (IR) of 25,402 bp each split by small and large single-copy (SSC and LSC) regions of 18,300 and 84,103 bp, respectively. The chloroplast genome was observed to contain 116 unique genes, of which 80 are protein-coding, 32 are transfer RNAs, and four are ribosomal RNAs. In addition, a total of 196 SSR markers were detected in the chloroplast genome of Streptocarpus teitensis with mononucleotides (57.1%) being the majority, followed by trinucleotides (33.2%) and dinucleotides and tetranucleotides (both 4.1%), and pentanucleotides being the least (1.5%). Genome alignment indicated that this genome was comparable to other sequenced members of order Lamiales. The phylogenetic analysis suggested that Streptocarpus teitensis is closely related to Lysionotus pauciflorus and Dorcoceras hygrometricum.
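
    The figures reported in this abstract can be cross-checked with simple arithmetic: a quadripartite chloroplast genome consists of one LSC, one SSC and two IR copies, so those lengths should sum to the total genome size. The short sketch below performs that check, along with the gene-count and SSR-percentage totals. All numbers are taken from the abstract; the variable names are illustrative only.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 84_103
SSC = 18_300
IR = 25_402               # one inverted-repeat copy; the genome carries two
REPORTED_TOTAL = 153_207

computed_total = LSC + SSC + 2 * IR
print(computed_total == REPORTED_TOTAL)

# Unique genes by class should add up to the reported 116.
genes = {"protein_coding": 80, "tRNA": 32, "rRNA": 4}
print(sum(genes.values()) == 116)

# SSR classes as percentages of the 196 markers; rounding may leave small residue.
ssr_percent = {"mono": 57.1, "tri": 33.2, "di": 4.1, "tetra": 4.1, "penta": 1.5}
print(round(sum(ssr_percent.values()), 1))
```

    All three checks are consistent with the reported values: the region lengths sum to 153,207 bp, the gene classes sum to 116, and the SSR percentages sum to 100.0.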

  18. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  19. Genomic and probiotic characterization of SJP-SNU strain of Pichia kudriavzevii.

    PubMed

    Hong, Seung-Min; Kwon, Hyuk-Joon; Park, Se-Joon; Seong, Won-Jin; Kim, Ilhwan; Kim, Jae-Hong

    2018-05-17

    The yeast strain SJP-SNU was investigated as a probiotic and was characterized with respect to growth temperature, bile salt resistance, hydrogen sulfide reducing activity, intestinal survival ability and chicken embryo pathogenicity. In addition, we determined the complete genomic and mitochondrial sequences of SJP-SNU and conducted comparative genomics analyses. SJP-SNU grew rapidly at 37 °C and formed colonies on MacConkey agar containing bile salt. SJP-SNU reduced hydrogen sulfide produced by Salmonella serotype Enteritidis and, after being fed to 4-week-old chickens, could be isolated from cecal feces. SJP-SNU did not cause mortality in 10-day-old chicken embryos. From 13 initial contigs, 11 were finally assembled and represented 10 chromosomal sequences and 1 mitochondrial DNA sequence. Comparative genomic analyses revealed that SJP-SNU was a strain of Pichia kudriavzevii. Although SJP-SNU possesses pathogenicity-related genes, they showed very low amino acid sequence identities to those of Candida albicans. Furthermore, SJP-SNU possessed useful genes, such as phytases and cellulase. Thus, SJP-SNU is a useful yeast possessing the basic traits of a probiotic, and further studies to demonstrate its efficacy as a probiotic in the future may be warranted.

  20. A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy

    PubMed Central

    Brenton, Zachary W.; Cooper, Elizabeth A.; Myers, Mathew T.; Boyles, Richard E.; Shakoor, Nadia; Zielinski, Kelsey J.; Rauh, Bradley L.; Bridges, William C.; Morris, Geoffrey P.; Kresovich, Stephen

    2016-01-01

    With high productivity and stress tolerance, numerous grass genera of the Andropogoneae have emerged as candidates for bioenergy production. To optimize these candidates, research examining the genetic architecture of yield, carbon partitioning, and composition is required to advance breeding objectives. Significant progress has been made developing genetic and genomic resources for Andropogoneae, and advances in comparative and computational genomics have enabled research examining the genetic basis of photosynthesis, carbon partitioning, composition, and sink strength. To provide a pivotal resource aimed at developing a comparative understanding of key bioenergy traits in the Andropogoneae, we have established and characterized an association panel of 390 racially, geographically, and phenotypically diverse Sorghum bicolor accessions with 232,303 genetic markers. Sorghum bicolor was selected because of its genomic simplicity, phenotypic diversity, significant genomic tools, and its agricultural productivity and resilience. We have demonstrated the value of sorghum as a functional model for candidate gene discovery for bioenergy Andropogoneae by performing genome-wide association analysis for two contrasting phenotypes representing key components of structural and non-structural carbohydrates. We identified potential genes, including a cellulase enzyme and a vacuolar transporter, associated with increased non-structural carbohydrates that could lead to bioenergy sorghum improvement. Although our analysis identified genes with potentially clear functions, other candidates did not have assigned functions, suggesting novel molecular mechanisms for carbon partitioning traits. 
    These results, combined with our characterization of phenotypic and genetic diversity and the public accessibility of each accession and genomic data, demonstrate the value of this resource and provide a foundation for future improvement of sorghum and related grasses for bioenergy production. PMID:27356613

  1. Identification of Brucella melitensis Rev.1 vaccine-strain genetic markers: Towards understanding the molecular mechanism behind virulence attenuation.

    PubMed

    Issa, Mohammad Nouh; Ashhab, Yaqoub

    2016-09-22

    Brucella melitensis Rev.1 is an avirulent strain that is widely used as a live vaccine to control brucellosis in small ruminants. Although an assembled draft version of Rev.1 genome has been available since 2009, this genome has not been investigated to characterize this important vaccine. In the present work, we used the draft genome of Rev.1 to perform a thorough genomic comparison and sequence analysis to identify and characterize the panel of its unique genetic markers. The draft genome of Rev.1 was compared with genome sequences of 36 different Brucella melitensis strains from the Brucella project of the Broad Institute of MIT and Harvard. The comparative analyses revealed 32 genetic alterations (30 SNPs, 1 single-bp insertion and 1 single-bp deletion) that are exclusively present in the Rev.1 genome. In silico analyses showed that 9 out of the 17 non-synonymous mutations are deleterious. Three ABC transporters are among the disrupted genes that can be linked to virulence attenuation. Out of the 32 mutations, 11 Rev.1 specific markers were selected to test their potential to discriminate Rev.1 using a bi-directional allele-specific PCR assay. Six markers were able to distinguish between Rev.1 and a set of control strains. We succeeded in identifying a panel of 32 genome-specific markers of the B. melitensis Rev.1 vaccine strain. Extensive in silico analysis showed that a considerable number of these mutations could severely affect the function of the associated genes. In addition, some of the discovered markers were able to discriminate Rev.1 strain from a group of control strains using practical PCR tests that can be applied in resource-limited settings. Copyright © 2016 Elsevier Ltd. All rights reserved.
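
    The comparison described above (keeping only alterations found in Rev.1 and in none of the 36 other strains) amounts to a set difference. The following is a minimal sketch of that idea, not the authors' pipeline: the function name, the tuple layout (contig, position, reference allele, alternate allele), and the toy variants are all hypothetical.

```python
def exclusive_variants(target, others):
    """Return the variants in `target` that occur in none of the other strains.

    `target` is a set of (contig, position, ref, alt) tuples (hypothetical layout);
    `others` maps a strain name to a set of the same kind of tuples.
    """
    seen_elsewhere = set().union(*others.values())
    return target - seen_elsewhere


# Toy data, invented for illustration only.
vaccine_strain = {
    ("chr1", 100, "A", "G"),   # SNP
    ("chr1", 250, "C", "T"),   # SNP shared with strain_a
    ("chr2", 40, "G", "GA"),   # single-bp insertion
}
comparison_strains = {
    "strain_a": {("chr1", 250, "C", "T")},
    "strain_b": {("chr1", 900, "T", "C")},
}

markers = exclusive_variants(vaccine_strain, comparison_strains)
print(sorted(markers))
```

    In this toy case two of the three variants survive the filter; in practice the candidate markers would still need experimental confirmation, as the abstract describes with its PCR assay.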

  2. Comparative genome analysis of Prevotella ruminicola and Prevotella bryantii: insights into their environmental niche.

    PubMed

    Purushe, Janaki; Fouts, Derrick E; Morrison, Mark; White, Bryan A; Mackie, Roderick I; Coutinho, Pedro M; Henrissat, Bernard; Nelson, Karen E

    2010-11-01

    The Prevotellas comprise a diverse group of bacteria that has received surprisingly limited attention at the whole genome-sequencing level. In this communication, we present the comparative analysis of the genomes of Prevotella ruminicola 23 (GenBank: CP002006) and Prevotella bryantii B(1)4 (GenBank: ADWO00000000), two gastrointestinal isolates. Both P. ruminicola and P. bryantii have acquired an extensive repertoire of glycoside hydrolases that are targeted towards non-cellulosic polysaccharides, especially GH43 bifunctional enzymes. Our analysis demonstrates the diversity of this genus. The results from these analyses highlight their role in the gastrointestinal tract, and provide a template for additional work on genetic characterization of these species.

  3. Tracing phylogenomic events leading to diversity of Haemophilus influenzae and the emergence of Brazilian Purpuric Fever (BPF)-associated clones.

    PubMed

    Papazisi, Leka; Ratnayake, Shashikala; Remortel, Brian G; Bock, Geoffrey R; Liang, Wei; Saeed, Alexander I; Liu, Jia; Fleischmann, Robert D; Kilian, Mogens; Peterson, Scott N

    2010-11-01

    Here we report the use of a multi-genome DNA microarray to elucidate the genomic events associated with the emergence of the clonal variants of Haemophilus influenzae biogroup aegyptius causing Brazilian Purpuric Fever (BPF), an important pediatric disease with a high mortality rate. We performed directed genome sequencing of strain HK1212 unique loci to construct a species DNA microarray. Comparative genome hybridization using this microarray enabled us to determine and compare gene complements, and infer reliable phylogenomic relationships among members of the species. The higher genomic variability observed in the genomes of BPF-related strains (clones) and their close relatives may be characterized by significant gene flux related to a subset of functional role categories. We found that the acquisition of a large number of virulence determinants, featuring numerous cell membrane proteins, coupled with the loss of genes involved in transport, central biosynthetic pathways and, in particular, energy production pathways, is characteristic of the BPF genomic variants. Copyright © 2010 Elsevier Inc. All rights reserved.

  4. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

    PubMed

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-07-20

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
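
    The AT-richness figure quoted above (∼64.3%) is a simple base-composition statistic. As a hedged illustration of how such a value could be computed from an assembled sequence, the sketch below uses a hypothetical helper name and a made-up input string; it ignores ambiguous bases such as N, which is an assumption rather than anything stated in the abstract.

```python
def at_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are A or T."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "AT" for base in counted) / len(counted)


# Made-up sequence, for illustration only.
example = "AATTGCNNat"
print(f"AT content: {at_fraction(example):.1%}")
```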

  5. Exploiting genotyping by sequencing to characterize the genomic structure of the American cranberry through high-density linkage mapping.

    PubMed

    Covarrubias-Pazaran, Giovanny; Diaz-Garcia, Luis; Schlautman, Brandon; Deutsch, Joseph; Salazar, Walter; Hernandez-Ochoa, Miguel; Grygleski, Edward; Steffan, Shawn; Iorizzo, Massimo; Polashock, James; Vorsa, Nicholi; Zalapa, Juan

    2016-06-13

    The application of genotyping by sequencing (GBS) approaches, combined with data imputation methodologies, is narrowing the genetic knowledge gap between major and understudied, minor crops. GBS is an excellent tool to characterize the genomic structure of recently domesticated (~200 years) and understudied species, such as cranberry (Vaccinium macrocarpon Ait.), by generating large numbers of markers for genomic studies such as genetic mapping. We identified 10842 potentially mappable single nucleotide polymorphisms (SNPs) in a cranberry pseudo-testcross population; 5477 SNPs and 211 short sequence repeats (SSRs) were used to construct a high-density linkage map, on which a total of 4849 markers were mapped. Recombination frequency, linkage disequilibrium (LD), and segregation distortion at the genomic level in the parental and integrated linkage maps were characterized for the first time in cranberry. SSR markers, used as the backbone in the map, revealed high collinearity with previously published linkage maps. The 4849 point map consisted of twelve linkage groups spanning 1112 cM, which anchored 2381 nuclear scaffolds accounting for ~13 Mb of the estimated 470 Mb cranberry genome. Bin mapping identified 592 and 672 unique bins in the parentals and a total of 1676 unique marker positions in the integrated map. Synteny analyses comparing the order of anchored cranberry scaffolds to their homologous positions in kiwifruit, grape, and coffee genomes provided initial evidence of homology between cranberry and closely related species. GBS data was used to rapidly saturate the cranberry genome with markers in a pseudo-testcross population. Collinearity between the present saturated genetic map and previous cranberry SSR maps suggests that the SNP locations represent accurate marker order and chromosome structure of the cranberry genome. SNPs greatly improved current marker genome coverage, which allowed for genome-wide structure investigations such as segregation distortion, recombination, linkage disequilibrium, and synteny analyses. In the future, GBS can be used to accelerate cranberry molecular breeding through QTL mapping and genome-wide association studies (GWAS).
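
    The map statistics reported in this abstract can be turned into a few summary ratios. The arithmetic below is only a back-of-the-envelope sketch using the figures quoted above (4849 markers, 1112 cM, ~13 Mb anchored, 470 Mb estimated genome); the variable names are hypothetical, and the averages ignore how markers are actually distributed across the twelve linkage groups.

```python
# Figures taken from the abstract above.
mapped_markers = 4849
map_length_cm = 1112
anchored_mb = 13
genome_size_mb = 470

# Average marker density and spacing over the whole map.
markers_per_cm = mapped_markers / map_length_cm      # about 4.36 markers per cM
cm_per_marker = map_length_cm / mapped_markers       # about 0.23 cM per marker

# Share of the estimated genome covered by anchored scaffolds.
anchored_fraction = anchored_mb / genome_size_mb     # about 2.8 %

print(f"{markers_per_cm:.2f} markers/cM, {cm_per_marker:.3f} cM/marker")
print(f"anchored scaffolds cover {anchored_fraction:.1%} of the genome")
```

    Read this way, the map is dense in genetic distance even though the anchored scaffolds account for only a small share of the physical genome.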

  6. Whole-genome comparative analysis of three phytopathogenic Xylella fastidiosa strains.

    PubMed

    Bhattacharyya, Anamitra; Stilwagen, Stephanie; Ivanova, Natalia; D'Souza, Mark; Bernal, Axel; Lykidis, Athanasios; Kapatral, Vinayak; Anderson, Iain; Larsen, Niels; Los, Tamara; Reznik, Gary; Selkov, Eugene; Walunas, Theresa L; Feil, Helene; Feil, William S; Purcell, Alexander; Lassez, Jean-Louis; Hawkins, Trevor L; Haselkorn, Robert; Overbeek, Ross; Predki, Paul F; Kyrpides, Nikos C

    2002-09-17

    Xylella fastidiosa (Xf) causes wilt disease in plants and is responsible for major economic and crop losses globally. Owing to the public importance of this phytopathogen we embarked on a comparative analysis of the complete genome of Xf pv citrus and the partial genomes of two recently sequenced strains of this species: Xf pv almond and Xf pv oleander, which cause leaf scorch in almond and oleander plants, respectively. We report a reanalysis of the previously sequenced Xf 9a5c (CVC, citrus) strain and the two "gapped" Xf genomes revealing ORFs encoding critical functions in pathogenicity and conjugative transfer. Second, a detailed whole-genome functional comparison was based on the three sequenced Xf strains, identifying the unique genes present in each strain, in addition to those shared between strains. Third, an "in silico" cellular reconstruction of these organisms was made, based on a comparison of their core functional subsystems that led to a characterization of their conjugative transfer machinery, identification of potential differences in their adhesion mechanisms, and highlighting of the absence of a classical quorum-sensing mechanism. This study demonstrates the effectiveness of comparative analysis strategies in the interpretation of genomes that are closely related.

  7. First draft genome of an iconic clownfish species (Amphiprion frenatus).

    PubMed

    Marcionetti, Anna; Rossier, Victor; Bertrand, Joris A M; Litsios, Glenn; Salamin, Nicolas

    2018-02-17

    Clownfishes (or anemonefishes) form an iconic group of coral reef fishes, principally known for their mutualistic interaction with sea anemones. They are characterized by particular life history traits, such as a complex social structure and mating system involving sequential hermaphroditism, coupled with an exceptionally long lifespan. Additionally, clownfishes are considered to be one of the rare groups to have experienced an adaptive radiation in the marine environment. Here, we assembled and annotated the first genome of a clownfish species, the tomato clownfish (Amphiprion frenatus). We obtained 17,801 assembled scaffolds, containing a total of 26,917 genes. The completeness of the assembly and annotation was satisfactory, with 96.5% of the Actinopterygii Benchmarking Universal Single-Copy Orthologs (BUSCOs) being retrieved in the A. frenatus assembly. The quality of the resulting assembly is comparable to other bony fish assemblies. This resource is valuable for advancing studies of the particular life history traits of clownfishes, as well as being useful for population genetic studies and the development of new phylogenetic markers. It will also open the way to comparative genomics. Indeed, future genomic comparison among closely related fishes may provide means to identify genes related to the unique adaptations to different sea anemone hosts, as well as better characterize the genomic signatures of an adaptive radiation. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.

  8. Comparative pathogenomic characterization of a non-invasive serotype M71 strain Streptococcus pyogenes NS53 reveals incongruent phenotypic implications from distinct genotypic markers.

    PubMed

    Bao, Yun-Juan; Li, Yang; Liang, Zhong; Agrahari, Garima; Lee, Shaun W; Ploplis, Victoria A; Castellino, Francis J

    2017-07-31

    The strains serotyped as M71 from group A Streptococcus are common causes of pharyngeal and skin diseases worldwide. Here we characterize the genome of a unique non-invasive M71 human isolate, NS53. The genome does not contain structural rearrangements or large-scale gene gains/losses, but encodes a full set of non-truncated known virulence factors, thus providing an ideal reference for comparative studies. However, the NS53 genome showed incongruent phenotypic implications from distinct genotypic markers. NS53 is characterized as an emm pattern D and FCT (fibronectin-collagen-T antigen) type-3 strain, typical of skin tropic strains, but is phylogenetically close to emm pattern E strains with preference for both skin and pharyngeal infections. We propose that this incongruence could result from recombination within the emm gene locus, or, alternatively, selection has been against those genetic alterations. Combined with the inability to select for CovS switching, a process is indicated whereby NS53 has been pre-adapted to specific host niches selecting against variations in CovS and many other genes. This may allow the strain to attain successful colonization and long-term survival. A balance between genetic variations and fitness may exist for this bacterium to form a stabilized genome optimized for survival in specific host environments. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  9. Single-Cell Whole-Genome Amplification and Sequencing: Methodology and Applications.

    PubMed

    Huang, Lei; Ma, Fei; Chapman, Alec; Lu, Sijia; Xie, Xiaoliang Sunney

    2015-01-01

    We present a survey of single-cell whole-genome amplification (WGA) methods, including degenerate oligonucleotide-primed polymerase chain reaction (DOP-PCR), multiple displacement amplification (MDA), and multiple annealing and looping-based amplification cycles (MALBAC). The key parameters to characterize the performance of these methods are defined, including genome coverage, uniformity, reproducibility, unmappable rates, chimera rates, allele dropout rates, false positive rates for calling single-nucleotide variations, and ability to call copy-number variations. Using these parameters, we compare five commercial WGA kits by performing deep sequencing of multiple single cells. We also discuss several major applications of single-cell genomics, including studies of whole-genome de novo mutation rates, the early evolution of cancer genomes, circulating tumor cells (CTCs), meiotic recombination of germ cells, preimplantation genetic diagnosis (PGD), and preimplantation genomic screening (PGS) for in vitro-fertilized embryos.

  10. Comparative Genomics of the Genus Porphyromonas Identifies Adaptations for Heme Synthesis within the Prevalent Canine Oral Species Porphyromonas cangingivalis

    PubMed Central

    O’Flynn, Ciaran; Deusch, Oliver; Darling, Aaron E.; Eisen, Jonathan A.; Wallis, Corrin; Davis, Ian J.; Harris, Stephen J.

    2015-01-01

    Porphyromonads play an important role in human periodontal disease and recently have been shown to be highly prevalent in canine mouths. Porphyromonas cangingivalis is the most prevalent canine oral bacterial species in both plaque from healthy gingiva and plaque from dogs with early periodontitis. The ability of P. cangingivalis to flourish in the different environmental conditions characterized by these two states suggests a degree of metabolic flexibility. To characterize the genes responsible for this, the genomes of 32 isolates (including 18 newly sequenced and assembled) from 18 Porphyromonad species from dogs, humans, and other mammals were compared. Phylogenetic trees inferred using core genes largely matched previous findings; however, comparative genomic analysis identified several genes and pathways relating to heme synthesis that were present in P. cangingivalis but not in other Porphyromonads. Porphyromonas cangingivalis has a complete protoporphyrin IX synthesis pathway potentially allowing it to synthesize its own heme unlike pathogenic Porphyromonads such as Porphyromonas gingivalis that acquire heme predominantly from blood. Other pathway differences such as the ability to synthesize siroheme and vitamin B12 point to enhanced metabolic flexibility for P. cangingivalis, which may underlie its prevalence in the canine oral cavity. PMID:26568374

  11. Virion Architecture Unifies Globally Distributed Pleolipoviruses Infecting Halophilic Archaea

    PubMed Central

    Pietilä, Maija K.; Atanasova, Nina S.; Manole, Violeta; Liljeroos, Lassi; Butcher, Sarah J.; Oksanen, Hanna M.

    2012-01-01

    Our understanding of the third domain of life, Archaea, has greatly increased since its establishment some 20 years ago. The increasing information on archaea has also brought their viruses into the limelight. Today, about 100 archaeal viruses are known, which is a low number compared to the numbers of characterized bacterial or eukaryotic viruses. Here, we have performed a comparative biological and structural study of seven pleomorphic viruses infecting extremely halophilic archaea. The pleomorphic nature of this novel virion type was established by sedimentation analysis and cryo-electron microscopy. These nonlytic viruses form virions characterized by a lipid vesicle enclosing the genome, without any nucleoproteins. The viral lipids are unselectively acquired from host cell membranes. The virions contain two to three major structural proteins, which either are embedded in the membrane or form spikes distributed randomly on the external membrane surface. Thus, the most important step during virion assembly is most likely the interaction of the membrane proteins with the genome. The interaction can be driven by single-stranded or double-stranded DNA, resulting in the virions having similar architectures but different genome types. Based on our comparative study, these viruses probably form a novel group, which we define as pleolipoviruses. PMID:22357279

  12. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.

  13. Optical mapping reveals a large genetic inversion between two methicillin-resistant Staphylococcus aureus strains.

    PubMed

    Shukla, Sanjay K; Kislow, Jennifer; Briska, Adam; Henkhaus, John; Dykes, Colin

    2009-09-01

    Staphylococcus aureus is a highly versatile and evolving bacterium of great clinical importance. S. aureus can evolve by acquiring single nucleotide polymorphisms and mobile genetic elements and by recombination events. Identification and location of novel genomic elements in a bacterial genome are not straightforward, unless the whole genome is sequenced. Optical mapping is a new tool that creates a high-resolution, in situ ordered restriction map of a bacterial genome. These maps can be used to determine genomic organization and perform comparative genomics to identify genomic rearrangements, such as insertions, deletions, duplications, and inversions, compared to an in silico (virtual) restriction map of a known genome sequence. Using this technology, we report here the identification, approximate location, and characterization of a genetic inversion of approximately 500 kb of a DNA element between the NRS387 (USA800) and FPR3757 (USA300) strains. The presence of the inversion and location of its junction sites were confirmed by site-specific PCR and sequencing. At both the left and right junction sites in NRS387, an IS1181 element and a 73-bp sequence were identified as inverted repeats, which could explain the possible mechanism of the inversion event.
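
    The comparison of an optical map against an in silico restriction map, as described above, can be illustrated with a toy example. This is a simplified sketch and not the method used in the study: the sequences are invented, the function names are hypothetical, and cutting at the first base of the recognition site is an assumption made for simplicity. It shows the signature an inversion leaves when the breakpoints fall between restriction sites: the same fragment sizes appear, but in a different order.

```python
def cut_positions(seq, site):
    """0-based start index of every occurrence of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, site):
    """Ordered fragment lengths of a linear molecule cut at each site start."""
    bounds = [0] + cut_positions(seq, site) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def invert(seq, start, end):
    """Replace seq[start:end] with its reverse complement."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


SITE = "GAATTC"  # palindromic recognition sequence (EcoRI)

# Invented reference molecule with three sites.
reference = "AAAA" + SITE + "CC" + SITE + "TTTTTTTT" + SITE + "GG"
# Hypothetical strain carrying an inversion of positions 10..25.
inverted = invert(reference, 10, 26)

print(fragment_lengths(reference, SITE))  # [4, 8, 14, 8]
print(fragment_lengths(inverted, SITE))   # [4, 14, 8, 8]
```

    A real optical map also has sizing error and missing or extra cuts, so actual comparisons rely on alignment with tolerances rather than exact matching.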

  14. Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System

    PubMed Central

    Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

    2017-01-01

    Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-hand family, profilins, etc.) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species. PMID:28103252

  15. Deeper insight into the structure of the anaerobic digestion microbial community; the biogas microbiome database is expanded with 157 new genomes.

    PubMed

    Treu, Laura; Kougias, Panagiotis G; Campanaro, Stefano; Bassani, Ilaria; Angelidaki, Irini

    2016-09-01

    This research aimed to better characterize the biogas microbiome by means of high throughput metagenomic sequencing and to elucidate the core microbial consortium existing in biogas reactors independently of the operational conditions. Assembly of shotgun reads followed by an established binning strategy resulted in the largest extraction to date of microbial genomes involved in biogas-producing systems. Remarkably, the vast majority of the 236 extracted genome bins could only be characterized at high taxonomic levels. This result confirms that the biogas microbiome comprises a consortium of unknown species. A comparative analysis between the genome bins of the current study and those extracted from a previous metagenomic assembly demonstrated a similar phylogenetic distribution of the main taxa. Finally, this analysis led to the identification of a subset of common microbes that could be considered as the core essential group in biogas production. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Genome sequencing of ovine isolates of Mycobacterium avium subspecies paratuberculosis offers insights into host association

    PubMed Central

    2012-01-01

    Background: The genome of Mycobacterium avium subspecies paratuberculosis (MAP) is remarkably homogeneous among the genomes of bovine, human and wildlife isolates. However, previous work in our laboratories with the bovine K-10 strain has revealed substantial differences compared to sheep isolates. To systematically characterize all genomic differences that may be associated with the specific hosts, we sequenced the genomes of three U.S. sheep isolates and also obtained an optical map. Results: Our analysis of one of the isolates, MAP S397, revealed a genome 4.8 Mb in size with 4,700 open reading frames (ORFs). Comparative analysis of the MAP S397 isolate showed it acquired approximately 10 large sequence regions that are shared with the human M. avium subsp. hominissuis strain 104 and lost 2 large regions that are present in the bovine strain. In addition, optical mapping defined the presence of 7 large inversions between the bovine and ovine genomes (~ 2.36 Mb). Whole-genome sequencing of 2 additional sheep strains of MAP (JTC1074 and JTC7565) further confirmed genomic homogeneity of the sheep isolates despite the presence of polymorphisms on the nucleotide level. Conclusions: Comparative sequence analysis employed here provided a better understanding of the host association, evolution of members of the M. avium complex and could help in deciphering the phenotypic differences observed among sheep and cattle strains of MAP. A similar approach based on whole-genome sequencing combined with optical mapping could be employed to examine closely related pathogens. We propose an evolutionary scenario for M. avium complex strains based on these genome sequences. PMID:22409516

  17. Saccharomyces cerevisiae: gene annotation and genome variability, state of the art through comparative genomics.

    PubMed

    Louis, Ed

    2011-01-01

    In the early days of the yeast genome sequencing project, gene annotation was in its infancy and suffered the problem of many false positive annotations as well as missed genes. The lack of other sequences for comparison also prevented the annotation of conserved, functional sequences that were not coding. We are now in an era of comparative genomics where many closely related as well as more distantly related genomes are available for direct sequence and synteny comparisons allowing for more probable predictions of genes and other functional sequences due to conservation. We also have a plethora of functional genomics data which helps inform gene annotation for previously uncharacterised open reading frames (ORFs)/genes. For Saccharomyces cerevisiae this has resulted in a continuous updating of the gene and functional sequence annotations in the reference genome helping it retain its position as the best characterized eukaryotic organism's genome. A single reference genome for a species does not accurately describe the species and this is quite clear in the case of S. cerevisiae where the reference strain is not ideal for brewing or baking due to missing genes. Recent surveys of numerous isolates, from a variety of sources, using a variety of technologies have revealed a great deal of variation amongst isolates with genome sequence surveys providing information on novel genes, undetectable by other means. We now have a better understanding of the extant variation in S. cerevisiae as a species as well as some idea of how much we are missing from this understanding. As with gene annotation, comparative genomics enhances the discovery and description of genome variation and is providing us with the tools for understanding genome evolution, adaptation and selection, and underlying genetics of complex traits.

  18. Identification, characterization and expression analysis of lineage-specific genes within sweet orange (Citrus sinensis).

    PubMed

    Xu, Yuantao; Wu, Guizhi; Hao, Baohai; Chen, Lingling; Deng, Xiuxin; Xu, Qiang

    2015-11-23

    With the availability of a rapidly increasing number of genome and transcriptome sequences, lineage-specific genes (LSGs) can be identified and characterized. Like other conserved functional genes, LSGs play important roles in biological evolution and functions. Two sets of citrus LSGs, 296 citrus-specific genes (CSGs) and 1039 orphan genes specific to sweet orange, were identified by comparative analysis between the sweet orange genome sequences and 41 genomes and 273 transcriptomes. With the two sets of genes, gene structure and gene expression pattern were investigated. On average, both the CSGs and orphan genes have fewer exons, shorter gene length and higher GC content when compared with evolutionarily conserved genes (ECs). Expression profiling indicated that most of the LSGs were expressed in various tissues of sweet orange and some of them exhibited distinct temporal and spatial expression patterns. Particularly, the orphan genes were preferentially expressed in callus, which is an important pluripotent tissue of citrus. Besides, some of the CSGs and orphan genes were expressed in response to abiotic stress, indicating their potential functions during interaction with the environment. This study identified and characterized two sets of LSGs in citrus, dissected their sequence features and expression patterns, and provided valuable clues for future functional analysis of the LSGs in sweet orange.

  19. Genome Dynamics and Molecular Infection Epidemiology of Multidrug-Resistant Helicobacter pullorum Isolates Obtained from Broiler and Free-Range Chickens in India.

    PubMed

    Qumar, Shamsul; Majid, Mohammad; Kumar, Narender; Tiwari, Sumeet K; Semmler, Torsten; Devi, Savita; Baddam, Ramani; Hussain, Arif; Shaik, Sabiha; Ahmed, Niyaz

    2017-01-01

    Some life-threatening, foodborne, and zoonotic infections are transmitted through poultry birds. Inappropriate and indiscriminate use of antimicrobials in the livestock industry has led to an increased prevalence of multidrug-resistant bacteria with epidemic potential. Here, we present a functional molecular epidemiological analysis entailing the phenotypic and whole-genome sequence-based characterization of 11 H. pullorum isolates from broiler and free-range chickens sampled from retail wet markets in Hyderabad City, India. Antimicrobial susceptibility tests revealed all of the isolates to be resistant to multiple antibiotic classes such as fluoroquinolones, cephalosporins, sulfonamides, and macrolides. The isolates were also found to be extended-spectrum β-lactamase producers and were even resistant to clavulanic acid. Whole-genome sequencing and comparative genomic analysis of these isolates revealed the presence of five or six well-characterized antimicrobial resistance genes, including those encoding a resistance-nodulation-division efflux pump(s). Phylogenetic analysis combined with pan-genome analysis revealed a remarkable degree of genetic diversity among the isolates from free-range chickens; in contrast, a high degree of genetic similarity was observed among broiler chicken isolates. Comparative genomic analysis of all publicly available H. pullorum genomes, including our isolates (n = 16), together with the genomes of 17 other Helicobacter species, revealed a high number (8,560) of H. pullorum-specific protein-encoding genes, with an average of 535 such genes per isolate. In silico virulence screening identified 182 important virulence genes and also revealed high strain-specific gene content in isolates from free-range chickens (average, 34) compared to broiler chicken isolates. A significant prevalence of prophages (ranging from 1 to 9) and a significant presence of genomic islands (0 to 4) were observed in free-range and broiler chicken isolates. 
Taken together, these observations provide significant baseline data for functional molecular infection epidemiology of nonpyloric Helicobacter species such as H. pullorum by unraveling their evolution in chickens and their possible zoonotic transmission to humans. Globally, the poultry industry is expanding with an ever-growing consumer base for chicken meat. Given this, food-associated transmission of multidrug-resistant bacteria represents an important health care issue. Our study involves a critical baseline approach directed at genome sequence-based epidemiology and transmission dynamics of H. pullorum, a poultry pathogen having established zoonotic potential. We believe our studies would facilitate the development of surveillance systems that ensure the safety of food for humans and guide public health policies related to the use of antibiotics in animal feed in countries such as India. We sequenced 11 new genomes of H. pullorum as a part of this study. These genomes would provide much value in addition to the ongoing comparative genomic studies of helicobacters. Copyright © 2016 American Society for Microbiology.

  20. From genomics to functional markers in the era of next-generation sequencing.

    PubMed

    Salgotra, R K; Gupta, B B; Stewart, C N

    2014-03-01

    The availability of complete genome sequences, along with other genomic resources for Arabidopsis, rice, pigeon pea, soybean and other crops, has revolutionized our understanding of the genetic make-up of plants. Next-generation DNA sequencing (NGS) has facilitated single nucleotide polymorphism discovery in plants. Functionally-characterized sequences can be identified and functional markers (FMs) for important traits can be developed at an ever-increasing ease. FMs are derived from sequence polymorphisms found in allelic variants of a functional gene. Linkage disequilibrium-based association mapping and homologous recombinants have been developed for identification of "perfect" markers for their use in crop improvement practices. Compared with many other molecular markers, FMs derived from the functionally characterized sequence genes using NGS techniques and their use provide opportunities to develop high-yielding plant genotypes resistant to various stresses at a fast pace.

  1. Comparative high-throughput transcriptome sequencing and development of SiESTa, the Silene EST annotation database

    PubMed Central

    2011-01-01

    Background The genus Silene is widely used as a model system for addressing ecological and evolutionary questions in plants, but advances in using the genus as a model system are impeded by the lack of available resources for studying its genome. Massively parallel sequencing cDNA has recently developed into an efficient method for characterizing the transcriptomes of non-model organisms, generating massive amounts of data that enable the study of multiple species in a comparative framework. The sequences generated provide an excellent resource for identifying expressed genes, characterizing functional variation and developing molecular markers, thereby laying the foundations for future studies on gene sequence and gene expression divergence. Here, we report the results of a comparative transcriptome sequencing study of eight individuals representing four Silene and one Dianthus species as outgroup. All sequences and annotations have been deposited in a newly developed and publicly available database called SiESTa, the Silene EST annotation database. Results A total of 1,041,122 EST reads were generated in two runs on a Roche GS-FLX 454 pyrosequencing platform. EST reads were analyzed separately for all eight individuals sequenced and were assembled into contigs using TGICL. These were annotated with results from BLASTX searches and Gene Ontology (GO) terms, and thousands of single-nucleotide polymorphisms (SNPs) were characterized. Unassembled reads were kept as singletons and together with the contigs contributed to the unigenes characterized in each individual. The high quality of unigenes is evidenced by the proportion (49%) that have significant hits in similarity searches with the A. thaliana proteome. The SiESTa database is accessible at http://www.siesta.ethz.ch. 
Conclusion The sequence collections established in the present study provide an important genomic resource for four Silene and one Dianthus species and will help to further develop Silene as a plant model system. The genes characterized will be useful for future research not only in the species included in the present study, but also in related species for which no genomic resources are yet available. Our results demonstrate the efficiency of massively parallel transcriptome sequencing in a comparative framework as an approach for developing genomic resources in diverse groups of non-model organisms. PMID:21791039

  2. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert

    2011-10-02

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  3. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert

    2011-05-16

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  4. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication with interspecies gene similarities as high as 95%. However it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the proximal semi-genome. Of the 3680 open reading frames in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei while 128 nonhypothetical orfs were unique (non-paralogous) amongst these species including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14 gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei most closely approximates the ancestral organizational state.

  5. A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy.

    PubMed

    Brenton, Zachary W; Cooper, Elizabeth A; Myers, Mathew T; Boyles, Richard E; Shakoor, Nadia; Zielinski, Kelsey J; Rauh, Bradley L; Bridges, William C; Morris, Geoffrey P; Kresovich, Stephen

    2016-09-01

    With high productivity and stress tolerance, numerous grass genera of the Andropogoneae have emerged as candidates for bioenergy production. To optimize these candidates, research examining the genetic architecture of yield, carbon partitioning, and composition is required to advance breeding objectives. Significant progress has been made developing genetic and genomic resources for Andropogoneae, and advances in comparative and computational genomics have enabled research examining the genetic basis of photosynthesis, carbon partitioning, composition, and sink strength. To provide a pivotal resource aimed at developing a comparative understanding of key bioenergy traits in the Andropogoneae, we have established and characterized an association panel of 390 racially, geographically, and phenotypically diverse Sorghum bicolor accessions with 232,303 genetic markers. Sorghum bicolor was selected because of its genomic simplicity, phenotypic diversity, significant genomic tools, and its agricultural productivity and resilience. We have demonstrated the value of sorghum as a functional model for candidate gene discovery for bioenergy Andropogoneae by performing genome-wide association analysis for two contrasting phenotypes representing key components of structural and non-structural carbohydrates. We identified potential genes, including a cellulase enzyme and a vacuolar transporter, associated with increased non-structural carbohydrates that could lead to bioenergy sorghum improvement. Although our analysis identified genes with potentially clear functions, other candidates did not have assigned functions, suggesting novel molecular mechanisms for carbon partitioning traits. 
These results, combined with our characterization of phenotypic and genetic diversity and the public accessibility of each accession and genomic data, demonstrate the value of this resource and provide a foundation for future improvement of sorghum and related grasses for bioenergy production. Copyright © 2016 by the Genetics Society of America.

  6. Expanding the Diversity of Mycobacteriophages: Insights into Genome Architecture and Evolution

    PubMed Central

    Pope, Welkin H.; Jacobs-Sera, Deborah; Russell, Daniel A.; Peebles, Craig L.; Al-Atrache, Zein; Alcoser, Turi A.; Alexander, Lisa M.; Alfano, Matthew B.; Alford, Samantha T.; Amy, Nichols E.; Anderson, Marie D.; Anderson, Alexander G.; Ang, Andrew A. S.; Ares, Manuel; Barber, Amanda J.; Barker, Lucia P.; Barrett, Jonathan M.; Barshop, William D.; Bauerle, Cynthia M.; Bayles, Ian M.; Belfield, Katherine L.; Best, Aaron A.; Borjon, Agustin; Bowman, Charles A.; Boyer, Christine A.; Bradley, Kevin W.; Bradley, Victoria A.; Broadway, Lauren N.; Budwal, Keshav; Busby, Kayla N.; Campbell, Ian W.; Campbell, Anne M.; Carey, Alyssa; Caruso, Steven M.; Chew, Rebekah D.; Cockburn, Chelsea L.; Cohen, Lianne B.; Corajod, Jeffrey M.; Cresawn, Steven G.; Davis, Kimberly R.; Deng, Lisa; Denver, Dee R.; Dixon, Breyon R.; Ekram, Sahrish; Elgin, Sarah C. R.; Engelsen, Angela E.; English, Belle E. V.; Erb, Marcella L.; Estrada, Crystal; Filliger, Laura Z.; Findley, Ann M.; Forbes, Lauren; Forsyth, Mark H.; Fox, Tyler M.; Fritz, Melissa J.; Garcia, Roberto; George, Zindzi D.; Georges, Anne E.; Gissendanner, Christopher R.; Goff, Shannon; Goldstein, Rebecca; Gordon, Kobie C.; Green, Russell D.; Guerra, Stephanie L.; Guiney-Olsen, Krysta R.; Guiza, Bridget G.; Haghighat, Leila; Hagopian, Garrett V.; Harmon, Catherine J.; Harmson, Jeremy S.; Hartzog, Grant A.; Harvey, Samuel E.; He, Siping; He, Kevin J.; Healy, Kaitlin E.; Higinbotham, Ellen R.; Hildebrandt, Erin N.; Ho, Jason H.; Hogan, Gina M.; Hohenstein, Victoria G.; Holz, Nathan A.; Huang, Vincent J.; Hufford, Ericka L.; Hynes, Peter M.; Jackson, Arrykka S.; Jansen, Erica C.; Jarvik, Jonathan; Jasinto, Paul G.; Jordan, Tuajuanda C.; Kasza, Tomas; Katelyn, Murray A.; Kelsey, Jessica S.; Kerrigan, Larisa A.; Khaw, Daryl; Kim, Junghee; Knutter, Justin Z.; Ko, Ching-Chung; Larkin, Gail V.; Laroche, Jennifer R.; Latif, Asma; Leuba, Kohana D.; Leuba, Sequoia I.; Lewis, Lynn O.; Loesser-Casey, Kathryn E.; Long, Courtney A.; Lopez, A. 
Javier; Lowery, Nicholas; Lu, Tina Q.; Mac, Victor; Masters, Isaac R.; McCloud, Jazmyn J.; McDonough, Molly J.; Medenbach, Andrew J.; Menon, Anjali; Miller, Rachel; Morgan, Brandon K.; Ng, Patrick C.; Nguyen, Elvis; Nguyen, Katrina T.; Nguyen, Emilie T.; Nicholson, Kaylee M.; Parnell, Lindsay A.; Peirce, Caitlin E.; Perz, Allison M.; Peterson, Luke J.; Pferdehirt, Rachel E.; Philip, Seegren V.; Pogliano, Kit; Pogliano, Joe; Polley, Tamsen; Puopolo, Erica J.; Rabinowitz, Hannah S.; Resiss, Michael J.; Rhyan, Corwin N.; Robinson, Yetta M.; Rodriguez, Lauren L.; Rose, Andrew C.; Rubin, Jeffrey D.; Ruby, Jessica A.; Saha, Margaret S.; Sandoz, James W.; Savitskaya, Judith; Schipper, Dale J.; Schnitzler, Christine E.; Schott, Amanda R.; Segal, J. Bradley; Shaffer, Christopher D.; Sheldon, Kathryn E.; Shepard, Erica M.; Shepardson, Jonathan W.; Shroff, Madav K.; Simmons, Jessica M.; Simms, Erika F.; Simpson, Brandy M.; Sinclair, Kathryn M.; Sjoholm, Robert L.; Slette, Ingrid J.; Spaulding, Blaire C.; Straub, Clark L.; Stukey, Joseph; Sughrue, Trevor; Tang, Tin-Yun; Tatyana, Lyons M.; Taylor, Stephen B.; Taylor, Barbara J.; Temple, Louise M.; Thompson, Jasper V.; Tokarz, Michael P.; Trapani, Stephanie E.; Troum, Alexander P.; Tsay, Jonathan; Tubbs, Anthony T.; Walton, Jillian M.; Wang, Danielle H.; Wang, Hannah; Warner, John R.; Weisser, Emilie G.; Wendler, Samantha C.; Weston-Hafer, Kathleen A.; Whelan, Hilary M.; Williamson, Kurt E.; Willis, Angelica N.; Wirtshafter, Hannah S.; Wong, Theresa W.; Wu, Phillip; Yang, Yun jeong; Yee, Brandon C.; Zaidins, David A.; Zhang, Bo; Zúniga, Melina Y.; Hendrix, Roger W.; Hatfull, Graham F.

    2011-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we report the isolation, sequencing and comparative genomic analysis of 18 new mycobacteriophages isolated from geographically distinct locations within the United States. Although no clear correlation between location and genome type can be discerned, these genomes expand our knowledge of mycobacteriophage diversity and enhance our understanding of the roles of mobile elements in viral evolution. Expansion of the number of mycobacteriophages grouped within Cluster A provides insights into the basis of immune specificity in these temperate phages, and we also describe a novel example of apparent immunity theft. The isolation and genomic analysis of bacteriophages by freshman college students provides an example of an authentic research experience for novice scientists. PMID:21298013

  7. Expanding the diversity of mycobacteriophages: insights into genome architecture and evolution.

    PubMed

    Pope, Welkin H; Jacobs-Sera, Deborah; Russell, Daniel A; Peebles, Craig L; Al-Atrache, Zein; Alcoser, Turi A; Alexander, Lisa M; Alfano, Matthew B; Alford, Samantha T; Amy, Nichols E; Anderson, Marie D; Anderson, Alexander G; Ang, Andrew A S; Ares, Manuel; Barber, Amanda J; Barker, Lucia P; Barrett, Jonathan M; Barshop, William D; Bauerle, Cynthia M; Bayles, Ian M; Belfield, Katherine L; Best, Aaron A; Borjon, Agustin; Bowman, Charles A; Boyer, Christine A; Bradley, Kevin W; Bradley, Victoria A; Broadway, Lauren N; Budwal, Keshav; Busby, Kayla N; Campbell, Ian W; Campbell, Anne M; Carey, Alyssa; Caruso, Steven M; Chew, Rebekah D; Cockburn, Chelsea L; Cohen, Lianne B; Corajod, Jeffrey M; Cresawn, Steven G; Davis, Kimberly R; Deng, Lisa; Denver, Dee R; Dixon, Breyon R; Ekram, Sahrish; Elgin, Sarah C R; Engelsen, Angela E; English, Belle E V; Erb, Marcella L; Estrada, Crystal; Filliger, Laura Z; Findley, Ann M; Forbes, Lauren; Forsyth, Mark H; Fox, Tyler M; Fritz, Melissa J; Garcia, Roberto; George, Zindzi D; Georges, Anne E; Gissendanner, Christopher R; Goff, Shannon; Goldstein, Rebecca; Gordon, Kobie C; Green, Russell D; Guerra, Stephanie L; Guiney-Olsen, Krysta R; Guiza, Bridget G; Haghighat, Leila; Hagopian, Garrett V; Harmon, Catherine J; Harmson, Jeremy S; Hartzog, Grant A; Harvey, Samuel E; He, Siping; He, Kevin J; Healy, Kaitlin E; Higinbotham, Ellen R; Hildebrandt, Erin N; Ho, Jason H; Hogan, Gina M; Hohenstein, Victoria G; Holz, Nathan A; Huang, Vincent J; Hufford, Ericka L; Hynes, Peter M; Jackson, Arrykka S; Jansen, Erica C; Jarvik, Jonathan; Jasinto, Paul G; Jordan, Tuajuanda C; Kasza, Tomas; Katelyn, Murray A; Kelsey, Jessica S; Kerrigan, Larisa A; Khaw, Daryl; Kim, Junghee; Knutter, Justin Z; Ko, Ching-Chung; Larkin, Gail V; Laroche, Jennifer R; Latif, Asma; Leuba, Kohana D; Leuba, Sequoia I; Lewis, Lynn O; Loesser-Casey, Kathryn E; Long, Courtney A; Lopez, A Javier; Lowery, Nicholas; Lu, Tina Q; Mac, Victor; Masters, Isaac R; McCloud, Jazmyn J; 
McDonough, Molly J; Medenbach, Andrew J; Menon, Anjali; Miller, Rachel; Morgan, Brandon K; Ng, Patrick C; Nguyen, Elvis; Nguyen, Katrina T; Nguyen, Emilie T; Nicholson, Kaylee M; Parnell, Lindsay A; Peirce, Caitlin E; Perz, Allison M; Peterson, Luke J; Pferdehirt, Rachel E; Philip, Seegren V; Pogliano, Kit; Pogliano, Joe; Polley, Tamsen; Puopolo, Erica J; Rabinowitz, Hannah S; Resiss, Michael J; Rhyan, Corwin N; Robinson, Yetta M; Rodriguez, Lauren L; Rose, Andrew C; Rubin, Jeffrey D; Ruby, Jessica A; Saha, Margaret S; Sandoz, James W; Savitskaya, Judith; Schipper, Dale J; Schnitzler, Christine E; Schott, Amanda R; Segal, J Bradley; Shaffer, Christopher D; Sheldon, Kathryn E; Shepard, Erica M; Shepardson, Jonathan W; Shroff, Madav K; Simmons, Jessica M; Simms, Erika F; Simpson, Brandy M; Sinclair, Kathryn M; Sjoholm, Robert L; Slette, Ingrid J; Spaulding, Blaire C; Straub, Clark L; Stukey, Joseph; Sughrue, Trevor; Tang, Tin-Yun; Tatyana, Lyons M; Taylor, Stephen B; Taylor, Barbara J; Temple, Louise M; Thompson, Jasper V; Tokarz, Michael P; Trapani, Stephanie E; Troum, Alexander P; Tsay, Jonathan; Tubbs, Anthony T; Walton, Jillian M; Wang, Danielle H; Wang, Hannah; Warner, John R; Weisser, Emilie G; Wendler, Samantha C; Weston-Hafer, Kathleen A; Whelan, Hilary M; Williamson, Kurt E; Willis, Angelica N; Wirtshafter, Hannah S; Wong, Theresa W; Wu, Phillip; Yang, Yun jeong; Yee, Brandon C; Zaidins, David A; Zhang, Bo; Zúniga, Melina Y; Hendrix, Roger W; Hatfull, Graham F

    2011-01-27

    Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we report the isolation, sequencing and comparative genomic analysis of 18 new mycobacteriophages isolated from geographically distinct locations within the United States. Although no clear correlation between location and genome type can be discerned, these genomes expand our knowledge of mycobacteriophage diversity and enhance our understanding of the roles of mobile elements in viral evolution. Expansion of the number of mycobacteriophages grouped within Cluster A provides insights into the basis of immune specificity in these temperate phages, and we also describe a novel example of apparent immunity theft. The isolation and genomic analysis of bacteriophages by freshman college students provides an example of an authentic research experience for novice scientists.

  8. An in silico model for identification of small RNAs in whole bacterial genomes: characterization of antisense RNAs in pathogenic Escherichia coli and Streptococcus agalactiae strains.

    PubMed

    Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal

    2012-04-01

    Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed theoretically but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of the virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli.

  9. An in silico model for identification of small RNAs in whole bacterial genomes: characterization of antisense RNAs in pathogenic Escherichia coli and Streptococcus agalactiae strains

    PubMed Central

    Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal

    2012-01-01

    Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed theoretically but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of the virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli. PMID:22139924

  10. Draft Whole-Genome Sequence of Bacillus altitudinis Strain B-388, a Producer of Extracellular RNase.

    PubMed

    Shah Mahmud, Raihan; Ulyanova, Vera; Malanin, Sergey; Dudkina, Elena; Vershinina, Valentina; Ilinskaya, Olga

    2015-01-29

    Here, we present a draft genome sequence of Bacillus altitudinis strain B-388, including a putative plasmid. The strain was isolated from the intestine of Indian meal moth, a common pest of stored grains, and it is characterized by the production of extracellular RNase, similar to binase, which is of interest for comparative studies and biotechnology. Copyright © 2015 Shah Mahmud et al.

  11. Whole-Genome Sequencing of Lactobacillus salivarius Strains BCRC 14759 and BCRC 12574

    PubMed Central

    Chiu, Shih-Hau; Wang, Li-Ting; Huang, Lina

    2017-01-01

    Lactobacillus salivarius BCRC 14759 has been identified as a high-exopolysaccharide-producing strain with potential as a probiotic or fermented dairy product. Here, we report the genome sequences of L. salivarius BCRC 14759 and the comparable strain BCRC 12574, isolated from human saliva. The PacBio RSII sequencing platform was used to obtain high-quality assemblies for characterization of this probiotic candidate. PMID:29167259

  12. Comparative analysis of genome maintenance genes in naked mole rat, mouse, and human.

    PubMed

    MacRae, Sheila L; Zhang, Quanwei; Lemetre, Christophe; Seim, Inge; Calder, Robert B; Hoeijmakers, Jan; Suh, Yousin; Gladyshev, Vadim N; Seluanov, Andrei; Gorbunova, Vera; Vijg, Jan; Zhang, Zhengdong D

    2015-04-01

    Genome maintenance (GM) is an essential defense system against aging and cancer, as both are characterized by increased genome instability. Here, we compared the copy number variation and mutation rate of 518 GM-associated genes in the naked mole rat (NMR), mouse, and human genomes. GM genes appeared to be strongly conserved, with copy number variation in only four genes. Interestingly, we found NMR to have a higher copy number of CEBPG, a regulator of DNA repair, and TINF2, a protector of telomere integrity. NMR, as well as human, was also found to have a lower rate of germline nucleotide substitution than the mouse. Together, the data suggest that the long-lived NMR, as well as human, has more robust GM than mouse and identify new targets for the analysis of the exceptional longevity of the NMR. © 2015 The Authors. Aging Cell published by the Anatomical Society and John Wiley & Sons Ltd.

  13. Global mapping of transposon location.

    PubMed

    Gabriel, Abram; Dapprich, Johannes; Kunkel, Mark; Gresham, David; Pratt, Stephen C; Dunham, Maitreya J

    2006-12-15

    Transposable genetic elements are ubiquitous, yet their presence or absence at any given position within a genome can vary between individual cells, tissues, or strains. Transposable elements have profound impacts on host genomes by altering gene expression, assisting in genomic rearrangements, causing insertional mutations, and serving as sources of phenotypic variation. Characterizing a genome's full complement of transposons requires whole genome sequencing, precluding simple studies of the impact of transposition on interindividual variation. Here, we describe a global mapping approach for identifying transposon locations in any genome, using a combination of transposon-specific DNA extraction and microarray-based comparative hybridization analysis. We use this approach to map the repertoire of endogenous transposons in different laboratory strains of Saccharomyces cerevisiae and demonstrate that transposons are a source of extensive genomic variation. We also apply this method to mapping bacterial transposon insertion sites in a yeast genomic library. This unique whole genome view of transposon location will facilitate our exploration of transposon dynamics, as well as defining bases for individual differences and adaptive potential.

  14. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus.

    PubMed

    Li, Fagen; Zhou, Changpin; Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.

  15. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus

    PubMed Central

    Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430

  16. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species

    PubMed Central

    Wang, Jing; Street, Nathaniel R.; Scofield, Douglas G.; Ingvarsson, Pär K.

    2016-01-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. PMID:26721855

  17. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    PubMed

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  18. Comparative genomic analysis by microbial COGs self-attraction rate.

    PubMed

    Santoni, Daniele; Romano-Spica, Vincenzo

    2009-06-21

    Whole genome analysis provides new perspectives to determine phylogenetic relationships among microorganisms. The availability of whole nucleotide sequences allows different levels of comparison among genomes by several approaches. In this work, self-attraction rates were considered for each cluster of orthologous groups of proteins (COGs) class in order to analyse gene aggregation levels in physical maps. Phylogenetic relationships among microorganisms were obtained by comparing self-attraction coefficients. Eighteen-dimensional vectors were computed for a set of 168 completely sequenced microbial genomes (19 archaea, 149 bacteria). The components of the vector represent the aggregation rate of the genes belonging to each of 18 COGs classes. Genes involved in nonessential functions or related to environmental conditions showed the highest aggregation rates. In contrast, genes involved in basic cellular tasks showed a more uniform distribution along the genome, except for translation genes. Self-attraction clustering approach allowed classification of Proteobacteria, Bacilli and other species belonging to Firmicutes. Rearrangement and Lateral Gene Transfer events may influence divergences from classical taxonomy. Each set of COG classes' aggregation values represents an intrinsic property of the microbial genome. This novel approach provides a new point of view for whole genome analysis and bacterial characterization.

  19. Comparative Genomic and Transcriptomic Characterization of the Toxigenic Marine Dinoflagellate Alexandrium ostenfeldii

    PubMed Central

    Jaeckisch, Nina; Yang, Ines; Wohlrab, Sylke; Glöckner, Gernot; Kroymann, Juergen; Vogel, Heiko; Cembella, Allan; John, Uwe

    2011-01-01

    Many dinoflagellate species are notorious for the toxins they produce and ecological and human health consequences associated with harmful algal blooms (HABs). Dinoflagellates are particularly refractory to genomic analysis due to the enormous genome size, lack of knowledge about their DNA composition and structure, and peculiarities of gene regulation, such as spliced leader (SL) trans-splicing and mRNA transposition mechanisms. Alexandrium ostenfeldii is known to produce macrocyclic imine toxins, described as spirolides. We characterized the genome of A. ostenfeldii using a combination of transcriptomic data and random genomic clones for comparison with other dinoflagellates, particularly Alexandrium species. Examination of SL sequences revealed similar features as in other dinoflagellates, including Alexandrium species. SL sequences in decay indicate frequent retro-transposition of mRNA species. This probably contributes to overall genome complexity by generating additional gene copies. Sequencing of several thousand fosmid and bacterial artificial chromosome (BAC) ends yielded a wealth of simple repeats and tandemly repeated longer sequence stretches which we estimated to comprise more than half of the whole genome. Surprisingly, the repeats comprise a very limited set of 79–97 bp sequences; in part the genome is thus a relatively uniform sequence space interrupted by coding sequences. Our genomic sequence survey (GSS) represents the largest genomic data set of a dinoflagellate to date. Alexandrium ostenfeldii is a typical dinoflagellate with respect to its transcriptome and mRNA transposition but demonstrates Alexandrium-like stop codon usage. The large portion of repetitive sequences and the organization within the genome is in agreement with several other studies on dinoflagellates using different approaches. 
    It remains to be determined whether this unusual composition is directly correlated to the exceptional genome organization of dinoflagellates with a low amount of histones and histone-like proteins. PMID:22164224

  20. Emerging patterns of somatic mutations in cancer

    PubMed Central

    Watson, Ian R.; Takahashi, Koichi; Futreal, P. Andrew; Chin, Lynda

    2014-01-01

    The advance in technological tools for massively parallel, high-throughput sequencing of DNA has enabled the comprehensive characterization of somatic mutations in large numbers of tumor samples. Here, we review recent cancer genomic studies that have assembled emerging views of the landscapes of somatic mutations through deep sequencing analyses of the coding exomes and whole genomes in various cancer types. We discuss the comparative genomics of different cancers, including mutation rates, spectrums, and roles of environmental insults that influence these processes. We highlight the developing statistical approaches used to identify significantly mutated genes, and discuss the emerging biological and clinical insights from such analyses as well as the challenges ahead translating these genomic data into clinical impacts. PMID:24022702

  1. Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula

    PubMed Central

    Macas, Jiří; Neumann, Pavel; Navrátilová, Alice

    2007-01-01

    Background Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum). Results Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula. Conclusion We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35–48% of the genome. 
    These data provide a starting point for further investigations of legume plant genomes based on their global comparative analysis and for the development of more sophisticated approaches for data mining. PMID:18031571
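
    As a rough illustration of the coverage figure quoted in this abstract (33.3 Mb of sequence corresponding to 0.77% coverage), the genome size those two numbers imply can be back-calculated. This is only a sketch: the function name is hypothetical, and it assumes coverage is simply sequenced bases divided by genome size.

```python
def implied_genome_size_mb(sequenced_mb, coverage_fraction):
    """Genome size (Mb) implied by a sequencing yield and a coverage fraction.

    Assumes coverage = sequenced bases / genome size (an illustrative
    simplification, not taken from the paper's methods).
    """
    if coverage_fraction <= 0:
        raise ValueError("coverage_fraction must be positive")
    return sequenced_mb / coverage_fraction


# Figures from the abstract: 33.3 Mb of reads at 0.77% coverage.
size_mb = implied_genome_size_mb(33.3, 0.0077)
print(f"Implied genome size: about {size_mb:.0f} Mb")
```

    Under this assumption the two figures imply a genome on the order of 4,300 Mb.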

  2. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison

    PubMed Central

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-01-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features. PMID:15215464

  3. Comparative inference of duplicated genes produced by polyploidization in soybean genome.

    PubMed

    Yang, Yanmei; Wang, Jinpeng; Di, Jianyong

    2013-01-01

    Soybean (Glycine max) is one of the most important crop plants for providing protein and oil. It is important to investigate the soybean genome for its economic and scientific value. Polyploidy is a widespread and recursive phenomenon during plant evolution, and it can generate massive numbers of duplicated genes, which are an important resource for genetic innovation. Improved sequence alignment criteria and statistical analysis are used to identify and characterize duplicated genes produced by polyploidization in soybean. Based on the collinearity method, duplicated genes by whole genome duplication account for 70.3% in soybean. From the statistical analysis of the molecular distances between duplicated genes, our study indicates that the whole genome duplication event occurred more than once in the genome evolution of soybean, which is often distributed near the ends of chromosomes.

  4. Genetic screens and functional genomics using CRISPR/Cas9 technology.

    PubMed

    Hartenian, Ella; Doench, John G

    2015-04-01

    Functional genomics attempts to understand the genome by perturbing the flow of information from DNA to RNA to protein, in order to learn how gene dysfunction leads to disease. CRISPR/Cas9 technology is the newest tool in the geneticist's toolbox, allowing researchers to edit DNA with unprecedented ease, speed and accuracy, and representing a novel means to perform genome-wide genetic screens to discover gene function. In this review, we first summarize the discovery and characterization of CRISPR/Cas9, and then compare it to other genome engineering technologies. We discuss its initial use in screening applications, with a focus on optimizing on-target activity and minimizing off-target effects. Finally, we comment on future challenges and opportunities afforded by this technology. © 2015 FEBS.

  5. Complete genome sequence of Campylobacter concisus ATCC 33237T and draft genome sequences for an additional eight well-characterized C. concisus strains

    USDA-ARS's Scientific Manuscript database

    This report includes the complete genome of the Campylobacter concisus type strain ATCC 33237T and the draft genomes of eight additional well characterized C. concisus genomes. C. concisus has been shown to be a genetically heterogeneous species and these nine genomes provide valuable information re...

  6. Genome-wide divergence, haplotype distribution and population demographic histories for Gossypium hirsutum and Gossypium barbadense as revealed by genome-anchored SNPs

    PubMed Central

    Reddy, Umesh K.; Nimmakayala, Padma; Abburi, Venkata Lakshmi; Reddy, C. V. C. M.; Saminathan, Thangasamy; Percy, Richard G.; Yu, John Z.; Frelichowski, James; Udall, Joshua A.; Page, Justin T.; Zhang, Dong; Shehzad, Tariq; Paterson, Andrew H.

    2017-01-01

    Use of 10,129 singleton SNPs of known genomic location in tetraploid cotton provided unique opportunities to characterize genome-wide diversity among 440 Gossypium hirsutum and 219 G. barbadense cultivars and landrace accessions of widespread origin. Using the SNPs distributed genome-wide, we examined genetic diversity, haplotype distribution and linkage disequilibrium patterns in the G. hirsutum and G. barbadense genomes to clarify population demographic history. Diversity and identity-by-state analyses have revealed little sharing of alleles between the two cultivated allotetraploid genomes, with a few exceptions that indicated sporadic gene flow. We found a high number of new alleles, representing increased nucleotide diversity, on chromosomes 1 and 2 in cultivated G. hirsutum as compared with low nucleotide diversity on these chromosomes in landrace G. hirsutum. In contrast, G. barbadense showed negative Tajima’s D on several chromosomes for both cultivated and landrace types, which indicates that speciation of G. barbadense itself might have occurred with relatively narrow genetic diversity. The presence of conserved linkage disequilibrium (LD) blocks and haplotypes between G. hirsutum and G. barbadense provides strong evidence for comparable patterns of evolution in their domestication processes. Our study illustrates the potential use of population genetic techniques to identify genomic regions for domestication. PMID:28128280

  7. Proteogenomic characterization of human colon and rectal cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Bing; Wang, Jing; Wang, Xiaojing

    2014-09-18

    We analyzed proteomes of colon and rectal tumors previously characterized by the Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Protein sequence variants encoded by somatic genomic variations displayed reduced expression compared to protein variants encoded by germline variations. mRNA transcript abundance did not reliably predict protein expression differences between tumors. Proteomics identified five protein expression subtypes, two of which were associated with the TCGA "MSI/CIMP" transcriptional subtype, but had distinct mutation and methylation patterns and associated with different clinical outcomes. Although CNAs showed strong cis- and trans-effects on mRNA expression, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. Our analyses identified HNF4A, a novel candidate driver gene in tumors with chromosome 20q amplifications. Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords novel insights into cancer biology.

  8. Proteins of Unknown Biochemical Function: A Persistent Problem and a Roadmap to Help Overcome It.

    PubMed

    Niehaus, Thomas D; Thamm, Antje M K; de Crécy-Lagard, Valérie; Hanson, Andrew D

    2015-11-01

    The number of sequenced genomes is rapidly increasing, but functional annotation of the genes in these genomes lags far behind. Even in Arabidopsis (Arabidopsis thaliana), only approximately 40% of enzyme- and transporter-encoding genes have credible functional annotations, and this number is even lower in nonmodel plants. Functional characterization of unknown genes is a challenge, but various databases (e.g. for protein localization and coexpression) can be mined to provide clues. If homologous microbial genes exist (and about one-half the genes encoding unknown enzymes and transporters in Arabidopsis have microbial homologs), cross-kingdom comparative genomics can powerfully complement plant-based data. Multiple lines of evidence can strengthen predictions and warrant experimental characterization. In some cases, relatively quick tests in genetically tractable microbes can determine whether a prediction merits biochemical validation, which is costly and demands specialized skills. © 2015 American Society of Plant Biologists. All Rights Reserved.

  9. Whole-Genome Sequencing of Lactobacillus salivarius Strains BCRC 14759 and BCRC 12574.

    PubMed

    Chiu, Shih-Hau; Chen, Chien-Chi; Wang, Li-Ting; Huang, Lina

    2017-11-22

    Lactobacillus salivarius BCRC 14759 has been identified as a high-exopolysaccharide-producing strain with potential as a probiotic or fermented dairy product. Here, we report the genome sequences of L. salivarius BCRC 14759 and the comparable strain BCRC 12574, isolated from human saliva. The PacBio RSII sequencing platform was used to obtain high-quality assemblies for characterization of this probiotic candidate. Copyright © 2017 Chiu et al.

  10. Direct Capture Technologies for Genomics-Guided Discovery of Natural Products.

    PubMed

    Chan, Andrew N; Santa Maria, Kevin C; Li, Bo

    2016-01-01

    Microbes are important producers of natural products, which have played key roles in understanding biology and treating disease. However, the full potential of microbes to produce natural products has yet to be realized; the overwhelming majority of natural product gene clusters encoded in microbial genomes remain "cryptic", and have not been expressed or characterized. In contrast to the fast-growing number of genomic sequences and bioinformatic tools, methods to connect these genes to natural product molecules are still limited, creating a bottleneck in genome-mining efforts to discover novel natural products. Here we review developing technologies that leverage the power of homologous recombination to directly capture natural product gene clusters and express them in model hosts for isolation and structural characterization. Although direct capture is still in its early stages of development, it has been successfully utilized in several different classes of natural products. These early successes will be reviewed, and the methods will be compared and contrasted with existing traditional technologies. Lastly, we will discuss the opportunities for the development of direct capture in other organisms, and possibilities to integrate direct capture with emerging genome-editing techniques to accelerate future study of natural products.

  11. The Complete Genome Sequence of Hyperthermophile Dictyoglomus turgidum DSM 6724™ Reveals a Specialized Carbohydrate Fermentor

    PubMed Central

    Brumm, Phillip J.; Gowda, Krishne; Robb, Frank T.; Mead, David A.

    2016-01-01

    Here we report the complete genome sequence of the chemoorganotrophic, extremely thermophilic bacterium, Dictyoglomus turgidum, which is a Gram negative, strictly anaerobic bacterium. D. turgidum and D. thermophilum together form the Dictyoglomi phylum. The two Dictyoglomus genomes are highly syntenic, and both are distantly related to Caldicellulosiruptor spp. D. turgidum is able to grow on a wide variety of polysaccharide substrates due to significant genomic commitment to glycosyl hydrolases, 16 of which were cloned and expressed in our study. The GH5, GH10, and GH42 enzymes characterized in this study suggest that D. turgidum can utilize most plant-based polysaccharides except crystalline cellulose. The DNA polymerase I enzyme was also expressed and characterized. The pure enzyme showed improved amplification of long PCR targets compared to Taq polymerase. The genome contains a full complement of DNA modifying enzymes, and an unusually high copy number (4) of a new, ancestral family of polB type nucleotidyltransferases designated as MNT (minimal nucleotidyltransferases). Considering its optimal growth at 72°C, D. turgidum has an anomalously low G+C content of 39.9% that may account for the presence of reverse gyrase, usually associated with hyperthermophiles. PMID:28066333
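
    The G+C content cited in this abstract (39.9%) is a simple base-composition statistic. A minimal sketch of how such a value is typically computed from a nucleotide sequence follows; the function name and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not details from the paper.

```python
def gc_content_percent(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) in seq.

    Ambiguous symbols (e.g. N) are ignored; this is an illustrative
    convention, not one stated in the abstract.
    """
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * gc / total


# Toy sequence, not real Dictyoglomus data.
print(gc_content_percent("ATGCATNNAT"))
```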

  12. Minireview: DNA Replication in Plant Mitochondria

    PubMed Central

    Cupp, John D.; Nielsen, Brent L.

    2014-01-01

    Higher plant mitochondrial genomes exhibit much greater structural complexity as compared to most other organisms. Unlike well-characterized metazoan mitochondrial DNA (mtDNA) replication, the mechanism(s) and proteins involved in plant mtDNA replication remain unclear. Several plant mtDNA replication proteins, including DNA polymerases, DNA primase/helicase, and accessory proteins have been identified. Mitochondrial dynamics, genome structure, and the complexity of dual-targeted and dual-function proteins that provide at least partial redundancy suggest that plants have a unique model for maintaining and replicating mtDNA when compared to the replication mechanism utilized by most metazoan organisms. PMID:24681310

  13. Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.

    PubMed

    Abe, Takashi; Hamano, Yuta; Ikemura, Toshimichi

    2014-01-01

    A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.
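
    The oligonucleotide compositions that this abstract describes as the basis for alignment-free clustering can be illustrated with a short sketch that turns a sequence fragment into a tetranucleotide frequency vector. This is not the BLSOM method itself, only a plausible form of its input; the function name and the handling of ambiguous bases are assumptions.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Relative frequency of every A/C/G/T k-mer in seq (overlapping windows).

    Windows containing other symbols (e.g. N) are skipped. Returns a dict
    with 4**k entries, so fragments can be compared as fixed-length vectors.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return {km: 0.0 for km in kmers}
    return {km: counts[km] / total for km in kmers}


# Toy fragment: windows are ACGT, CGTA, GTAC, TACG, ACGT.
freqs = kmer_frequencies("ACGTACGT", k=4)
print(len(freqs), freqs["ACGT"])
```

    Vectors of this kind, one per sequence fragment, could then be passed to any clustering method; the self-organizing map step described in the abstract is not reproduced here.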

  14. Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome

    PubMed Central

    Hamilton, Eileen P; Kapusta, Aurélie; Huvos, Piroska E; Bidwell, Shelby L; Zafar, Nikhat; Tang, Haibao; Hadjithomas, Michalis; Krishnakumar, Vivek; Badger, Jonathan H; Caler, Elisabet V; Russ, Carsten; Zeng, Qiandong; Fan, Lin; Levin, Joshua Z; Shea, Terrance; Young, Sarah K; Hegarty, Ryan; Daza, Riza; Gujja, Sharvari; Wortman, Jennifer R; Birren, Bruce W; Nusbaum, Chad; Thomas, Jainy; Carey, Clayton M; Pritham, Ellen J; Feschotte, Cédric; Noto, Tomoko; Mochizuki, Kazufumi; Papazyan, Romeo; Taverna, Sean D; Dear, Paul H; Cassidy-Hanley, Donna M; Xiong, Jie; Miao, Wei; Orias, Eduardo; Coyne, Robert S

    2016-01-01

    The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena’s germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum. DOI: http://dx.doi.org/10.7554/eLife.19090.001 PMID:27892853

  15. Microbial genome analysis: the COG approach.

    PubMed

    Galperin, Michael Y; Kristensen, David M; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

    2017-09-14

    For the past 20 years, the Clusters of Orthologous Genes (COG) database has been a popular tool for microbial genome annotation and comparative genomics. Initially created for the purpose of evolutionary classification of protein families, the COGs have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.

  16. Gain-of-function mutagenesis approaches in rice for functional genomics and improvement of crop productivity.

    PubMed

    Moin, Mazahar; Bakshi, Achala; Saha, Anusree; Dutta, Mouboni; Kirti, P B

    2017-07-01

    The epitome of any genome research is to identify all the existing genes in a genome and investigate their roles. Various techniques have been applied to unveil the functions either by silencing or over-expressing the genes by targeted expression or random mutagenesis. Rice is the most appropriate model crop for generating a mutant resource for functional genomic studies because of the availability of high-quality genome sequence and relatively smaller genome size. Rice has syntenic relationships with members of other cereals. Hence, characterization of functionally unknown genes in rice will possibly provide key genetic insights and can lead to comparative genomics involving other cereals. The current review attempts to discuss the available gain-of-function mutagenesis techniques for functional genomics, emphasizing the contemporary approach, activation tagging and alterations to this method for the enhancement of yield and productivity of rice. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  17. The First Endogenous Herpesvirus, Identified in the Tarsier Genome, and Novel Sequences from Primate Rhadinoviruses and Lymphocryptoviruses

    PubMed Central

    Aswad, Amr; Katzourakis, Aris

    2014-01-01

    Herpesviridae is a diverse family of large and complex pathogens whose genomes are extremely difficult to sequence. This is particularly true for clinical samples, and if the virus, host, or both genomes are being sequenced for the first time. Although herpesviruses are known to occasionally integrate in host genomes, and can also be inherited in a Mendelian fashion, they are notably absent from the genomic fossil record comprised of endogenous viral elements (EVEs). Here, we combine paleovirological and metagenomic approaches to both explore the constituent viral diversity of mammalian genomes and search for endogenous herpesviruses. We describe the first endogenous herpesvirus from the genome of the Philippine tarsier, belonging to the Roseolovirus genus, and characterize its highly defective genome that is integrated and flanked by unambiguous host DNA. From a draft assembly of the aye-aye genome, we use bioinformatic tools to reveal over 100,000 bp of a novel rhadinovirus that is the first lemur gammaherpesvirus, closely related to Kaposi's sarcoma-associated virus. We also identify 58 genes of Pan paniscus lymphocryptovirus 1, the bonobo equivalent of human Epstein-Barr virus. For each of the viruses, we postulate gene function via comparative analysis to known viral relatives. Most notably, the evidence from gene content and phylogenetics suggests that the aye-aye sequences represent the most basal known rhadinovirus, and indicates that tumorigenic herpesviruses have been infecting primates since their emergence in the late Cretaceous. Overall, these data show that a genomic fossil record of herpesviruses exists despite their extremely large genomes, and expands the known diversity of Herpesviridae, which will aid the characterization of pathogenesis. Our analytical approach illustrates the benefit of intersecting evolutionary approaches with metagenomics, genetics and paleovirology. PMID:24945689

  18. Targeted sequencing for high-resolution evolutionary analyses following genome duplication in salmonid fish: Proof of concept for key components of the insulin-like growth factor axis.

    PubMed

    Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J

    2016-12-01

    High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95 Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics. We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  19. Characterization of St and Y genome in StStYY Elymus species (Triticeae: Poaceae) using Sequential FISH and GISH

    USDA-ARS's Scientific Manuscript database

    Tetraploid species possessing StY genome could be donors to hexaploid species having StYH, StYP, or StYW genome constitution in the genus Elymus, and a few of StY species have been intensely studied for inferring the origin of the Y genome. In this study, genome characterization of St and Y genome w...

  20. Effective normalization for copy number variation detection from whole genome sequencing.

    PubMed

    Janevski, Angel; Varadan, Vinay; Kamalakaran, Sitharthan; Banerjee, Nilanjana; Dimitrova, Nevenka

    2012-01-01

    Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of parameters that are configurable to genome data being analyzed. These algorithms allow for normalization to account for individual and population-specific effects on individual genome CNV estimates but the impact of these changes on the estimated CNVs is not well characterized. We evaluate in detail the effect of normalization methodologies in two CNV algorithms FREEC and CNV-seq using whole genome sequencing data from 8 individuals spanning four populations. We apply FREEC and CNV-seq to a sequencing data set consisting of 8 genomes. We use multiple configurations corresponding to different read-count normalization methodologies in FREEC, and statistically characterize the concordance of the CNV calls between FREEC configurations and the analogous output from CNV-seq. The normalization methodologies evaluated in FREEC are: GC content, mappability and control genome. We further stratify the concordance analysis within genic, non-genic, and a collection of validated variant regions. The GC content normalization methodology generates the highest number of altered copy number regions. Both mappability and control genome normalization reduce the total number and length of copy number regions. Mappability normalization yields Jaccard indices in the 0.07 - 0.3 range, whereas using a control genome normalization yields Jaccard index values around 0.4 with normalization based on GC content. The most critical impact of using mappability as a normalization factor is substantial reduction of deletion CNV calls. 
The output of another method based on control genome normalization, CNV-seq, resulted in comparable CNV call profiles, and substantial agreement in variable gene and CNV region calls. Choice of read-count normalization methodology has a substantial effect on CNV calls and the use of genomic mappability or an appropriately chosen control genome can optimize the output of CNV analysis.
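
    The Jaccard index used above to quantify concordance can be illustrated with a minimal sketch. The abstract does not state whether the index was computed over base pairs or over called regions; the base-pair version below is an assumption, and the function names and example intervals are hypothetical. Calls are given as (chromosome, start, end) with half-open coordinates.

```python
def merge(intervals):
    """Merge overlapping or touching (start, end) intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_length(intervals):
    """Total number of bases covered by non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


def jaccard_bp(calls_a, calls_b):
    """Base-pair Jaccard index: |A intersect B| / |A union B| over two CNV call sets."""
    by_chrom_a, by_chrom_b = {}, {}
    for chrom, start, end in calls_a:
        by_chrom_a.setdefault(chrom, []).append((start, end))
    for chrom, start, end in calls_b:
        by_chrom_b.setdefault(chrom, []).append((start, end))

    intersection = union = 0
    for chrom in set(by_chrom_a) | set(by_chrom_b):
        merged_a = merge(by_chrom_a.get(chrom, []))
        merged_b = merge(by_chrom_b.get(chrom, []))
        len_a, len_b = total_length(merged_a), total_length(merged_b)
        len_union = total_length(merge(merged_a + merged_b))
        union += len_union
        # Inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|
        intersection += len_a + len_b - len_union
    return intersection / union if union else 0.0


# Hypothetical call sets from two normalization configurations.
calls_gc = [("chr1", 100, 200), ("chr1", 300, 400)]
calls_map = [("chr1", 150, 350)]
j = jaccard_bp(calls_gc, calls_map)
```

    A value near 1 means the two configurations call almost the same bases as altered; values such as the 0.07 - 0.3 range reported above indicate that most called bases are specific to one configuration.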

  1. Genomic features of bacterial adaptation to plants

    PubMed Central

    Levy, Asaf; Gonzalez, Isai Salas; Mittelviefhaus, Maximilian; Clingenpeel, Scott; Paredes, Sur Herrera; Miao, Jiamin; Wang, Kunru; Devescovi, Giulia; Stillman, Kyra; Monteiro, Freddy; Alvarez, Bryan Rangel; Lundberg, Derek S.; Lu, Tse-Yuan; Lebeis, Sarah; Jin, Zhao; McDonald, Meredith; Klein, Andrew P.; Feltcher, Meghan E.; del Rio, Tijana Glavina; Grant, Sarah R.; Doty, Sharon L.; Ley, Ruth E.; Zhao, Bingyu; Venturi, Vittorio; Pelletier, Dale A.; Vorholt, Julia A.; Tringe, Susannah G.; Woyke, Tanja; Dangl, Jeffery L.

    2017-01-01

    Plants intimately associate with diverse bacteria. Plant-associated (PA) bacteria have ostensibly evolved genes enabling adaptation to the plant environment. However, the identities of such genes are mostly unknown and their functions are poorly characterized. We sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3837 bacterial genomes to identify thousands of PA gene clusters. Genomes of PA bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant associated genomes. We experimentally validated candidates from two sets of PA genes, one involved in plant colonization, the other serving in microbe-microbe competition between PA bacteria. We also identified 64 PA protein domains that potentially mimic plant domains; some are shared with PA fungi and oomycetes. This work expands the genome-based understanding of plant-microbe interactions and provides leads for efficient and sustainable agriculture through microbiome engineering. PMID:29255260

  2. Recovering complete and draft population genomes from metagenome datasets

    DOE PAGES

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    2016-03-08

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.
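
    The idea of using covarying contig coverage across datasets can be sketched as follows. This is a toy illustration under stated assumptions, not any of the binning tools reviewed in the paper: the coverage values, contig names, the 0.95 threshold, and the function names are all hypothetical, and real binners typically combine coverage with sequence composition and use proper clustering rather than pairwise thresholds.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no covariation signal
    return cov / (sd_x * sd_y)


def covarying_pairs(coverage, threshold=0.95):
    """Return (contig_a, contig_b, r) for pairs whose coverage profiles correlate."""
    names = sorted(coverage)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(coverage[a], coverage[b])
            if r >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical mean coverage of three contigs in four metagenome samples.
coverage = {
    "contig_1": [10.0, 40.0, 5.0, 20.0],
    "contig_2": [11.0, 42.0, 6.0, 19.0],
    "contig_3": [30.0, 8.0, 25.0, 9.0],
}
pairs = covarying_pairs(coverage)
```

    Contigs whose abundance rises and falls together across samples are candidates for the same population genome, which is why adding more datasets tends to sharpen the signal.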

  3. Recovering complete and draft population genomes from metagenome datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.

  4. Campylobacter geochelonis sp. nov. isolated from the western Hermann's tortoise (Testudo hermanni hermanni).

    PubMed

    Piccirillo, Alessandra; Niero, Giulia; Calleros, Lucía; Pérez, Ruben; Naya, Hugo; Iraola, Gregorio

    2016-09-01

    During a screening study to determine the presence of species of the genus Campylobacter in reptiles, three putative strains (RC7, RC11 and RC20T) were isolated from different individuals of the western Hermann's tortoise (Testudo hermanni hermanni). Initially, these isolates were characterized as representing Campylobacter fetus subsp. fetus by multiplex PCR and partial 16S rRNA gene sequence analysis. Further whole-genome characterization revealed considerable differences compared to other Campylobacter species. A polyphasic study was then undertaken to determine the exact taxonomic position of the isolates. The three strains were characterized by conventional phenotypic tests and whole genome sequencing. We generated robust phylogenies that showed a distinct clade containing only these strains using the 16S rRNA and atpA genes and a set of 40 universal proteins. Our phylogenetic analysis demonstrates their designation as representing a novel species and this was further confirmed using whole-genome average nucleotide identity within the genus Campylobacter (~80 %). Compared to most Campylobacter species, these strains hydrolysed hippurate, and grew well at 25 °C but not at 42 °C. Phenotypic and genetic analyses demonstrate that the three Campylobacter strains isolated from the western Hermann's tortoise represent a novel species within the genus Campylobacter, for which the name Campylobacter geochelonis sp. nov. is proposed, with RC20T (=DSM 102159T=LMG 29375T) as the type strain.

  5. OGRO: The Overview of functionally characterized Genes in Rice online database.

    PubMed

    Yamamoto, Eiji; Yonemaru, Jun-Ichi; Yamamoto, Toshio; Yano, Masahiro

    2012-12-01

    The high-quality sequence information and rich bioinformatics tools available for rice have contributed to remarkable advances in functional genomics. To facilitate the application of gene function information to the study of natural variation in rice, we comprehensively searched for articles related to rice functional genomics and extracted information on functionally characterized genes. As of 31 March 2012, 702 functionally characterized genes were annotated. This number represents about 1.6% of the predicted loci in the Rice Annotation Project Database. The compiled gene information is organized to facilitate direct comparisons with quantitative trait locus (QTL) information in the Q-TARO database. Comparison of genomic locations between functionally characterized genes and the QTLs revealed that QTL clusters were often co-localized with high-density gene regions, and that the genes associated with the QTLs in these clusters were different genes, suggesting that these QTL clusters are likely to be explained by tightly linked but distinct genes. Information on the functionally characterized genes compiled during this study is now available in the Overview of Functionally Characterized Genes in Rice Online database (OGRO) on the Q-TARO website (http://qtaro.abr.affrc.go.jp/ogro). The database has two interfaces: a table containing gene information, and a genome viewer that allows users to compare the locations of QTLs and functionally characterized genes. OGRO on Q-TARO will facilitate a candidate-gene approach to identifying the genes responsible for QTLs. Because the QTL descriptions in Q-TARO contain information on agronomic traits, such comparisons will also facilitate the annotation of functionally characterized genes in terms of their effects on traits important for rice breeding. The increasing amount of information on rice gene function being generated from mutant panels and other types of studies will make the OGRO database even more valuable in the future.

  6. From genomes to metabolomes: Understanding mechanisms of symbiosis and cell-cell signaling using the archaeal system Ignicoccus-Nanoarchaeum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Podar, Mircea; Hettich, Robert; Copie, Valerie

    The main objective of this project was to use symbiotic Nanoarchaeota, a group of thermophilic Archaea that are obligate symbionts/parasites on other Archaea, to develop an integrated multi-omic approach to study inter-species interactions as well as to understand fundamental mechanisms that enable such relationships. As part of this grant we have achieved a number of important milestones on both technical and scientific levels. On the technical side, we developed immunofluorescence labeling and tracking methods to follow Nanoarchaeota in cultures and in environmental samples, we applied such methods in conjunction with flow cytometry to quantify and isolate uncultured representatives from the environment and characterized them by single cell genomics. On the proteomics side, we developed a more efficient and sensitive method to recover and semi-quantitatively measure membrane proteins, while achieving high total cellular proteome coverage (70-80% of the predicted proteome). Metabolomic analyses used complementary NMR and LC/GC mass spectrometry and led to the identification of novel lipids in these organisms as well as quantification of some of the major metabolites. Importantly, using several informatics approaches we were also able to integrate the transcriptomic, proteomic and metabolomic datasets, revealing aspects of the interspecies interaction that were not evident in the single omic analyses (manuscript in review). On the science side we determined that N. equitans and I. hospitalis are metabolically coupled and that N. equitans is strictly dependent on its host both for metabolic precursors and energetic needs. The actual mechanism by which small molecules move across the cell membrane remains unknown. The Ignicoccus host responds to the metabolic and energetic burden by upregulating key primary metabolism steps and ATP synthesis. The two species have co-evolved, an aspect that we determined by comparative genomics with other species of Ignicoccus (manuscript in preparation) and by characterizing other similar Nanoarchaeota systems. Using a single cell genomics approach we characterized the first terrestrial geothermal Nanoarchaeota system, from Yellowstone National Park. That nanoarchaeon uses a different host, a species of Sulfolobales, and comparative genomics with N. equitans-Ignicoccus allowed us to come up with an evolutionary model for the evolution of this group of organisms across marine and terrestrial ecosystems. Based on metabolic inferences we were also able to isolate in culture the first such terrestrial nanoarchaeal system, also from Yellowstone, which involves a species of Acidilobus. The novel nanoarchaeal system was characterized using proteomics and it helped us better understand the metabolic capabilities of these organisms as well as how co-evolution shapes the genomes of interacting species. It was also one of the very few cases in which prior genomic data was used to successfully design an approach to culture an organism, which remains the gold standard in microbiology research. As a better understanding of interspecies interaction requires multiple model systems, we have pursued identification and genomic characterization or isolation of additional nanoarchaeal systems from geographically and geochemically distinct environments. Two additional nanoarchaeal systems are presently being characterized from hot springs in Yellowstone and Iceland and will be the subject of future publications.

  7. Genome Sequence, Assembly and Characterization of Two Metschnikowia fructicola Strains Used as Biocontrol Agents of Postharvest Diseases

    PubMed Central

    Piombo, Edoardo; Sela, Noa; Wisniewski, Michael; Hoffmann, Maria; Gullino, Maria L.; Allard, Marc W.; Levin, Elena; Spadaro, Davide; Droby, Samir

    2018-01-01

    The yeast Metschnikowia fructicola was reported as an efficient biological control agent of postharvest diseases of fruits and vegetables, and it is the basis of the commercial formulated product “Shemer.” Several mechanisms of action by which M. fructicola inhibits postharvest pathogens were suggested including iron-binding compounds, induction of defense signaling genes, production of fungal cell wall degrading enzymes and relatively high amounts of superoxide anions. We assembled the whole genome sequence of two strains of M. fructicola using PacBio and Illumina shotgun sequencing technologies. Using PacBio, a high-quality draft genome consisting of 93 contigs, with an estimated genome size of approximately 26 Mb, was obtained. Comparative analysis of M. fructicola proteins with the other three available closely related genomes revealed a shared core of homologous proteins coded by 5,776 genes. Comparing the genomes of the two M. fructicola strains using a SNP calling approach resulted in the identification of 564,302 homologous SNPs with 2,004 predicted high impact mutations. The size of the genome is exceptionally high when compared with those of available closely related organisms, and the high rate of homology among M. fructicola genes points toward a recent whole-genome duplication event as the cause of this large genome. Based on the assembled genome, sequences were annotated with a gene description and gene ontology (GO term) and clustered in functional groups. Analysis of CAZymes family genes revealed 1,145 putative genes, and transcriptomic analysis of CAZyme expression levels in M. fructicola during its interaction with either grapefruit peel tissue or Penicillium digitatum revealed a high level of CAZyme gene expression when the yeast was placed in wounded fruit tissue. PMID:29666611

  8. Comparative Analysis of Transposable Elements Highlights Mobilome Diversity and Evolution in Vertebrates

    PubMed Central

    Chalopin, Domitille; Naville, Magali; Plard, Floriane; Galiana, Delphine; Volff, Jean-Nicolas

    2015-01-01

    Transposable elements (TEs) are major components of vertebrate genomes, with major roles in genome architecture and evolution. In order to characterize both common patterns and lineage-specific differences in TE content and TE evolution, we have compared the mobilomes of 23 vertebrate genomes, including 10 actinopterygian fish, 11 sarcopterygians, and 2 nonbony vertebrates. We found important variations in TE content (from 6% in the pufferfish tetraodon to 55% in zebrafish), with a more important relative contribution of TEs to genome size in fish than in mammals. Some TE superfamilies were found to be widespread in vertebrates, but most elements showed a more patchy distribution, indicative of multiple events of loss or gain. Interestingly, loss of major TE families was observed during the evolution of the sarcopterygian lineage, with a particularly strong reduction in TE diversity in birds and mammals. Phylogenetic trends in TE composition and activity were detected: Teleost fish genomes are dominated by DNA transposons and contain few ancient TE copies, while mammalian genomes have been predominantly shaped by nonlong terminal repeat retrotransposons, along with the persistence of older sequences. Differences were also found within lineages: The medaka fish genome underwent more recent TE amplification than the related platyfish, as observed for LINE retrotransposons in the mouse compared with the human genome. This study allows the identification of putative cases of horizontal transfer of TEs, and to tentatively infer the composition of the ancestral vertebrate mobilome. Taken together, the results obtained highlight the importance of TEs in the structure and evolution of vertebrate genomes, and demonstrate their major impact on genome diversity both between and within lineages. PMID:25577199

  9. Comparative analysis of transposable elements highlights mobilome diversity and evolution in vertebrates.

    PubMed

    Chalopin, Domitille; Naville, Magali; Plard, Floriane; Galiana, Delphine; Volff, Jean-Nicolas

    2015-01-09

    Transposable elements (TEs) are major components of vertebrate genomes, with major roles in genome architecture and evolution. In order to characterize both common patterns and lineage-specific differences in TE content and TE evolution, we have compared the mobilomes of 23 vertebrate genomes, including 10 actinopterygian fish, 11 sarcopterygians, and 2 nonbony vertebrates. We found important variations in TE content (from 6% in the pufferfish tetraodon to 55% in zebrafish), with a more important relative contribution of TEs to genome size in fish than in mammals. Some TE superfamilies were found to be widespread in vertebrates, but most elements showed a more patchy distribution, indicative of multiple events of loss or gain. Interestingly, loss of major TE families was observed during the evolution of the sarcopterygian lineage, with a particularly strong reduction in TE diversity in birds and mammals. Phylogenetic trends in TE composition and activity were detected: Teleost fish genomes are dominated by DNA transposons and contain few ancient TE copies, while mammalian genomes have been predominantly shaped by nonlong terminal repeat retrotransposons, along with the persistence of older sequences. Differences were also found within lineages: The medaka fish genome underwent more recent TE amplification than the related platyfish, as observed for LINE retrotransposons in the mouse compared with the human genome. This study allows the identification of putative cases of horizontal transfer of TEs, and to tentatively infer the composition of the ancestral vertebrate mobilome. Taken together, the results obtained highlight the importance of TEs in the structure and evolution of vertebrate genomes, and demonstrate their major impact on genome diversity both between and within lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. Complete genome characterization of a novel enterovirus type EV-B106 isolated in China, 2012.

    PubMed

    Tang, Jingjing; Tao, Zexin; Ding, Zhengrong; Zhang, Yong; Zhang, Jie; Tian, Bingjun; Zhao, Zhixian; Zhang, Lifen; Xu, Wenbo

    2014-03-03

    Human enterovirus B106 (EV-B106) is a recently identified member of enterovirus species B. In this study, we report the complete genomic characterization of an EV-B106 strain (148/YN/CHN/12) isolated from an acute flaccid paralysis patient in Yunnan Province, China. The new strain had 79.2-81.3% nucleotide and 89.1-94.8% amino acid similarity in the VP1 region with the other two EV-B106 strains from Bolivia and Pakistan. When compared with other EV serotypes, it had the highest (73.3%) VP1 nucleotide similarity with the EV-B77 prototype strain CF496-99. However, when aligned with all EV-B106 and EV-B77 sequences available from the GenBank database, two major frame shifts were observed in the VP1 coding region, which resulted in substantial (20.5%) VP1 amino acid divergence between the two serotypes. Phylogenetic analysis and similarity plot analysis revealed multiple recombination events in the genome of this strain. This is the first report of the complete genome of EV-B106.

  11. Comparative genomic analysis of novel bacteriophages infecting Vibrio parahaemolyticus isolated from western and southern coastal areas of Korea.

    PubMed

    Yu, Junhyeok; Lim, Jeong-A; Kwak, Su-Jin; Park, Jong-Hyun; Chang, Hyun-Joo

    2018-05-01

    Vibrio parahaemolyticus, a foodborne pathogen, has become resistant to antibiotics. Therefore, alternative bio-control agents such as bacteriophages are urgently needed for its control. Six novel bacteriophages specific to V. parahaemolyticus (vB_VpaP_KF1~2, vB_VpaS_KF3~6) were characterized at the molecular level in this study. Genomic similarity analysis revealed that these six bacteriophages could be divided into two groups with different genomic features, phylogenetic grouping, and morphologies. The two groups of bacteriophages had their own genes with different mechanisms for infection, assembly, and metabolism. Our results could be used as a future reference to study phage genomics or apply phages in future bio-control studies.

  12. Genomic analysis of cold-active Colwelliaphage 9A and psychrophilic phage-host interactions.

    PubMed

    Colangelo-Lillis, Jesse R; Deming, Jody W

    2013-01-01

    The 104 kb genome of cold-active bacteriophage 9A, which replicates in the marine psychrophilic gamma-proteobacterium Colwellia psychrerythraea strain 34H (between -12 and 8 °C), was sequenced and analyzed to investigate elements of molecular adaptation to low temperature and phage-host interactions in the cold. Most characterized ORFs indicated closest similarity to gamma-proteobacteria and their phages, though no single module provided definitive phylogenetic grouping. A subset of primary structural features linked to psychrophily suggested that the majority of annotated phage proteins were not psychrophilic; those that were primarily serve phage-specific functions and may also contribute to 9A's restricted temperature range for replication as compared to the host. Comparative analyses suggest ribonucleotide reductase genes were acquired laterally from the host. Neither restriction modification nor the CRISPR-Cas system appeared to be the predominant phage defense mechanism of Cp34H or other cold-adapted bacteria; we hypothesize that psychrophilic hosts rely more on the use of extracellular polymeric material to block cell surface receptors recognized by phages. The relative dearth of evidence for genome-specific defenses, genetic transfer events or auxiliary metabolic genes suggests that the 9A-Cp34H system may be less tightly coupled than are other genomically characterized marine phage-host systems, with possible implications for phage specificity under different environmental conditions.

  13. Genomics of an emerging clone of Salmonella serovar Typhimurium ST313 from Nigeria and the Democratic Republic of Congo.

    PubMed

    Leekitcharoenphon, Pimlapas; Friis, Carsten; Zankari, Ea; Svendsen, Christina Aaby; Price, Lance B; Rahmani, Maral; Herrero-Fresno, Ana; Fashae, Kayode; Vandenberg, Olivier; Aarestrup, Frank M; Hendriksen, Rene S

    2013-10-15

    Salmonella enterica serovar Typhimurium ST313 is an invasive and phylogenetically distinct lineage present in sub-Saharan Africa. We report the presence of S. Typhimurium ST313 from patients in the Democratic Republic of Congo and Nigeria. Eighteen S. Typhimurium ST313 isolates were characterized by antimicrobial susceptibility testing, pulsed-field gel electrophoresis (PFGE), and multilocus sequence typing (MLST). Additionally, six of the isolates were characterized by whole genome sequence typing (WGST). The presence of a putative virulence determinant was examined in 177 Salmonella isolates belonging to 57 different serovars. All S. Typhimurium ST313 isolates harbored the resistance genes blaTEM1b, catA1, strA/B, sul1, and dfrA1. Additionally, the aac(6')1aa gene was detected. Phylogenetic analyses revealed close genetic relationships among Congolese and Nigerian isolates from both blood and stool. Comparative genomic analyses identified a putative virulence fragment (ST313-TD) unique to S. Typhimurium ST313 and S. Dublin. We showed in a limited number of isolates that S. Typhimurium ST313 is a prevalent sequence type causing gastrointestinal diseases and septicemia in patients from Nigeria and DRC. We found three distinct phylogenetic clusters based on the origin of isolation, suggesting some spatial evolution. Comparative genomics showed an interesting putative virulence fragment (ST313-TD) unique to S. Typhimurium ST313 and invasive S. Dublin.

  14. A 1,681-locus consensus genetic map of cultivated cucumber including 67 NB-LRR resistance gene homolog and ten gene loci

    PubMed Central

    2013-01-01

    Background Cucumber is an important vegetable crop that is susceptible to many pathogens, but no disease resistance (R) genes have been cloned. The availability of whole genome sequences provides an excellent opportunity for systematic identification and characterization of the nucleotide binding and leucine-rich repeat (NB-LRR) type R gene homolog (RGH) sequences in the genome. Cucumber has a very narrow genetic base making it difficult to construct high-density genetic maps. Development of a consensus map by synthesizing information from multiple segregating populations is a method of choice to increase marker density. As such, the objectives of the present study were to identify and characterize NB-LRR type RGHs, and to develop a high-density, integrated cucumber genetic-physical map anchored with RGH loci. Results From the Gy14 draft genome, 70 NB-containing RGHs were identified and characterized. Most RGHs were in clusters with uneven distribution across seven chromosomes. In silico analysis indicated that all 70 RGHs had EST support for gene expression. Phylogenetic analysis classified 58 RGHs into two clades: CNL and TNL. Comparative analysis revealed high-degree sequence homology and synteny in chromosomal locations of these RGH members between the cucumber and melon genomes. Fifty-four molecular markers were developed to delimit 67 of the 70 RGHs, which were integrated into a genetic map through linkage analysis. A 1,681-locus cucumber consensus map including 10 gene loci and spanning 730.0 cM in seven linkage groups was developed by integrating three component maps with a bin-mapping strategy. Physically, 308 scaffolds with 193.2 Mbp total DNA sequences were anchored onto this consensus map that covered 52.6% of the 367 Mbp cucumber genome. Conclusions Cucumber contains relatively few NB-LRR RGHs that are clustered and unevenly distributed in the genome. All RGHs seem to be transcribed and share significant sequence homology and synteny with the melon genome, suggesting conservation of these RGHs in the Cucumis lineage. The 1,681-locus consensus genetic-physical map developed and the RGHs identified and characterized herein are valuable genomics resources that may have many applications such as quantitative trait loci identification, map-based gene cloning, association mapping, marker-assisted selection, as well as assembly of a more complete cucumber genome. PMID:23531125

  15. Comparative Analysis of the Peanut Witches'-Broom Phytoplasma Genome Reveals Horizontal Transfer of Potential Mobile Units and Effectors

    PubMed Central

    Lo, Wen-Sui; Lin, Chan-Pin; Kuo, Chih-Horng

    2013-01-01

    Phytoplasmas are a group of bacteria that are associated with hundreds of plant diseases. Due to their economic importance and the difficulties involved in the experimental study of these obligate pathogens, genome sequencing and comparative analysis have been utilized as powerful tools to understand phytoplasma biology. To date, four complete phytoplasma genome sequences have been published. However, these four strains represent limited phylogenetic diversity. In this study, we report the shotgun sequencing and evolutionary analysis of a peanut witches'-broom (PnWB) phytoplasma genome. The availability of this genome provides the first representative of the 16SrII group and substantially improves the taxon sampling to investigate genome evolution. The draft genome assembly contains 13 chromosomal contigs with a total size of 562,473 bp, covering ∼90% of the chromosome. Additionally, a complete plasmid sequence is included. Comparisons among the five available phytoplasma genomes reveal differences in gene content and metabolic capacity. Notably, phylogenetic inferences of the potential mobile units (PMUs) in these genomes indicate that horizontal transfer may have occurred between divergent phytoplasma lineages. Because many effectors are associated with PMUs, the horizontal transfer of these transposon-like elements can contribute to the adaptation and diversification of these pathogens. In summary, the findings from this study highlight the importance of improving taxon sampling when investigating genome evolution. Moreover, the currently available sequences are inadequate to fully characterize the pan-genome of phytoplasmas. Future genome sequencing efforts to expand phylogenetic diversity are essential in improving our understanding of phytoplasma evolution. PMID:23626855

  16. MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

    PubMed Central

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Rajakumar, Kumar

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’. ArrayOme permits recognition of discordances between physical genome and MVG sizes, thereby enabling identification of strains rich in microarray-elusive novel genes. Individual tRNAcc tools facilitate automated identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites and other integration hotspots in closely related sequenced genomes. Accessory tools facilitate design of hotspot-flanking primers for in silico and/or wet-science-based interrogation of cognate loci in unsequenced strains and analysis of islands for features suggestive of foreign origins; island-specific and genome-contextual features are tabulated and represented in schematic and graphical forms. To date we have used MobilomeFINDER to analyse several Enterobacteriaceae, Pseudomonas aeruginosa and Streptococcus suis genomes. MobilomeFINDER enables high-throughput island identification and characterization through increased exploitation of emerging sequence data and PCR-based profiling of unsequenced test strains; subsequent targeted yeast recombination-based capture permits full-length sequencing and detailed functional studies of novel genomic islands. PMID:17537813
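
    The "inferred contig" step described in the abstract above (merging adjacent genes classified as "present") can be illustrated with a minimal sketch. It assumes genes are listed in chromosomal order with start/end coordinates and a present/absent call; the data layout, function names, and toy coordinates are hypothetical and are not taken from ArrayOme itself.

```python
from typing import List, Tuple

# Hypothetical record layout: (gene name, start, end, present on the array?)
Gene = Tuple[str, int, int, bool]


def inferred_contigs(genes: List[Gene]) -> List[Tuple[int, int]]:
    """Merge runs of adjacent 'present' genes into (start, end) intervals.

    Genes are assumed to be sorted by chromosomal position.
    """
    contigs = []
    current = None  # [start, end] of the run being built, or None
    for _name, start, end, present in genes:
        if present:
            if current is None:
                current = [start, end]
            else:
                current[1] = max(current[1], end)
        elif current is not None:
            # An 'absent' gene closes the current run.
            contigs.append((current[0], current[1]))
            current = None
    if current is not None:
        contigs.append((current[0], current[1]))
    return contigs


def mvg_size(contigs: List[Tuple[int, int]]) -> int:
    """Total length of all inferred contigs (a toy 'microarray-visualized genome' size)."""
    return sum(end - start + 1 for start, end in contigs)


# Toy input, not real data.
toy_genes = [
    ("g1", 1, 100, True),
    ("g2", 150, 300, True),
    ("g3", 350, 500, False),
    ("g4", 550, 700, True),
]
toy_contigs = inferred_contigs(toy_genes)
print(toy_contigs, mvg_size(toy_contigs))
```

    Comparing such a summed size with the physical genome size is the kind of discordance check the abstract describes; how ArrayOme handles intergenic gaps or ambiguous calls is not stated there and is not modeled here.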

  17. GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies

    PubMed Central

    Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio

    2013-01-01

    We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants by up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243

  18. Plasmodium cynomolgi genome sequences provide insight into Plasmodium vivax and the monkey malaria clade.

    PubMed

    Tachibana, Shin-Ichiro; Sullivan, Steven A; Kawai, Satoru; Nakamura, Shota; Kim, Hyunjae R; Goto, Naohisa; Arisue, Nobuko; Palacpac, Nirianne M Q; Honma, Hajime; Yagi, Masanori; Tougan, Takahiro; Katakai, Yuko; Kaneko, Osamu; Mita, Toshihiro; Kita, Kiyoshi; Yasutomi, Yasuhiro; Sutton, Patrick L; Shakhbatyan, Rimma; Horii, Toshihiro; Yasunaga, Teruo; Barnwell, John W; Escalante, Ananias A; Carlton, Jane M; Tanabe, Kazuyuki

    2012-09-01

    P. cynomolgi, a malaria-causing parasite of Asian Old World monkeys, is the sister taxon of P. vivax, the most prevalent malaria-causing species in humans outside of Africa. Because P. cynomolgi shares many phenotypic, biological and genetic characteristics with P. vivax, we generated draft genome sequences for three P. cynomolgi strains and performed genomic analysis comparing them with the P. vivax genome, as well as with the genome of a third previously sequenced simian parasite, Plasmodium knowlesi. Here, we show that genomes of the monkey malaria clade can be characterized by copy-number variants (CNVs) in multigene families involved in evasion of the human immune system and invasion of host erythrocytes. We identify genome-wide SNPs, microsatellites and CNVs in the P. cynomolgi genome, providing a map of genetic variation that can be used to map parasite traits and study parasite populations. The sequencing of the P. cynomolgi genome is a critical step in developing a model system for P. vivax research and in counteracting the neglect of P. vivax.

  19. proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes.

    PubMed

    Mende, Daniel R; Letunic, Ivica; Huerta-Cepas, Jaime; Li, Simone S; Forslund, Kristoffer; Sunagawa, Shinichi; Bork, Peer

    2017-01-04

    The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. A draft genome assembly of the army worm, Spodoptera frugiperda.

    PubMed

    Kakumani, Pavan Kumar; Malhotra, Pawan; Mukherjee, Sunil K; Bhatnagar, Raj K

    2014-08-01

    Spodoptera is an agriculturally important pest insect, and studies aimed at understanding its biology have been limited by the unavailability of its genome. In the present study, the genomic DNA was sequenced and assembled into 37,243 scaffolds totaling 358 Mb, with an N50 of 53.7 kb. Based on degree of identity, we could anchor 305 Mb of the genome onto all the 28 chromosomes of Bombyx mori. Repeat elements were identified, which account for 20.28% of the total genome. Further, we predicted 11,595 genes, with an average intron length of 726 bp. The genes were annotated and domain analysis revealed that Sf genes share a significant homology and expression pattern with B. mori, despite differences in KOG gene categories and representation of certain protein families. The present study on the Sf genome would help in the characterization of cellular pathways to understand its biology and comparative evolutionary studies among lepidopteran family members to help annotate their genomes. Copyright © 2014 Elsevier Inc. All rights reserved.
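
    The N50 quoted in the abstract above is a standard contiguity statistic: the scaffold length L such that scaffolds of length L or longer contain at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the scaffold lengths are made up for illustration and are not data from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum past half the
        # assembly size defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy lengths in kb, not real data.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))
```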

  1. Construction of two novel reciprocal conplastic rat strains and characterization of cardiac mitochondria

    PubMed Central

    Kumarasamy, Sivarajan; Gopalakrishnan, Kathirvel; Abdul-Majeed, Shakila; Partow-Navid, Rod; Farms, Phyllis

    2013-01-01

    Because of the lack of appropriate animal models, the potentially causal contributions of inherited mitochondrial genomic factors to complex traits are less well studied compared with inherited nuclear genomic factors. We previously detected variations between the mitochondrial DNA (mtDNA) of the Dahl salt-sensitive (S) rat and the spontaneously hypertensive rat (SHR). Specifically, multiple variations were detected in mitochondrial genes coding for subunits of proteins essential for electron transport, in mitochondrial reactive oxygen species production, and within the D-loop region. To evaluate the effects of these mtDNA variations in the absence of the corresponding nuclear genomic factors as confounding variables, novel reciprocal strains of S and SHR were constructed and characterized. When compared with that of the S rat, the heart tissue from the S.SHRmt conplastic strain wherein the mtDNA of the S rat was substituted with that of the SHR had a significant increase in mtDNA copy number and decrease in mitochondrial reactive oxygen species production. A corresponding increase in aerobic treadmill running capacity and a significant increase in survival that was not related to changes in blood pressure were observed in the S.SHRmt rats compared with the S rat. The reciprocal SHR.Smt rats did not differ from the SHR in any phenotype tested, suggesting lower penetrance of the S mtDNA on the nuclear genomic background of the SHR. These novel conplastic strains serve as invaluable tools to further dissect the relationship between heart function, aerobic fitness, cardiovascular disease progression, and mortality. PMID:23125210

  2. Characterization of Vibrio parahaemolyticus clinical strains from Maryland (2012-2013) and comparisons to a locally and globally diverse V. parahaemolyticus strains by whole-genome sequence analysis.

    PubMed

    Haendiges, Julie; Timme, Ruth; Allard, Marc W; Myers, Robert A; Brown, Eric W; Gonzalez-Escalona, Narjol

    2015-01-01

    Vibrio parahaemolyticus is the leading cause of foodborne illnesses in the US associated with the consumption of raw shellfish. Previous population studies of V. parahaemolyticus have used Multi-Locus Sequence Typing (MLST) or Pulsed Field Gel Electrophoresis (PFGE). Whole genome sequencing (WGS) provides a much higher level of resolution, but has been used to characterize only a few United States (US) clinical isolates. Here we report the WGS characterization of 34 genomes of V. parahaemolyticus strains that were isolated from clinical cases in the state of Maryland (MD) during 2 years (2012-2013). These 2 years saw an increase of V. parahaemolyticus cases compared to previous years. Among these MD isolates, 28% were negative for tdh and trh, 8% were tdh positive only, 11% were trh positive only, and 53% contained both genes. We compared this set of V. parahaemolyticus genomes to those of a collection of 17 archival strains from the US (10 previously sequenced strains and 7 from NCBI, collected between 1988 and 2004) and 15 international strains, isolated from geographically-diverse environmental and clinical sources (collected between 1980 and 2010). A WGS phylogenetic analysis of these strains revealed the regional outbreak strains from MD are highly diverse and yet genetically distinct from the international strains. Some MD strains caused outbreaks 2 years in a row, indicating a local source of contamination (e.g., ST631). Advances in WGS will enable this type of analysis to become routine, providing an excellent tool for improved surveillance. Databases built with phylogenetic data will help pinpoint sources of contamination in future outbreaks and contribute to faster outbreak control.

  3. Characterization of Vibrio parahaemolyticus clinical strains from Maryland (2012–2013) and comparisons to a locally and globally diverse V. parahaemolyticus strains by whole-genome sequence analysis

    PubMed Central

    Haendiges, Julie; Timme, Ruth; Allard, Marc W.; Myers, Robert A.; Brown, Eric W.; Gonzalez-Escalona, Narjol

    2015-01-01

    Vibrio parahaemolyticus is the leading cause of foodborne illnesses in the US associated with the consumption of raw shellfish. Previous population studies of V. parahaemolyticus have used Multi-Locus Sequence Typing (MLST) or Pulsed Field Gel Electrophoresis (PFGE). Whole genome sequencing (WGS) provides a much higher level of resolution, but has been used to characterize only a few United States (US) clinical isolates. Here we report the WGS characterization of 34 genomes of V. parahaemolyticus strains that were isolated from clinical cases in the state of Maryland (MD) during 2 years (2012–2013). These 2 years saw an increase of V. parahaemolyticus cases compared to previous years. Among these MD isolates, 28% were negative for tdh and trh, 8% were tdh positive only, 11% were trh positive only, and 53% contained both genes. We compared this set of V. parahaemolyticus genomes to those of a collection of 17 archival strains from the US (10 previously sequenced strains and 7 from NCBI, collected between 1988 and 2004) and 15 international strains, isolated from geographically-diverse environmental and clinical sources (collected between 1980 and 2010). A WGS phylogenetic analysis of these strains revealed the regional outbreak strains from MD are highly diverse and yet genetically distinct from the international strains. Some MD strains caused outbreaks 2 years in a row, indicating a local source of contamination (e.g., ST631). Advances in WGS will enable this type of analysis to become routine, providing an excellent tool for improved surveillance. Databases built with phylogenetic data will help pinpoint sources of contamination in future outbreaks and contribute to faster outbreak control. PMID:25745421

  4. Genomics meets applied ecology: Characterizing habitat quality for sloths in a tropical agroecosystem.

    PubMed

    Fountain, Emily D; Kang, Jung Koo; Tempel, Douglas J; Palsbøll, Per J; Pauli, Jonathan N; Zachariah Peery, M

    2018-01-01

    Understanding how habitat quality in heterogeneous landscapes governs the distribution and fitness of individuals is a fundamental aspect of ecology. While mean individual fitness is generally considered a key to assessing habitat quality, a comprehensive understanding of habitat quality in heterogeneous landscapes requires estimates of dispersal rates among habitat types. The increasing accessibility of genomic approaches, combined with field-based demographic methods, provides novel opportunities for incorporating dispersal estimation into assessments of habitat quality. In this study, we integrated genomic kinship approaches with field-based estimates of fitness components and approximate Bayesian computation (ABC) procedures to estimate habitat-specific dispersal rates and characterize habitat quality in two-toed sloths (Choloepus hoffmanni) occurring in a Costa Rican agricultural ecosystem. Field-based observations indicated that birth and survival rates were similar in a sparsely shaded cacao farm and adjacent cattle pasture-forest mosaic. Sloth density was threefold higher in pasture compared with cacao, whereas home range size and overlap were greater in cacao compared with pasture. Dispersal rates were similar between the two habitats, as estimated using ABC procedures applied to the spatial distribution of pairs of related individuals identified using 3,431 single nucleotide polymorphism and 11 microsatellite locus genotypes. Our results indicate that crops produced under a sparse overstorey can, in some cases, constitute lower-quality habitat than pasture-forest mosaics for sloths, perhaps because of differences in food resources or predator communities. Finally, our study demonstrates that integrating field-based demographic approaches with genomic methods can provide a powerful means for characterizing habitat quality for animal populations occurring in heterogeneous landscapes. © 2017 John Wiley & Sons Ltd.

  5. Analysis of BAC end sequences in oak, a keystone forest tree species, providing insight into the composition of its genome

    PubMed Central

    2011-01-01

    Background One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences. Results The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents, and the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera. Conclusions This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. These sequences will be used in the assembly of a future genome sequence for oak. PMID:21645357
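
    The ">99%" likelihood in the abstract above is consistent with the usual library-coverage approximation, in which the probability of recovering a given sequence from a library with c-fold coverage is roughly 1 - exp(-c). The abstract does not say which formula the authors used, so the sketch below is an assumption; only the clone count, empty-clone fraction, and 12-fold coverage come from the record, and the function name is hypothetical.

```python
import math


def probability_of_recovery(coverage: float) -> float:
    """Approximate chance that a given locus is in a library of this fold-coverage.

    Uses P = 1 - exp(-c), the large-library limit of 1 - (1 - f)**N.
    """
    return 1.0 - math.exp(-coverage)


total_clones = 92_160        # from the abstract
empty_fraction = 0.07        # from the abstract
coverage = 12                # haploid genome equivalents, from the abstract

clones_with_insert = round(total_clones * (1 - empty_fraction))
p = probability_of_recovery(coverage)

print(f"clones with insert: {clones_with_insert}")
print(f"P(locus present) = {p:.6f}")
```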

  6. The genome of the vervet (Chlorocebus aethiops sabaeus)

    PubMed Central

    Warren, Wesley C.; Jasinska, Anna J.; García-Pérez, Raquel; Svardal, Hannes; Tomlinson, Chad; Rocchi, Mariano; Archidiacono, Nicoletta; Capozzi, Oronzo; Minx, Patrick; Montague, Michael J.; Kyung, Kim; Hillier, LaDeana W.; Kremitzki, Milinn; Graves, Tina; Chiang, Colby; Hughes, Jennifer; Tran, Nam; Huang, Yu; Ramensky, Vasily; Choi, Oi-wa; Jung, Yoon J.; Schmitt, Christopher A.; Juretic, Nikoleta; Wasserscheid, Jessica; Turner, Trudy R.; Wiseman, Roger W.; Tuscher, Jennifer J.; Karl, Julie A.; Schmitz, Jörn E.; Zahn, Roland; O'Connor, David H.; Redmond, Eugene; Nisbett, Alex; Jacquelin, Béatrice; Müller-Trutwin, Michaela C.; Brenchley, Jason M.; Dione, Michel; Antonio, Martin; Schroth, Gary P.; Kaplan, Jay R.; Jorgensen, Matthew J.; Thomas, Gregg W.C.; Hahn, Matthew W.; Raney, Brian J.; Aken, Bronwen; Nag, Rishi; Schmitz, Juergen; Churakov, Gennady; Noll, Angela; Stanyon, Roscoe; Webb, David; Thibaud-Nissen, Francoise; Nordborg, Magnus; Marques-Bonet, Tomas; Dewar, Ken; Weinstock, George M.; Wilson, Richard K.; Freimer, Nelson B.

    2015-01-01

    We describe a genome reference of the African green monkey or vervet (Chlorocebus aethiops). This member of the Old World monkey (OWM) superfamily is uniquely valuable for genetic investigations of simian immunodeficiency virus (SIV), for which it is the most abundant natural host species, and of a wide range of health-related phenotypes assessed in Caribbean vervets (C. a. sabaeus), whose numbers have expanded dramatically since Europeans introduced small numbers of their ancestors from West Africa during the colonial era. We use the reference to characterize the genomic relationship between vervets and other primates, the intra-generic phylogeny of vervet subspecies, and genome-wide structural variations of a pedigreed C. a. sabaeus population. Through comparative analyses with human and rhesus macaque, we characterize at high resolution the unique chromosomal fission events that differentiate the vervets and their close relatives from most other catarrhine primates, in whom karyotype is highly conserved. We also provide a summary of transposable elements and contrast these with the rhesus macaque and human. Analysis of sequenced genomes representing each of the main vervet subspecies supports previously hypothesized relationships between these populations, which range across most of sub-Saharan Africa, while uncovering high levels of genetic diversity within each. Sequence-based analyses of major histocompatibility complex (MHC) polymorphisms reveal extremely low diversity in Caribbean C. a. sabaeus vervets, compared to vervets from putatively ancestral West African regions. In the C. a. sabaeus research population, we discover the first structural variations that are, in some cases, predicted to have a deleterious effect; future studies will determine the phenotypic impact of these variations. PMID:26377836

  7. Scanning the human genome at kilobase resolution.

    PubMed

    Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming

    2008-05-01

    Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of frequent-cutting restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from two ends of a given DNA fragment to form a ditag to represent the fragment; (3) application of the 454 sequencing system to reach a comprehensive ditag sequence collection; (4) determination of the genome origin of ditags by mapping to reference ditags from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study by using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides a kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assembly, to compare genome similarity and variation in normal populations, and to identify genomic abnormality including insertion, inversion, deletion, translocation, and amplification in pathological genomes such as cancer genomes.
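
    Steps (1) and (2) of the DGS description above, and the reference ditags mentioned in step (4), can be mimicked in silico with a short sketch. The recognition site, tag length, cut position, and function names below are illustrative assumptions; the abstract does not state which enzymes or tag lengths were used.

```python
from typing import List, Optional


def digest(sequence: str, site: str) -> List[str]:
    """Split a sequence at each occurrence of a recognition site.

    Simplification: the cut is placed at the start of the site, so every
    fragment after the first begins with the site itself.
    """
    fragments = []
    start = 0
    pos = sequence.find(site, 1)  # start at 1 to avoid an empty first fragment
    while pos != -1:
        fragments.append(sequence[start:pos])
        start = pos
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


def ditag(fragment: str, tag_len: int) -> Optional[str]:
    """Join the first and last tag_len bases of a fragment into one ditag.

    Returns None when the fragment is too short to yield two separate tags.
    """
    if len(fragment) < 2 * tag_len:
        return None
    return fragment[:tag_len] + fragment[-tag_len:]


# Toy sequence and hypothetical parameters, not real data.
toy_sequence = "AAAA" + "GATC" + "CCCCCCCC" + "GATC" + "TTTT"
toy_fragments = digest(toy_sequence, "GATC")
toy_ditags = [ditag(f, 4) for f in toy_fragments]
print(toy_fragments)
print(toy_ditags)
```

    Building such a table from a reference genome gives the "reference ditags" against which experimentally sequenced ditags could be matched; the matching itself is not shown.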

  8. Identification, characterization, and comparative genomic distribution of the HERV-K (HML-2) group of human endogenous retroviruses

    PubMed Central

    2011-01-01

    Background: Integration of retroviral DNA into a germ cell may lead to a provirus that is transmitted vertically to that host's offspring as an endogenous retrovirus (ERV). In humans, ERVs (HERVs) comprise about 8% of the genome, the vast majority of which are truncated and/or highly mutated and no longer encode functional genes. The most recently active retroviruses that integrated into the human germ line are members of the Betaretrovirus-like HERV-K (HML-2) group, many of which contain intact open reading frames (ORFs) in some or all genes, sometimes encoding functional proteins that are expressed in various tissues. Interestingly, this expression is upregulated in many tumors ranging from breast and ovarian tissues to lymphomas and melanomas, as well as in schizophrenia, rheumatoid arthritis, and other disorders. Results: No study to date has characterized all HML-2 elements in the genome, an essential step towards determining a possible functional role of HML-2 expression in disease. We present here the most comprehensive and accurate catalog of all full-length and partial HML-2 proviruses, as well as solo LTR elements, within the published human genome to date. Furthermore, we provide evidence for preferential maintenance of proviruses and solo LTR elements on gene-rich chromosomes of the human genome and in proximity to gene regions. Conclusions: Our analysis has found and corrected several errors in the annotation of HML-2 elements in the human genome, including mislabeling of a newly identified group called HML-11. HML-2 elements have been implicated in a wide array of diseases, and characterization of these elements will play a fundamental role in understanding the relationship between endogenous retrovirus expression and disease. PMID:22067224

  9. Characterization of novel RS1 exonic deletions in juvenile X-linked retinoschisis

    PubMed Central

    D’Souza, Leera; Cukras, Catherine; Antolik, Christian; Craig, Candice; He, Hong; Li, Shibo; Hejtmancik, James F.; Sieving, Paul A.; Wang, Xinjing

    2013-01-01

    Purpose: X-linked juvenile retinoschisis (XLRS) is a vitreoretinal dystrophy characterized by schisis (splitting) of the inner layers of the neuroretina. Mutations within the retinoschisis (RS1) gene are responsible for this disease. The mutation spectrum consists of amino acid substitutions, splice site variations, small indels, and larger genomic deletions. Clinically, genomic deletions are rarely reported. Here, we characterize two novel full exonic deletions: one encompassing exon 1 and the other spanning exons 4–5 of the RS1 gene. We also report the clinical findings in these patients with XLRS with two different exonic deletions. Methods: Unrelated XLRS men and boys and their mothers (if available) were enrolled for molecular genetics evaluation. The patients also underwent ophthalmologic examination and in some cases electroretinogram (ERG) recording. All the exons and the flanking intronic regions of the RS1 gene were analyzed with direct sequencing. Two patients with exonic deletions were further evaluated with array comparative genomic hybridization to define the scope of the genomic aberrations. After the deleted genomic region was identified, primer walking followed by direct sequencing was used to determine the exact breakpoints. Results: Two novel exonic deletions of the RS1 gene were identified: one including exon 1 and the other spanning exons 4 and 5. The exon 1 deletion extends from the 5′ region of the RS1 gene (including the promoter) through intron 1 (c.(−35)-1723_c.51+2664del4472). The exon 4–5 deletion spans intron 3 to intron 5 (c.185–1020_c.522+1844del5764). Conclusions: Here we report two novel exonic deletions within the RS1 gene locus. We have also described the clinical presentations and hypothesized the genomic mechanisms underlying these schisis phenotypes. PMID:24227916

  10. Characterization of novel RS1 exonic deletions in juvenile X-linked retinoschisis.

    PubMed

    D'Souza, Leera; Cukras, Catherine; Antolik, Christian; Craig, Candice; Lee, Ji-Yun; He, Hong; Li, Shibo; Smaoui, Nizar; Hejtmancik, James F; Sieving, Paul A; Wang, Xinjing

    2013-01-01

    X-linked juvenile retinoschisis (XLRS) is a vitreoretinal dystrophy characterized by schisis (splitting) of the inner layers of the neuroretina. Mutations within the retinoschisis (RS1) gene are responsible for this disease. The mutation spectrum consists of amino acid substitutions, splice site variations, small indels, and larger genomic deletions. Clinically, genomic deletions are rarely reported. Here, we characterize two novel full exonic deletions: one encompassing exon 1 and the other spanning exons 4-5 of the RS1 gene. We also report the clinical findings in these patients with XLRS with two different exonic deletions. Unrelated XLRS men and boys and their mothers (if available) were enrolled for molecular genetics evaluation. The patients also underwent ophthalmologic examination and in some cases electroretinogram (ERG) recording. All the exons and the flanking intronic regions of the RS1 gene were analyzed with direct sequencing. Two patients with exonic deletions were further evaluated with array comparative genomic hybridization to define the scope of the genomic aberrations. After the deleted genomic region was identified, primer walking followed by direct sequencing was used to determine the exact breakpoints. Two novel exonic deletions of the RS1 gene were identified: one including exon 1 and the other spanning exons 4 and 5. The exon 1 deletion extends from the 5' region of the RS1 gene (including the promoter) through intron 1 (c.(-35)-1723_c.51+2664del4472). The exon 4-5 deletion spans intron 3 to intron 5 (c.185-1020_c.522+1844del5764). Here we report two novel exonic deletions within the RS1 gene locus. We have also described the clinical presentations and hypothesized the genomic mechanisms underlying these schisis phenotypes.

  11. Whole-genome sequencing of the efficient industrial fuel-ethanol fermentative Saccharomyces cerevisiae strain CAT-1.

    PubMed

    Babrzadeh, Farbod; Jalili, Roxana; Wang, Chunlin; Shokralla, Shadi; Pierce, Sarah; Robinson-Mosher, Avi; Nyren, Pål; Shafer, Robert W; Basso, Luiz C; de Amorim, Henrique V; de Oliveira, Antonio J; Davis, Ronald W; Ronaghi, Mostafa; Gharizadeh, Baback; Stambuk, Boris U

    2012-06-01

    The Saccharomyces cerevisiae strains widely used for industrial fuel-ethanol production have been developed by selection, but their underlying beneficial genetic polymorphisms remain unknown. Here, we report the draft whole-genome sequence of the S. cerevisiae strain CAT-1, which is a dominant fuel-ethanol fermentative strain from the sugarcane industry in Brazil. Our results indicate that strain CAT-1 is a highly heterozygous diploid yeast strain, and the ~12-Mb genome of CAT-1, when compared with the reference S288c genome, contains ~36,000 homozygous and ~30,000 heterozygous single nucleotide polymorphisms, exhibiting an uneven distribution among chromosomes due to large genomic regions of loss of heterozygosity (LOH). In total, 58% of the 6,652 predicted protein-coding genes of the CAT-1 genome constitute different alleles when compared with the genes present in the reference S288c genome. The CAT-1 genome contains a reduced number of transposable elements, as well as several gene deletions and duplications, especially at telomeric regions, some correlated with several of the physiological characteristics of this industrial fuel-ethanol strain. Phylogenetic analyses revealed that some genes were likely associated with traits important for bioethanol production. Identifying and characterizing the allelic variations controlling traits relevant to industrial fermentation should provide the basis for a forward genetics approach for developing better fermenting yeast strains.

  12. Genome comparison of two Magnaporthe oryzae field isolates reveals genome variations and potential virulence effectors

    PubMed Central

    2013-01-01

    Background: Rice blast, caused by the fungus Magnaporthe oryzae, is an important disease in virtually every rice-growing region of the world and leads to significant annual decreases in grain quality and yield. To prevent disease, resistance genes in rice have been cloned and introduced into susceptible cultivars. However, introduced resistance can often be broken within a few years of release, often due to mutation of cognate avirulence genes in fungal field populations. Results: To better understand the pattern of mutation of M. oryzae field isolates under natural selection forces, we used a next generation sequencing approach to analyze the genomes of two field isolates, FJ81278 and HN19311, as well as the transcriptome of FJ81278. By comparing the de novo genome assemblies of the two isolates against the finished reference strain 70–15, we identified extensive polymorphisms including unique genes, SNPs (single nucleotide polymorphisms) and indels, structural variations, copy number variations, and loci under strong positive selection. We also identified 1.75 MB of isolate-specific genome content carrying 118 novel genes in FJ81278, and 0.83 MB in HN19311. By analyzing secreted proteins carrying polymorphisms, a total of 256 candidate virulence effectors were found, of which 6 were chosen for functional characterization. Conclusions: We provide results from genome comparison analysis showing extensive genome variation, and generated a list of M. oryzae candidate virulence effectors for functional characterization. PMID:24341723

  13. Exception to the Rule: Genomic Characterization of Naturally Occurring Unusual Vibrio cholerae Strains with a Single Chromosome

    DOE PAGES

    Xie, Gary; Johnson, Shannon Lyn; Davenport, Karen Walston; ...

    2017-08-29

    The genetic make-up of most bacteria is encoded in a single chromosome, while about 10% have more than one chromosome. Among these, Vibrio cholerae, with two chromosomes, has served as a model system to study various aspects of chromosome maintenance, mainly replication, and faithful partitioning of multipartite genomes. Here, we describe the genomic characterization of strains that are an exception to the two-chromosome rule: naturally occurring single-chromosome V. cholerae. Whole genome sequence analyses of NSCV1 and NSCV2 (natural single-chromosome vibrio) revealed that the Chr1 and Chr2 fusion junctions contain prophages, IS elements, and direct repeats, in addition to large-scale chromosomal rearrangements such as inversions, insertions, and long tandem repeats elsewhere in the chromosome compared to prototypical two-chromosome V. cholerae genomes. Many of the known cholera virulence factors are absent. The two origins of replication and associated genes are generally intact with synonymous mutations in some genes, as are recA and mismatch repair (MMR) genes dam, mutH, and mutL; MutS function is probably impaired in NSCV2. These strains are ideal tools for studying mechanistic aspects of maintenance of chromosomes with multiple origins and other rearrangements and the biological, functional, and evolutionary significance of multipartite genome architecture in general.

  14. Exception to the Rule: Genomic Characterization of Naturally Occurring Unusual Vibrio cholerae Strains with a Single Chromosome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xie, Gary; Johnson, Shannon Lyn; Davenport, Karen Walston

    The genetic make-up of most bacteria is encoded in a single chromosome, while about 10% have more than one chromosome. Among these, Vibrio cholerae, with two chromosomes, has served as a model system to study various aspects of chromosome maintenance, mainly replication, and faithful partitioning of multipartite genomes. Here, we describe the genomic characterization of strains that are an exception to the two-chromosome rule: naturally occurring single-chromosome V. cholerae. Whole genome sequence analyses of NSCV1 and NSCV2 (natural single-chromosome vibrio) revealed that the Chr1 and Chr2 fusion junctions contain prophages, IS elements, and direct repeats, in addition to large-scale chromosomal rearrangements such as inversions, insertions, and long tandem repeats elsewhere in the chromosome compared to prototypical two-chromosome V. cholerae genomes. Many of the known cholera virulence factors are absent. The two origins of replication and associated genes are generally intact with synonymous mutations in some genes, as are recA and mismatch repair (MMR) genes dam, mutH, and mutL; MutS function is probably impaired in NSCV2. These strains are ideal tools for studying mechanistic aspects of maintenance of chromosomes with multiple origins and other rearrangements and the biological, functional, and evolutionary significance of multipartite genome architecture in general.

  15. Genome expansion via lineage splitting and genome reduction in the cicada endosymbiont Hodgkinia.

    PubMed

    Campbell, Matthew A; Van Leuven, James T; Meister, Russell C; Carey, Kaitlin M; Simon, Chris; McCutcheon, John P

    2015-08-18

    Comparative genomics from mitochondria, plastids, and mutualistic endosymbiotic bacteria has shown that the stable establishment of a bacterium in a host cell results in genome reduction. Although many highly reduced genomes from endosymbiotic bacteria are stable in gene content and genome structure, organelle genomes are sometimes characterized by dramatic structural diversity. Previous results from Candidatus Hodgkinia cicadicola, an endosymbiont of cicadas, revealed that some lineages of this bacterium had split into two new cytologically distinct yet genetically interdependent species. It was hypothesized that the long life cycle of cicadas in part enabled this unusual lineage-splitting event. Here we test this hypothesis by investigating the structure of the Ca. Hodgkinia genome in one of the longest-lived cicadas, Magicicada tredecim. We show that the Ca. Hodgkinia genome from M. tredecim has fragmented into multiple new chromosomes or genomes, with at least some remaining partitioned into discrete cells. We also show that this lineage-splitting process has resulted in a complex of Ca. Hodgkinia genomes that is 1.1 Mb in length when considered together, an almost 10-fold increase in size from the hypothetical single-genome ancestor. These results parallel some examples of genome fragmentation and expansion in organelles, although the mechanisms that give rise to these extreme genome instabilities are likely different.

  16. Comparative Analysis of the Full Genome of Helicobacter pylori Isolate Sahul64 Identifies Genes of High Divergence

    PubMed Central

    Lu, Wei; Wise, Michael J.; Tay, Chin Yen; Windsor, Helen M.; Marshall, Barry J.; Peacock, Christopher

    2014-01-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publicly available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains. PMID:24375107
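
    The core-genome and pan-genome figures in this record follow from simple set logic: the core genome is the set of genes shared by every strain, and the pan-genome is the set of genes found in at least one strain, so adding a divergent genome can only shrink the core and grow the pan-genome. From the numbers given, the pan-genome grew by 2,238 − 2,070 = 168 genes. The sketch below shows the logic on toy data; the strain names and gene identifiers are hypothetical and are not taken from the study.

```python
def core_and_pan(genomes):
    """Return (core, pan) gene sets for a mapping of strain name -> set of gene IDs.

    core: genes present in every strain (set intersection)
    pan:  genes present in at least one strain (set union)
    """
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


# Hypothetical toy data: two existing strains plus one newly sequenced isolate.
existing = {
    "strain_a": {"g1", "g2", "g3", "g4"},
    "strain_b": {"g1", "g2", "g3", "g5"},
}
core_before, pan_before = core_and_pan(existing)

with_new = dict(existing, new_isolate={"g1", "g2", "g6", "g7"})
core_after, pan_after = core_and_pan(with_new)

# Genes lost from the core and genes added to the pan-genome by the new isolate.
lost_from_core = core_before - core_after
added_to_pan = pan_after - pan_before
```

    In a real analysis the gene sets would come from ortholog clustering across annotated genomes, and the clustering thresholds strongly affect both counts.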

  17. Comparative analysis of the full genome of Helicobacter pylori isolate Sahul64 identifies genes of high divergence.

    PubMed

    Lu, Wei; Wise, Michael J; Tay, Chin Yen; Windsor, Helen M; Marshall, Barry J; Peacock, Christopher; Perkins, Tim

    2014-03-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publicly available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains.

  18. Identification of ecotype-specific marker genes for categorization of beer-spoiling Lactobacillus brevis.

    PubMed

    Behr, Jürgen; Geissler, Andreas J; Preissler, Patrick; Ehrenreich, Armin; Angelov, Angel; Vogel, Rudi F

    2015-10-01

    The tolerance to hop compounds, which is mainly associated with inhibition of bacterial growth in beer, is a multi-factorial trait. Any approaches to predict the physiological differences between beer-spoiling and non-spoiling strains on the basis of a single marker gene are limited. We identified ecotype-specific genes related to the ability to grow in Pilsner beer via comparative genome sequencing. The genome sequences of four different strains of Lactobacillus brevis were compared, including newly established genomes of two highly hop tolerant beer isolates, one strain isolated from faeces and one published genome of a silage isolate. Gene fragments exclusively occurring in beer-spoiling strains as well as sequences only occurring in non-spoiling strains were identified. Comparative genomic arrays were established and hybridized with a set of L. brevis strains, which are characterized by their ability to spoil beer. As a result, a set of 33 and 4 oligonucleotide probes could be established specifically detecting beer-spoilers and non-spoilers, respectively. The detection of more than one of these marker sequences according to a genetic barcode enables scoring of L. brevis for their beer-spoiling potential and can thus assist in risk evaluation in the brewing industry.

  19. The Complete Genome Sequence of Hyperthermophile Dictyoglomus turgidum DSM 6724™ Reveals a Specialized Carbohydrate Fermentor

    DOE PAGES

    Brumm, Phillip J.; Gowda, Krishne; Robb, Frank T.; ...

    2016-12-20

    In this study we report the complete genome sequence of the chemoorganotrophic, extremely thermophilic bacterium, Dictyoglomus turgidum, which is a Gram-negative, strictly anaerobic bacterium. D. turgidum and D. thermophilum together form the Dictyoglomi phylum. The two Dictyoglomus genomes are highly syntenic, and both are distantly related to Caldicellulosiruptor spp. D. turgidum is able to grow on a wide variety of polysaccharide substrates due to significant genomic commitment to glycosyl hydrolases, 16 of which were cloned and expressed in our study. The GH5, GH10, and GH42 enzymes characterized in this study suggest that D. turgidum can utilize most plant-based polysaccharides except crystalline cellulose. The DNA polymerase I enzyme was also expressed and characterized. The pure enzyme showed improved amplification of long PCR targets compared to Taq polymerase. The genome contains a full complement of DNA modifying enzymes, and an unusually high copy number (4) of a new, ancestral family of polB type nucleotidyltransferases designated as MNT (minimal nucleotidyltransferases). Considering its optimal growth at 72°C, D. turgidum has an anomalously low G+C content of 39.9% that may account for the presence of reverse gyrase, usually associated with hyperthermophiles.

  20. Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio).

    PubMed

    Liu, Xiang; Li, Shangqi; Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A; Xu, Peng

    2016-01-01

    The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species, including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp.

  1. The Complete Genome Sequence of Hyperthermophile Dictyoglomus turgidum DSM 6724™ Reveals a Specialized Carbohydrate Fermentor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brumm, Phillip J.; Gowda, Krishne; Robb, Frank T.

    In this study we report the complete genome sequence of the chemoorganotrophic, extremely thermophilic bacterium, Dictyoglomus turgidum, which is a Gram-negative, strictly anaerobic bacterium. D. turgidum and D. thermophilum together form the Dictyoglomi phylum. The two Dictyoglomus genomes are highly syntenic, and both are distantly related to Caldicellulosiruptor spp. D. turgidum is able to grow on a wide variety of polysaccharide substrates due to significant genomic commitment to glycosyl hydrolases, 16 of which were cloned and expressed in our study. The GH5, GH10, and GH42 enzymes characterized in this study suggest that D. turgidum can utilize most plant-based polysaccharides except crystalline cellulose. The DNA polymerase I enzyme was also expressed and characterized. The pure enzyme showed improved amplification of long PCR targets compared to Taq polymerase. The genome contains a full complement of DNA modifying enzymes, and an unusually high copy number (4) of a new, ancestral family of polB type nucleotidyltransferases designated as MNT (minimal nucleotidyltransferases). Considering its optimal growth at 72°C, D. turgidum has an anomalously low G+C content of 39.9% that may account for the presence of reverse gyrase, usually associated with hyperthermophiles.

  2. Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio)

    PubMed Central

    Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A.

    2016-01-01

    The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species, including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp. PMID:27058731

  3. Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

    PubMed

    Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

    2016-06-01

    In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.

  4. Contrasting patterns of evolutionary constraint and novelty revealed by comparative sperm proteomic analysis in Lepidoptera.

    PubMed

    Whittington, Emma; Forsythe, Desiree; Borziak, Kirill; Karr, Timothy L; Walters, James R; Dorus, Steve

    2017-12-02

    Rapid evolution is a hallmark of reproductive genetic systems and arises through the combined processes of sequence divergence, gene gain and loss, and changes in gene and protein expression. While studies aiming to disentangle the molecular ramifications of these processes are progressing, we still know little about the genetic basis of evolutionary transitions in reproductive systems. Here we conduct the first comparative analysis of sperm proteomes in Lepidoptera, a group that exhibits dichotomous spermatogenesis, in which males produce a functional fertilization-competent sperm (eupyrene) and an incompetent sperm morph lacking nuclear DNA (apyrene). Through the integrated application of evolutionary proteomics and genomics, we characterize the genomic patterns potentially associated with the origination and evolution of this unique spermatogenic process and assess the importance of genetic novelty in Lepidopteran sperm biology. Comparison of the newly characterized Monarch butterfly (Danaus plexippus) sperm proteome to those of the Carolina sphinx moth (Manduca sexta) and the fruit fly (Drosophila melanogaster) demonstrated conservation at the level of protein abundance and post-translational modification within Lepidoptera. In contrast, comparative genomic analyses across insects reveal significant divergence at two levels that differentiate the genetic architecture of sperm in Lepidoptera from other insects. First, a significant reduction in orthology among Monarch sperm genes relative to the remainder of the genome in non-Lepidopteran insect species was observed. Second, a substantial number of sperm proteins were found to be specific to Lepidoptera, in that they lack detectable homology to the genomes of more distantly related insects. Lastly, the functional importance of Lepidoptera-specific sperm proteins is broadly supported by their increased abundance relative to proteins conserved across insects. Our results identify a burst of genetic novelty amongst sperm proteins that may be associated with the origin of heteromorphic spermatogenesis in ancestral Lepidoptera and/or the subsequent evolution of this system. This pattern of genomic diversification is distinct from the remainder of the genome and thus suggests that this transition has had a marked impact on lepidopteran genome evolution. The identification of abundant sperm proteins unique to Lepidoptera, including proteins distinct between specific lineages, will accelerate future functional studies aiming to understand the developmental origin of dichotomous spermatogenesis and the functional diversification of the fertilization-incompetent apyrene sperm morph.

  5. Characterization of hemizygous deletions in Citrus using array-Comparative Genomic Hybridization and microsynteny comparisons with the poplar genome

    PubMed Central

    Ríos, Gabino; Naranjo, Miguel A; Iglesias, Domingo J; Ruiz-Rivero, Omar; Geraud, Marion; Usach, Antonio; Talón, Manuel

    2008-01-01

    Background: Many fruit-tree species, including relevant Citrus spp. varieties, exhibit a reproductive biology that impairs breeding and strongly constrains genetic improvements. In citrus, juvenility increases the generation time while sexual sterility, inbreeding depression and self-incompatibility prevent the production of homozygous cultivars. Genomic technology may provide citrus researchers with a new set of tools to address these various restrictions. In this work, we report a valuable genomics-based protocol for the structural analysis of deletion mutations on a heterozygous background. Results: Two independent fast neutron mutants of self-incompatible clementine (Citrus clementina Hort. Ex Tan. cv. Clemenules) were the subject of the study. Both mutants, named 39B3 and 39E7, were expected to carry DNA deletions in hemizygous dosage. Array-based Comparative Genomic Hybridization (array-CGH) using a Citrus cDNA microarray allowed the identification of underrepresented genes in these two mutants. Subsequent comparison of citrus deleted genes with annotated plant genomes, especially poplar, made it possible to predict the presence of a large deletion in 39B3 of about 700 kb and at least two deletions of approximately 100 and 500 kb in 39E7. The deletion in 39B3 was further characterized by PCR on available Citrus BACs, which helped us to build a partial physical map of the deletion. Among the deleted genes, a ClpC-like gene coding for a putative subunit of a multifunctional chloroplastic protease involved in the regulation of chlorophyll b synthesis was directly related to the mutated phenotype, since the mutant showed a reduced chlorophyll a/b ratio in green tissues. Conclusion: In this work, we report the use of array-CGH for the successful identification of genes included in a hemizygous deletion induced by fast neutron irradiation on Citrus clementina. The study of gene content and order within the 39B3 deletion also led to the unexpected conclusion that microsynteny and local gene colinearity in this species were higher with Populus trichocarpa than with the phylogenetically closer Arabidopsis thaliana. This work corroborates the potential of Citrus genomic resources to assist mutagenesis-based approaches for functional genetics, structural studies and comparative genomics, and hence to facilitate citrus variety improvement. PMID:18691431

  6. Whole-Genome Comparison Reveals Novel Genetic Elements That Characterize the Genome of Industrial Strains of Saccharomyces cerevisiae

    PubMed Central

    Borneman, Anthony R.; Desany, Brian A.; Riches, David; Affourtit, Jason P.; Forgan, Angus H.; Pretorius, Isak S.; Egholm, Michael; Chambers, Paul J.

    2011-01-01

    Human intervention has subjected the yeast Saccharomyces cerevisiae to multiple rounds of independent domestication and thousands of generations of artificial selection. As a result, this species comprises a genetically diverse collection of natural isolates as well as domesticated strains that are used in specific industrial applications. However the scope of genetic diversity that was captured during the domesticated evolution of the industrial representatives of this important organism remains to be determined. To begin to address this, we have produced whole-genome assemblies of six commercial strains of S. cerevisiae (four wine and two brewing strains). These represent the first genome assemblies produced from S. cerevisiae strains in their industrially-used forms and the first high-quality assemblies for S. cerevisiae strains used in brewing. By comparing these sequences to six existing high-coverage S. cerevisiae genome assemblies, clear signatures were found that defined each industrial class of yeast. This genetic variation was comprised of both single nucleotide polymorphisms and large-scale insertions and deletions, with the latter often being associated with ORF heterogeneity between strains. This included the discovery of more than twenty probable genes that had not been identified previously in the S. cerevisiae genome. Comparison of this large number of S. cerevisiae strains also enabled the characterization of a cluster of five ORFs that have integrated into the genomes of the wine and bioethanol strains on multiple occasions and at diverse genomic locations via what appears to involve the resolution of a circular DNA intermediate. This work suggests that, despite the scrutiny that has been directed at the yeast genome, there remains a significant reservoir of ORFs and novel modes of genetic transmission that may have significant phenotypic impact in this important model and industrial species. PMID:21304888

  7. Reference quality assembly of the 3.5-Gb genome of Capsicum annuum from a single linked-read library.

    PubMed

    Hulse-Kemp, Amanda M; Maheshwari, Shamoni; Stoffel, Kevin; Hill, Theresa A; Jaffe, David; Williams, Stephen R; Weisenfeld, Neil; Ramakrishnan, Srividya; Kumar, Vijay; Shah, Preyas; Schatz, Michael C; Church, Deanna M; Van Deynze, Allen

    2018-01-01

    Linked-Read sequencing technology has recently been employed successfully for de novo assembly of human genomes; however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5-gigabase (Gb) diploid pepper (Capsicum annuum) genome with a single Linked-Read library. Plant genomes, including pepper, are characterized by long, highly similar repetitive sequences. Accordingly, significant effort is used to ensure that the sequenced plant is highly homozygous and the resulting assembly is a haploid consensus. With a phased assembly approach, we targeted a heterozygous F1 derived from a wide cross to assess the ability to derive both haplotypes and characterize a pungency gene with a large insertion/deletion. The Supernova software generated a highly ordered, more contiguous sequence assembly than all currently available C. annuum reference genomes. Over 83% of the final assembly was anchored and oriented using four publicly available de novo linkage maps. A comparison of the annotation of conserved eukaryotic genes indicated the completeness of assembly. The validity of the phased assembly is further demonstrated with the complete recovery of both 2.5-Kb insertion/deletion haplotypes of the PUN1 locus in the F1 sample that represents pungent and nonpungent peppers, as well as nearly full recovery of the BUSCO2 gene set within each of the two haplotypes. The most contiguous pepper genome assembly to date has been generated, which demonstrates that Linked-Read library technology provides a tool to de novo assemble complex, highly repetitive, heterozygous plant genomes. This technology can provide an opportunity to cost-effectively develop high-quality genome assemblies for other complex plants and compare structural and gene differences through accurate haplotype reconstruction.

  8. Comprehensive identification of mutations induced by heavy-ion beam irradiation in Arabidopsis thaliana.

    PubMed

    Hirano, Tomonari; Kazama, Yusuke; Ishii, Kotaro; Ohbu, Sumie; Shirakawa, Yuki; Abe, Tomoko

    2015-04-01

    Heavy-ion beams are widely used for mutation breeding and molecular biology. Although the mutagenic effects of heavy-ion beam irradiation have been characterized by sequence analysis of some restricted chromosomal regions or loci, there have been no evaluations at the whole-genome level or of the detailed genomic rearrangements in the mutant genomes. In this study, using array comparative genomic hybridization (array-CGH) and resequencing, we comprehensively characterized the mutations in Arabidopsis thaliana genomes irradiated with Ar or Fe ions. We subsequently used this information to investigate the mutagenic effects of the heavy-ion beams. Array-CGH demonstrated that the average number of deleted areas per genome was 1.9 and 3.7 following Ar-ion and Fe-ion irradiation, respectively, with deletion sizes ranging from 149 to 602,180 bp; 81% of the deletions were accompanied by genomic rearrangements. To provide a further detailed analysis, the genomes of the mutants induced by Ar-ion beam irradiation were resequenced, and total mutations, including base substitutions, duplications, in/dels, inversions, and translocations, were detected using three algorithms. All three resequenced mutants had genomic rearrangements. Of the 22 DNA fragments that contributed to the rearrangements, 19 fragments were responsible for the intrachromosomal rearrangements, and multiple rearrangements were formed in the localized regions of the chromosomes. The interchromosomal rearrangements were detected in the multiply rearranged regions. These results indicate that the heavy-ion beams led to clustered DNA damage in the chromosome, and that they have great potential to induce complicated intrachromosomal rearrangements. Heavy-ion beams will prove useful as unique mutagens for plant breeding and the establishment of mutant lines. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.

  9. NKp44 expression, phylogenesis and function in non-human primate NK cells

    PubMed Central

    De Maria, Andrea; Ugolotti, Elisabetta; Rutjens, Erik; Mazza, Stefania; Radic, Luana; Faravelli, Alessandro; Koopman, Gerrit; Di Marco, Eddi; Costa, Paola; Ensoli, Barbara; Cafaro, Aurelio; Mingari, Maria Cristina; Moretta, Lorenzo; Heeney, Jonathan

    2009-01-01

    Molecular and functional characterization of the natural cytotoxicity receptor (NCR) NKp44 in species other than Homo sapiens has been elusive, so far. Here, we provide complete phenotypic, molecular and functional characterization for NKp44 triggering receptor on Pan troglodytes NK cells, the closest human relative, and the analysis of NKp44-genomic locus and transcription in Macaca fascicularis. Similar to H. sapiens, NKp44 expression is detectable on chimpanzee NK cells only upon activation. However, basal NKp44 transcription is 5-fold higher in chimpanzees with lower differential increases upon cell activation compared with humans. Upon activation, an overall 12-fold lower NKp44 gene expression is observed in P. troglodytes compared with H. sapiens NK cells with only a slight reduction in NKp44 surface expression. Functional analysis of ‘in vitro’ activated purified NK cells confirms the NKp44 triggering potential compared with other major NCRs. These findings suggest the presence of a post-transcriptional regulation that evolved differently in H. sapiens. Analysis of cynomolgus NKp44-genomic sequence and transcription pattern showed very low levels of transcription with occurrence of out-of-frame transcripts and no surface expression. The present comparative analysis suggests that NKp44-genomic organization appears during macaque speciation, with considerable evolution of its transcriptional and post-transcriptional tuning. Thus, NKp44 may represent an NCR being only recently emerged during speciation, acquiring functional relevance only in non-human primates closest to H. sapiens. PMID:19147838

  10. Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

    PubMed

    Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

    2011-01-01

    Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.

  11. Comparative Genomics of the Genus Porphyromonas Identifies Adaptations for Heme Synthesis within the Prevalent Canine Oral Species Porphyromonas cangingivalis.

    PubMed

    O'Flynn, Ciaran; Deusch, Oliver; Darling, Aaron E; Eisen, Jonathan A; Wallis, Corrin; Davis, Ian J; Harris, Stephen J

    2015-11-13

    Porphyromonads play an important role in human periodontal disease and recently have been shown to be highly prevalent in canine mouths. Porphyromonas cangingivalis is the most prevalent canine oral bacterial species in both plaque from healthy gingiva and plaque from dogs with early periodontitis. The ability of P. cangingivalis to flourish in the different environmental conditions characterized by these two states suggests a degree of metabolic flexibility. To characterize the genes responsible for this, the genomes of 32 isolates (including 18 newly sequenced and assembled) from 18 Porphyromonad species from dogs, humans, and other mammals were compared. Phylogenetic trees inferred using core genes largely matched previous findings; however, comparative genomic analysis identified several genes and pathways relating to heme synthesis that were present in P. cangingivalis but not in other Porphyromonads. Porphyromonas cangingivalis has a complete protoporphyrin IX synthesis pathway potentially allowing it to synthesize its own heme unlike pathogenic Porphyromonads such as Porphyromonas gingivalis that acquire heme predominantly from blood. Other pathway differences such as the ability to synthesize siroheme and vitamin B12 point to enhanced metabolic flexibility for P. cangingivalis, which may underlie its prevalence in the canine oral cavity. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  12. The somatic genomic landscape of chromophobe renal cell carcinoma

    PubMed Central

    Davis, Caleb F.; Ricketts, Christopher; Wang, Min; Yang, Lixing; Cherniack, Andrew D.; Shen, Hui; Buhay, Christian; Kang, Hyojin; Kim, Sang Cheol; Fahey, Catherine C.; Hacker, Kathryn E.; Bhanot, Gyan; Gordenin, Dmitry A.; Chu, Andy; Gunaratne, Preethi H.; Biehl, Michael; Seth, Sahil; Kaipparettu, Benny A.; Bristow, Christopher A.; Donehower, Lawrence A.; Wallen, Eric M.; Smith, Angela B.; Tickoo, Satish K.; Tamboli, Pheroze; Reuter, Victor; Schmidt, Laura S.; Hsieh, James J.; Choueiri, Toni K.; Hakimi, A. Ari; Chin, Lynda; Meyerson, Matthew; Kucherlapati, Raju; Park, Woong-Yang; Robertson, A. Gordon; Laird, Peter W.; Henske, Elizabeth P.; Kwiatkowski, David J.; Park, Peter J.; Morgan, Margaret; Shuch, Brian; Muzny, Donna; Wheeler, David A.; Linehan, W. Marston; Gibbs, Richard A.; Rathmell, W. Kimryn; Creighton, Chad J.

    2014-01-01

    Summary: We describe the landscape of somatic genomic alterations of 66 chromophobe renal cell carcinomas (ChRCCs) based on multidimensional and comprehensive characterization, including mitochondrial DNA (mtDNA) and whole genome sequencing. The results are consistent with ChRCC originating from the distal nephron, in contrast to other kidney cancers with more proximal origins. Combined mtDNA and gene expression analysis implicates changes in mitochondrial function as a component of the disease biology, while suggesting alternative roles for mtDNA mutations in cancers relying on oxidative phosphorylation. Genomic rearrangements lead to recurrent structural breakpoints within the TERT promoter region, which correlate with highly elevated TERT expression and manifestation of kataegis, representing a mechanism of TERT up-regulation in cancer distinct from previously observed amplifications and point mutations. PMID:25155756

  13. The somatic genomic landscape of chromophobe renal cell carcinoma.

    PubMed

    Davis, Caleb F; Ricketts, Christopher J; Wang, Min; Yang, Lixing; Cherniack, Andrew D; Shen, Hui; Buhay, Christian; Kang, Hyojin; Kim, Sang Cheol; Fahey, Catherine C; Hacker, Kathryn E; Bhanot, Gyan; Gordenin, Dmitry A; Chu, Andy; Gunaratne, Preethi H; Biehl, Michael; Seth, Sahil; Kaipparettu, Benny A; Bristow, Christopher A; Donehower, Lawrence A; Wallen, Eric M; Smith, Angela B; Tickoo, Satish K; Tamboli, Pheroze; Reuter, Victor; Schmidt, Laura S; Hsieh, James J; Choueiri, Toni K; Hakimi, A Ari; Chin, Lynda; Meyerson, Matthew; Kucherlapati, Raju; Park, Woong-Yang; Robertson, A Gordon; Laird, Peter W; Henske, Elizabeth P; Kwiatkowski, David J; Park, Peter J; Morgan, Margaret; Shuch, Brian; Muzny, Donna; Wheeler, David A; Linehan, W Marston; Gibbs, Richard A; Rathmell, W Kimryn; Creighton, Chad J

    2014-09-08

    We describe the landscape of somatic genomic alterations of 66 chromophobe renal cell carcinomas (ChRCCs) on the basis of multidimensional and comprehensive characterization, including mtDNA and whole-genome sequencing. The results are consistent with ChRCC originating from the distal nephron, in contrast to other kidney cancers with more proximal origins. Combined mtDNA and gene expression analysis implicates changes in mitochondrial function as a component of the disease biology, while suggesting alternative roles for mtDNA mutations in cancers relying on oxidative phosphorylation. Genomic rearrangements lead to recurrent structural breakpoints within the TERT promoter region, which correlate with highly elevated TERT expression and manifestation of kataegis, representing a mechanism of TERT upregulation in cancer distinct from previously observed amplifications and point mutations. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. Genome Analysis of Streptococcus pyogenes Associated with Pharyngitis and Skin Infections

    PubMed Central

    Ibrahim, Joe; Eisen, Jonathan A.; Jospin, Guillaume; Coil, David A.; Khazen, Georges

    2016-01-01

    Streptococcus pyogenes is a very important human pathogen, commonly associated with skin or throat infections but can also cause life-threatening situations including sepsis, streptococcal toxic shock syndrome, and necrotizing fasciitis. Various studies involving typing and molecular characterization of S. pyogenes have been published to date; however next-generation sequencing (NGS) studies provide a comprehensive collection of an organism’s genetic variation. In this study, the genomes of nine S. pyogenes isolates associated with pharyngitis and skin infection were sequenced and studied for the presence of virulence genes, resistance elements, prophages, genomic recombination, and other genomic features. Additionally, a comparative phylogenetic analysis of the isolates with global clones highlighted their possible evolutionary lineage and their site of infection. The genomes were found to also house a multitude of features including gene regulation systems, virulence factors and antimicrobial resistance mechanisms. PMID:27977735

  15. Development and characterization of a complete set of Triticum aestivum-Roegneria ciliaris disomic addition lines.

    PubMed

    Kong, Lingna; Song, Xinying; Xiao, Jin; Sun, Haojie; Dai, Keli; Lan, Caixia; Singh, Pawan; Yuan, Chunxia; Zhang, Shouzhong; Singh, Ravi; Wang, Haiyan; Wang, Xiue

    2018-05-31

    A complete set of wheat-R. ciliaris disomic addition lines (DALs) was characterized, and the homoeologous groups and genome affinities of R. ciliaris chromosomes were determined. Wild relatives are rich gene resources for cultivated wheat. The development of alien addition chromosome lines not only greatly broadens the genetic diversity, but also provides genetic stocks for comparative genomics studies. Roegneria ciliaris (genome ScScYcYc), a tetraploid wild relative of wheat, is tolerant or resistant to many abiotic and biotic stresses. To develop a complete set of wheat-R. ciliaris disomic addition lines (DALs), we undertook a euplasmic backcrossing program to overcome allocytoplasmic effects and preferential chromosome transmission. To improve the efficiency of identifying chromosomes from Sc and Yc, we established techniques including sequential genomic in situ hybridization/fluorescence in situ hybridization (FISH) and molecular marker analysis. Fourteen DALs of wheat, each containing one pair of R. ciliaris chromosomes, were characterized by FISH using four repetitive sequences [pTa794, pTa71, RcAfa and (GAA)10] as probes. One hundred and sixty-two R. ciliaris-specific markers were developed. FISH and marker analysis enabled us to assign the homoeologous groups and genome affinities of R. ciliaris chromosomes. FHB resistance evaluation in five successive growing seasons showed that the amphiploid, DA2Yc, DA5Yc and DA6Sc had improved FHB resistance, indicating their potential value in wheat improvement. The 14 DALs are likely new gene resources and will be phenotyped for more agronomic performance traits.

  16. Newly discovered young CORE-SINEs in marsupial genomes.

    PubMed

    Munemasa, Maruo; Nikaido, Masato; Nishihara, Hidenori; Donnellan, Stephen; Austin, Christopher C; Okada, Norihiro

    2008-01-15

    Although recent mammalian genome projects have uncovered a large part of genomic component of various groups, several repetitive sequences still remain to be characterized and classified for particular groups. The short interspersed repetitive elements (SINEs) distributed among marsupial genomes are one example. We have identified and characterized two new SINEs from marsupial genomes that belong to the CORE-SINE family, characterized by a highly conserved "CORE" domain. PCR and genomic dot blot analyses revealed that the distribution of each SINE shows distinct patterns among the marsupial genomes, implying different timing of their retroposition during the evolution of marsupials. The members of Mar3 (Marsupialia 3) SINE are distributed throughout the genomes of all marsupials, whereas the Mac1 (Macropodoidea 1) SINE is distributed specifically in the genomes of kangaroos. Sequence alignment of the Mar3 SINEs revealed that they can be further divided into four subgroups, each of which has diagnostic nucleotides. The insertion patterns of each SINE at particular genomic loci, together with the distribution patterns of each SINE, suggest that the Mar3 SINEs have intensively amplified after the radiation of diprotodontians, whereas the Mac1 SINE has amplified only slightly after the divergence of hypsiprimnodons from other macropods. By compiling the information of CORE-SINEs characterized to date, we propose a comprehensive picture of how SINE evolution occurred in the genomes of marsupials.

  17. Characterization of Genome-Wide Variation in Four-Row Wax, a Waxy Maize Landrace with a Reduced Kernel Row Phenotype

    PubMed Central

    Liu, Hanmei; Wang, Xuewen; Wei, Bin; Wang, Yongbin; Liu, Yinghong; Zhang, Junjie; Hu, Yufeng; Yu, Guowu; Li, Jian; Xu, Zhanbin; Huang, Yubi

    2016-01-01

    In southwest China, some maize landraces have long been isolated geographically, and have phenotypes that differ from those of widely grown cultivars. These landraces may harbor rich genetic variation responsible for those phenotypes. Four-row Wax is one such landrace, with four rows of kernels on the cob. We resequenced the genome of Four-row Wax, obtaining 50.46 Gb sequence at 21.87× coverage, then identified and characterized 3,252,194 SNPs, 213,181 short InDels (1–5 bp) and 39,631 structural variations (greater than 5 bp). Of those, 312,511 (9.6%) SNPs were novel compared to the most detailed haplotype map (HapMap) SNP database of maize. Characterization of variations in reported kernel row number (KRN) related genes and KRN QTL regions revealed potential causal mutations in fea2, td1, kn1, and te1. Genome-wide comparisons revealed abundant genetic variations in Four-row Wax, which may be associated with environmental adaptation. The sequence and SNP variations described here enrich genetic resources of maize, and provide guidance into study of seed numbers for crop yield improvement. PMID:27242868
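
The two ratios quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the function names are illustrative and not taken from the paper, and the "implied genome size" is simply total sequence divided by reported coverage, which is an assumption about how the coverage figure was derived.

```python
# Figures quoted in the Four-row Wax abstract.
TOTAL_SNPS = 3_252_194
NOVEL_SNPS = 312_511
SEQUENCE_GB = 50.46   # total sequence obtained, in Gb
COVERAGE = 21.87      # reported fold coverage


def novel_fraction(novel: int, total: int) -> float:
    """Share of SNPs that were novel relative to the HapMap database."""
    return novel / total


def implied_genome_size_gb(sequence_gb: float, coverage: float) -> float:
    """Genome size implied by total sequence / coverage (assumed relation)."""
    return sequence_gb / coverage


if __name__ == "__main__":
    print(f"novel SNP share: {novel_fraction(NOVEL_SNPS, TOTAL_SNPS):.1%}")
    print(f"implied genome size: "
          f"{implied_genome_size_gb(SEQUENCE_GB, COVERAGE):.2f} Gb")
```

The novel-SNP share rounds to the 9.6% given in the abstract, and the implied genome size comes out at roughly 2.3 Gb.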

  18. Genome sequences and comparative genomics of two Lactobacillus ruminis strains from the bovine and human intestinal tracts

    PubMed Central

    2011-01-01

    Background: The genus Lactobacillus is characterized by an extraordinary degree of phenotypic and genotypic diversity, which recent genomic analyses have further highlighted. However, the choice of species for sequencing has been non-random and unequal in distribution, with only a single representative genome from the L. salivarius clade available to date. Furthermore, there is no data to facilitate a functional genomic analysis of motility in the lactobacilli, a trait that is restricted to the L. salivarius clade. Results: The 2.06 Mb genome of the bovine isolate Lactobacillus ruminis ATCC 27782 comprises a single circular chromosome, and has a G+C content of 44.4%. In silico analysis identified 1901 coding sequences, including genes for a pediocin-like bacteriocin, a single large exopolysaccharide-related cluster, two sortase enzymes, two CRISPR loci and numerous IS elements and pseudogenes. A cluster of genes related to a putative pilin was identified, and shown to be transcribed in vitro. A high-quality draft assembly of the genome of a second L. ruminis strain, ATCC 25644, isolated from humans, suggested a slightly larger genome of 2.138 Mb that exhibited a high degree of synteny with the ATCC 27782 genome. In contrast, comparative analysis of L. ruminis and L. salivarius identified a lack of long-range synteny between these closely related species. Comparison of the L. salivarius clade core proteins with those of nine other Lactobacillus species distributed across 4 major phylogenetic groups identified the set of shared proteins, and proteins unique to each group. Conclusions: The genome of L. ruminis provides a comparative tool for directing functional analyses of other members of the L. salivarius clade, and it increases understanding of the divergence of this distinct Lactobacillus lineage from other commensal lactobacilli. The genome sequence provides a definitive resource to facilitate investigation of the genetics, biochemistry and host interactions of these motile intestinal lactobacilli. PMID:21995554

  19. The complete mitochondrial genome of the green lizard Lacerta viridis viridis (Reptilia: Lacertidae) and its phylogenetic position within squamate reptiles.

    PubMed

    Böhme, M U; Fritzsch, G; Tippmann, A; Schlegel, M; Berendonk, T U

    2007-06-01

    For the first time the complete mitochondrial genome was sequenced for a member of Lacertidae. Lacerta viridis viridis was sequenced in order to compare the phylogenetic relationships of this family to other reptilian lineages. Using the long-polymerase chain reaction (long PCR) we characterized a mitochondrial genome, 17,156 bp long showing a typical vertebrate pattern with 13 protein coding genes, 22 transfer RNAs (tRNA), two ribosomal RNAs (rRNA) and one major noncoding region. The noncoding region of L. v. viridis was characterized by a conspicuous 35 bp tandem repeat at its 5' terminus. A phylogenetic study including all currently available squamate mitochondrial sequences demonstrates the position of Lacertidae within a monophyletic squamate group. We obtained a narrow relationship of Lacertidae to Scincidae, Iguanidae, Varanidae, Anguidae, and Cordylidae. Although, the internal relationships within this group yielded only a weak resolution and low bootstrap support, the revealed relationships were more congruent with morphological studies than with recent molecular analyses.

  20. Comparative genomics of wild type yeast strains unveils important genome diversity

    PubMed Central

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

    2008-01-01

    Background: Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results: In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion: We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome. PMID:18983662

  1. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE: The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this needs to be experimentally characterized with ecologically relevant phenotype properties. This study justifies the need to sequence multiple isolates, especially from P. fluorescens group in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.

  2. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    DOE PAGES

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; ...

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE: The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this needs to be experimentally characterized with ecologically relevant phenotype properties. This study justifies the need to sequence multiple isolates, especially from P. fluorescens group in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.

  3. Comparative genomics of Enterococcus faecalis from healthy Norwegian infants

    PubMed Central

    Solheim, Margrete; Aakra, Ågot; Snipen, Lars G; Brede, Dag A; Nes, Ingolf F

    2009-01-01

    Background Enterococcus faecalis, traditionally considered a harmless commensal of the intestinal tract, is now ranked among the leading causes of nosocomial infections. In an attempt to gain insight into the genetic make-up of commensal E. faecalis, we have studied genomic variation in a collection of community-derived E. faecalis isolated from the feces of Norwegian infants. Results The E. faecalis isolates were first sequence typed by multilocus sequence typing (MLST) and characterized with respect to antibiotic resistance and properties associated with virulence. A subset of the isolates was compared to the vancomycin resistant strain E. faecalis V583 (V583) by whole genome microarray comparison (comparative genomic hybridization (CGH)). Several of the putative enterococcal virulence factors were found to be highly prevalent among the commensal baby isolates. The genomic variation as observed by CGH was smaller between isolates displaying the same MLST sequence type than between isolates belonging to different evolutionary lineages. Conclusion The variations in gene content observed among the investigated commensal E. faecalis are comparable to the genetic variation previously reported among strains of various origins thought to be representative of the major E. faecalis lineages. Previous MLST analyses of E. faecalis have identified so-called high-risk enterococcal clonal complexes (HiRECC), defined as genetically distinct subpopulations, epidemiologically associated with enterococcal infections. The observed correlation between CGH and MLST presented here may offer a method for the identification of lineage-specific genes, and may therefore add clues on how to distinguish pathogenic from commensal E. faecalis. In this work, information on the core genome of E. faecalis is also substantially extended. PMID:19393078

  4. Microsatellites in the Genome of the Edible Mushroom, Volvariella volvacea

    PubMed Central

    Chen, Mingjie; Wang, Hong; Bao, Dapeng

    2014-01-01

    Using bioinformatics software and database, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mono-nucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions with 21 No./Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. Furthermore, the short motifs (mono- to tri-nucleotides) outnumbered other categories although these differed in proportion. Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes. PMID:24575404

  5. Genomic features of bacterial adaptation to plants

    DOE PAGES

    Levy, Asaf; Salas Gonzalez, Isai; Mittelviefhaus, Maximilian; ...

    2017-12-18

    Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. In this study, we sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe–microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. In conclusion, this work expands the genome-based understanding of plant–microbe interactions and provides potential leads for efficient and sustainable agriculture through microbiome engineering.

  6. Microsatellites in the genome of the edible mushroom, Volvariella volvacea.

    PubMed

    Wang, Ying; Chen, Mingjie; Wang, Hong; Wang, Jing-Fang; Bao, Dapeng

    2014-01-01

    Using bioinformatics software and database, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mono-nucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions with 21 No./Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. Furthermore, the short motifs (mono- to tri-nucleotides) outnumbered other categories although these differed in proportion. Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes.

  7. Genomic Features and Niche-Adaptation of Enterococcus faecium Strains from Korean Soybean-Fermented Foods.

    PubMed

    Kim, Eun Bae; Jin, Gwi-Deuk; Lee, Jun-Yeong; Choi, Yun-Jaie

    2016-01-01

    Certain strains of Enterococcus faecium contribute beneficially to human health and food fermentation. However, other E. faecium strains are opportunistic pathogens due to the acquisition of virulence factors and antibiotic resistance determinants. To characterize E. faecium from soybean fermentation, we sequenced the genomes of 10 E. faecium strains from Korean soybean-fermented foods and analyzed their genomes by comparing them with 51 clinical and 52 non-clinical strains of different origins. Hierarchical clustering based on 13,820 orthologous genes from all E. faecium genomes showed that the 10 strains are distinguished from most of the clinical strains. Like non-clinical strains, their genomes are significantly smaller than clinical strains due to fewer accessory genes associated with antibiotic resistance, virulence, and mobile genetic elements. Moreover, we identified niche-associated gene gain and loss from the soybean strains. Thus, we conclude that soybean E. faecium strains might have evolved to have distinctive genomic features that may contribute to its ability to thrive during soybean fermentation.

  8. Genomic Features and Niche-Adaptation of Enterococcus faecium Strains from Korean Soybean-Fermented Foods

    PubMed Central

    Kim, Eun Bae; Jin, Gwi-Deuk; Lee, Jun-Yeong; Choi, Yun-Jaie

    2016-01-01

    Certain strains of Enterococcus faecium contribute beneficially to human health and food fermentation. However, other E. faecium strains are opportunistic pathogens due to the acquisition of virulence factors and antibiotic resistance determinants. To characterize E. faecium from soybean fermentation, we sequenced the genomes of 10 E. faecium strains from Korean soybean-fermented foods and analyzed their genomes by comparing them with 51 clinical and 52 non-clinical strains of different origins. Hierarchical clustering based on 13,820 orthologous genes from all E. faecium genomes showed that the 10 strains are distinguished from most of the clinical strains. Like non-clinical strains, their genomes are significantly smaller than clinical strains due to fewer accessory genes associated with antibiotic resistance, virulence, and mobile genetic elements. Moreover, we identified niche-associated gene gain and loss from the soybean strains. Thus, we conclude that soybean E. faecium strains might have evolved to have distinctive genomic features that may contribute to its ability to thrive during soybean fermentation. PMID:27070419

  9. Two fundamentally different classes of microbial genes.

    PubMed

    Wolf, Yuri I; Makarova, Kira S; Lobkovsky, Alexander E; Koonin, Eugene V

    2016-11-07

    The evolution of bacterial and archaeal genomes is highly dynamic and involves extensive horizontal gene transfer and gene loss [1-4]. Furthermore, many microbial species appear to have open pangenomes, where each newly sequenced genome contains more than 10% ORFans, that is, genes without detectable homologues in other species [5,6]. Here, we report a quantitative analysis of microbial genome evolution by fitting the parameters of a simple, steady-state evolutionary model to the comparative genomic data on the gene content and gene order similarity between archaeal genomes. The results reveal two sharply distinct classes of microbial genes, one of which is characterized by effectively instantaneous gene replacement, and the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of the size of the prokaryotic genomic universe, which appears to consist of at least a billion distinct genes. Furthermore, the same distribution of constraints is shown to govern the evolution of gene complement and gene order, without the need to invoke long-range conservation or the selfish operon concept [7].

  10. Genomic features of bacterial adaptation to plants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Levy, Asaf; Salas Gonzalez, Isai; Mittelviefhaus, Maximilian

    Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. In this study, we sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe–microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. In conclusion, this work expands the genome-based understanding of plant–microbe interactions and provides potential leads for efficient and sustainable agriculture through microbiome engineering.

  11. The genome landscape of indigenous African cattle.

    PubMed

    Kim, Jaemin; Hanotte, Olivier; Mwai, Okeyo Ally; Dessie, Tadelle; Bashir, Salim; Diallo, Boubacar; Agaba, Morris; Kim, Kwondo; Kwak, Woori; Sung, Samsun; Seo, Minseok; Jeong, Hyeonsoo; Kwon, Taehyung; Taye, Mengistie; Song, Ki-Duk; Lim, Dajeong; Cho, Seoae; Lee, Hyun-Jeong; Yoon, Duhak; Oh, Sung Jong; Kemp, Stephen; Lee, Hak-Kyo; Kim, Heebal

    2017-02-20

    The history of African indigenous cattle and their adaptation to environmental and human selection pressure is at the root of their remarkable diversity. Characterization of this diversity is an essential step towards understanding the genomic basis of productivity and adaptation to survival under African farming systems. We analyze patterns of African cattle genetic variation by sequencing 48 genomes from five indigenous populations and comparing them to the genomes of 53 commercial taurine breeds. We find the highest genetic diversity among African zebu and sanga cattle. Our search for genomic regions under selection reveals signatures of selection for environmental adaptive traits. In particular, we identify signatures of selection including genes and/or pathways controlling anemia and feeding behavior in the trypanotolerant N'Dama, coat color and horn development in Ankole, and heat tolerance and tick resistance across African cattle especially in zebu breeds. Our findings unravel at the genome-wide level, the unique adaptive diversity of African cattle while emphasizing the opportunities for sustainable improvement of livestock productivity on the continent.

  12. Characterizing the developmental transcriptome of the oriental fruit fly, Bactrocera dorsalis (Diptera: Tephritidae) through comparative genomic analysis with Drosophila melanogaster utilizing modENCODE datasets.

    PubMed

    Geib, Scott M; Calla, Bernarda; Hall, Brian; Hou, Shaobin; Manoukis, Nicholas C

    2014-10-28

    The oriental fruit fly, Bactrocera dorsalis, is an important pest of fruit and vegetable crops throughout Asia, and is considered a high risk pest for establishment in the mainland United States. It is a member of the family Tephritidae, which is the most agriculturally important family of flies, and can be considered an out-group to well-studied members of the family Drosophilidae. Despite their importance as pests and their relatedness to Drosophila, little information is present on B. dorsalis transcripts and proteins. The objective of this paper is to comprehensively characterize the transcripts present throughout the life history of B. dorsalis and functionally annotate and analyse these transcripts relative to the presence, expression, and function of orthologous sequences present in Drosophila melanogaster. We present a detailed transcriptome assembly of B. dorsalis from egg through adult stages containing 20,666 transcripts across 10,799 unigene components. Utilizing data available through Flybase and the modENCODE project, we compared expression patterns of these transcripts to putative orthologs in D. melanogaster in terms of timing, abundance, and function. In addition, temporal expression patterns in B. dorsalis were characterized between stages, to establish the constitutive or stage-specific expression patterns of particular transcripts. A fully annotated transcriptome assembly is made available through NCBI, in addition to corresponding expression data. Through characterizing the transcriptome of B. dorsalis through its life history and comparing the transcriptome of B. dorsalis to the model organism D. melanogaster, a database has been developed that can be used as the foundation for functional genomic research in Bactrocera flies and help identify orthologous genes between B. dorsalis and D. melanogaster. These data provide the foundation for future functional genomic research that will focus on improving our understanding of the physiology and biology of this species at the molecular level. This knowledge can also be applied towards developing improved methods for control, survey, and eradication of this important pest.

  13. DNA Asymmetric Strand Bias Affects the Amino Acid Composition of Mitochondrial Proteins

    PubMed Central

    Min, Xiang Jia; Hickey, Donal A.

    2007-01-01

    Abstract Variations in GC content between genomes have been extensively documented. Genomes with comparable GC contents can, however, still differ in the apportionment of the G and C nucleotides between the two DNA strands. This asymmetric strand bias is known as GC skew. Here, we have investigated the impact of differences in nucleotide skew on the amino acid composition of the encoded proteins. We compared orthologous genes between animal mitochondrial genomes that show large differences in GC and AT skews. Specifically, we compared the mitochondrial genomes of mammals, which are characterized by a negative GC skew and a positive AT skew, to those of flatworms, which show the opposite skews for both GC and AT base pairs. We found that the mammalian proteins are highly enriched in amino acids encoded by CA-rich codons (as predicted by their negative GC and positive AT skews), whereas their flatworm orthologs were enriched in amino acids encoded by GT-rich codons (also as predicted from their skews). We found that these differences in mitochondrial strand asymmetry (measured as GC and AT skews) can have very large, predictable effects on the composition of the encoded proteins. PMID:17974594
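    The skews discussed in this abstract can be illustrated with a short calculation. The sketch below assumes the conventional definitions, GC skew = (G - C) / (G + C) and AT skew = (A - T) / (A + T), computed on a single strand; the abstract itself does not state the formula, and the function name `nucleotide_skews` is hypothetical.

```python
from collections import Counter


def nucleotide_skews(seq: str) -> dict:
    """Return GC and AT skew for one DNA strand.

    Assumes the conventional definitions (not stated in the abstract):
        GC skew = (G - C) / (G + C)
        AT skew = (A - T) / (A + T)
    Characters other than A, C, G, T are ignored.
    """
    counts = Counter(seq.upper())
    g, c, a, t = (counts[base] for base in "GCAT")
    # Guard against division by zero when a base pair is absent.
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    return {"gc_skew": gc_skew, "at_skew": at_skew}


if __name__ == "__main__":
    # Toy strand with more C than G and more A than T.
    print(nucleotide_skews("CCCAAAGT"))
```

    Under these definitions, a negative GC skew combined with a positive AT skew means the strand carries more C than G and more A than T, which matches the abstract's description of mammalian mitochondrial proteins being enriched in amino acids encoded by CA-rich codons.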

  14. Extensive Copy-Number Variation of Young Genes across Stickleback Populations

    PubMed Central

    Eizaguirre, Christophe; Samonte, Irene E.; Kalbe, Martin; Lenz, Tobias L.; Stoll, Monika; Bornberg-Bauer, Erich; Milinski, Manfred; Reusch, Thorsten B. H.

    2014-01-01

    Duplicate genes emerge as copy-number variations (CNVs) at the population level, and remain copy-number polymorphic until they are fixed or lost. The successful establishment of such structural polymorphisms in the genome plays an important role in evolution by promoting genetic diversity, complexity and innovation. To characterize the early evolutionary stages of duplicate genes and their potential adaptive benefits, we combine comparative genomics with population genomics analyses to evaluate the distribution and impact of CNVs across natural populations of an eco-genomic model, the three-spined stickleback. With whole genome sequences of 66 individuals from populations inhabiting three distinct habitats, we find that CNVs generally occur at low frequencies and are often only found in one of the 11 populations surveyed. A subset of CNVs, however, displays copy-number differentiation between populations, showing elevated within-population frequencies consistent with local adaptation. By comparing teleost genomes to identify lineage-specific genes and duplications in sticklebacks, we highlight rampant gene content differences among individuals in which over 30% of young duplicate genes are CNVs. These CNV genes are evolving rapidly at the molecular level and are enriched with functional categories associated with environmental interactions, depicting the dynamic early copy-number polymorphic stage of genes during population differentiation. PMID:25474574

  15. Detecting microsatellites within genomes: significant variation among algorithms.

    PubMed

    Leclercq, Sébastien; Rivals, Eric; Jarne, Philippe

    2007-04-18

    Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameters settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should therefore be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions.
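    The abstract's point that user-defined parameters shape the detected microsatellite distribution can be seen even in a toy detector. The sketch below is not any of the five compared tools (TRF, Mreps, Sputnik, STAR, RepeatMasker); it is a minimal regular-expression scan for perfect repeats only, and the function name and the `max_motif` and `min_length` parameters are assumptions introduced for illustration.

```python
import re


def find_perfect_microsatellites(seq, max_motif=6, min_length=12):
    """Find perfect tandem repeats with motif length 1..max_motif.

    Returns a sorted list of (start, end, motif) tuples with 0-based,
    end-exclusive coordinates. Only repeats spanning at least
    `min_length` bp are reported. Imperfect repeats are not detected.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-mer followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated
            # (e.g. "ACAC" is really the dinucleotide "AC").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            if m.end() - m.start() >= min_length:
                hits.append((m.start(), m.end(), motif))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGG" + "AC" * 8 + "TTGCA" + "A" * 10 + "CG"
    # The same sequence yields different results under different thresholds.
    print(find_perfect_microsatellites(toy, min_length=12))
    print(find_perfect_microsatellites(toy, min_length=10))
```

    Lowering `min_length` from 12 to 10 adds an 11 bp poly-A run to the results for the same input, a small-scale version of the sensitivity for short microsatellites (below 20 bp) that the abstract reports.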

  16. Detecting microsatellites within genomes: significant variation among algorithms

    PubMed Central

    Leclercq, Sébastien; Rivals, Eric; Jarne, Philippe

    2007-01-01

    Background Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Results Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameters settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Conclusion Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should therefore be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions. PMID:17442102

  17. Comparative Genomics of Burkholderia singularis sp. nov., a Low G+C Content, Free-Living Bacterium That Defies Taxonomic Dissection of the Genus Burkholderia

    PubMed Central

    Vandamme, Peter; Peeters, Charlotte; De Smet, Birgit; Price, Erin P.; Sarovich, Derek S.; Henry, Deborah A.; Hird, Trevor J.; Zlosnik, James E. A.; Mayo, Mark; Warner, Jeffrey; Baker, Anthony; Currie, Bart J.; Carlier, Aurélien

    2017-01-01

    Four Burkholderia pseudomallei-like isolates of human clinical origin were examined by a polyphasic taxonomic approach that included comparative whole genome analyses. The results demonstrated that these isolates represent a rare and unusual, novel Burkholderia species for which we propose the name B. singularis. The type strain is LMG 28154T (=CCUG 65685T). Its genome sequence has an average mol% G+C content of 64.34%, which is considerably lower than that of other Burkholderia species. The reduced G+C content of strain LMG 28154T was characterized by a genome wide AT bias that was not due to reduced GC-biased gene conversion or reductive genome evolution, but might have been caused by an altered DNA base excision repair pathway. B. singularis can be differentiated from other Burkholderia species by multilocus sequence analysis, MALDI-TOF mass spectrometry and a distinctive biochemical profile that includes the absence of nitrate reduction, a mucoid appearance on Columbia sheep blood agar, and a slowly positive oxidase reaction. Comparisons with publicly available whole genome sequences demonstrated that strain TSV85, an Australian water isolate, also represents the same species and therefore, to date, B. singularis has been recovered from human or environmental samples on three continents. PMID:28932212

  18. Error baseline rates of five sample preparation methods used to characterize RNA virus populations.

    PubMed

    Kugelman, Jeffrey R; Wiley, Michael R; Nagle, Elyse R; Reyes, Daniel; Pfeffer, Brad P; Kuhn, Jens H; Sanchez-Lockhart, Mariano; Palacios, Gustavo F

    2017-01-01

    Individual RNA viruses typically occur as populations of genomes that differ slightly from each other due to mutations introduced by the error-prone viral polymerase. Understanding the variability of RNA virus genome populations is critical for understanding virus evolution because individual mutant genomes may gain evolutionary selective advantages and give rise to dominant subpopulations, possibly even leading to the emergence of viruses resistant to medical countermeasures. Reverse transcription of virus genome populations followed by next-generation sequencing is the only available method to characterize variation for RNA viruses. However, both steps may lead to the introduction of artificial mutations, thereby skewing the data. To better understand how such errors are introduced during sample preparation, we determined and compared error baseline rates of five different sample preparation methods by analyzing in vitro transcribed Ebola virus RNA from an artificial plasmid-based system. These methods included: shotgun sequencing from plasmid DNA or in vitro transcribed RNA as a basic "no amplification" method, amplicon sequencing from the plasmid DNA or in vitro transcribed RNA as a "targeted" amplification method, sequence-independent single-primer amplification (SISPA) as a "random" amplification method, rolling circle reverse transcription sequencing (CirSeq) as an advanced "no amplification" method, and Illumina TruSeq RNA Access as a "targeted" enrichment method. The measured error frequencies indicate that RNA Access offers the best tradeoff between sensitivity and sample preparation error (1.4-5) of all compared methods.

  19. Comparative Genomics Analysis and Phenotypic Characterization of Shewanella putrefaciens W3-18-1: Anaerobic Respiration, Bacterial Microcompartments, and Lateral Flagella

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Qiu, D.; Tu, Q.; He, Zhili

    2010-05-17

    Respiratory versatility and psychrophily are the hallmarks of Shewanella. The ability to utilize a wide range of electron acceptors for respiration is due to the large number of c-type cytochrome genes present in the genome of Shewanella strains. More recently the dissimilatory metal reduction of Shewanella species has been extensively and intensively studied for potential applications in the bioremediation of radioactive wastes of groundwater and subsurface environments. Multiple Shewanella genome sequences are now available in the public databases (Fredrickson et al., 2008). Most of the sequenced Shewanella strains were isolated from marine environments and this genus was believed to be of marine origin (Hau and Gralnick, 2007). However, the well-characterized model strain, S. oneidensis MR-1, was isolated from the freshwater lake sediment of Lake Oneida, New York (Myers and Nealson, 1988) and similar bacteria have also been isolated from other freshwater environments (Venkateswaran et al., 1999). Here we comparatively analyzed the genome sequence and physiological characteristics of S. putrefaciens W3-18-1 and S. oneidensis MR-1, isolated from the marine and freshwater lake sediments, respectively. The anaerobic respirations, carbon source utilization, and cell motility have been experimentally investigated. Large scale horizontal gene transfers have been revealed and the genetic divergence between these two strains was considered to be critical to the bacterial adaptation to specific habitats, freshwater or marine sediments.

  20. Error baseline rates of five sample preparation methods used to characterize RNA virus populations

    PubMed Central

    Kugelman, Jeffrey R.; Wiley, Michael R.; Nagle, Elyse R.; Reyes, Daniel; Pfeffer, Brad P.; Kuhn, Jens H.; Sanchez-Lockhart, Mariano; Palacios, Gustavo F.

    2017-01-01

    Individual RNA viruses typically occur as populations of genomes that differ slightly from each other due to mutations introduced by the error-prone viral polymerase. Understanding the variability of RNA virus genome populations is critical for understanding virus evolution because individual mutant genomes may gain evolutionary selective advantages and give rise to dominant subpopulations, possibly even leading to the emergence of viruses resistant to medical countermeasures. Reverse transcription of virus genome populations followed by next-generation sequencing is the only available method to characterize variation for RNA viruses. However, both steps may lead to the introduction of artificial mutations, thereby skewing the data. To better understand how such errors are introduced during sample preparation, we determined and compared error baseline rates of five different sample preparation methods by analyzing in vitro transcribed Ebola virus RNA from an artificial plasmid-based system. These methods included: shotgun sequencing from plasmid DNA or in vitro transcribed RNA as a basic “no amplification” method, amplicon sequencing from the plasmid DNA or in vitro transcribed RNA as a “targeted” amplification method, sequence-independent single-primer amplification (SISPA) as a “random” amplification method, rolling circle reverse transcription sequencing (CirSeq) as an advanced “no amplification” method, and Illumina TruSeq RNA Access as a “targeted” enrichment method. The measured error frequencies indicate that RNA Access offers the best tradeoff between sensitivity and sample preparation error (1.4−5) of all compared methods. PMID:28182717

  1. Characterization of a novel Lactobacillus species closely related to Lactobacillus johnsonii using a combination of molecular and comparative genomics methods.

    PubMed

    Sarmiento-Rubiano, Luz-Adriana; Berger, Bernard; Moine, Déborah; Zúñiga, Manuel; Pérez-Martínez, Gaspar; Yebra, María J

    2010-09-17

    Comparative genomic hybridization (CGH) constitutes a powerful tool for identification and characterization of bacterial strains. In this study we have applied this technique for the characterization of a number of Lactobacillus strains isolated from the intestinal content of rats fed with a diet supplemented with sorbitol. Phylogenetic analysis based on 16S rRNA gene, recA, pheS, pyrG and tuf sequences identified five bacterial strains isolated from the intestinal content of rats as belonging to the recently described Lactobacillus taiwanensis species. DNA-DNA hybridization experiments confirmed that these five strains are distinct from but closely related to Lactobacillus johnsonii and Lactobacillus gasseri. A whole genome DNA microarray designed for the probiotic L. johnsonii strain NCC533 was used for CGH analysis of L. johnsonii ATCC 33200T, L. johnsonii BL261, L. gasseri ATCC 33323T and L. taiwanensis BL263. In these experiments, the fluorescence ratio distributions obtained with L. taiwanensis and L. gasseri showed characteristic inter-species profiles. The percentage of conserved L. johnsonii NCC533 genes was about 83% in the L. johnsonii strain comparisons and decreased to 51% and 47% for L. taiwanensis and L. gasseri, respectively. These results confirmed the separate status of L. taiwanensis from L. johnsonii at the level of species, and also that L. taiwanensis is closer to L. johnsonii than L. gasseri is to L. johnsonii. Conventional taxonomic analyses and microarray-based CGH analysis have been used for the identification and characterization of the newly described species L. taiwanensis. The microarray-based CGH technology has been shown to be a remarkable tool for the identification and fine discrimination between phylogenetically close species, and additionally provided insight into the adaptation of the strain L. taiwanensis BL263 to its ecological niche.

  2. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

    PubMed Central

    2012-01-01

    Background The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using next generation sequencing approach. Results The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave ways for bulk production of environment friendly biopesticides. PMID:22958331

  3. Social insect genomes exhibit dramatic evolution in gene composition and regulation while preserving regulatory features linked to sociality

    PubMed Central

    Simola, Daniel F.; Wissler, Lothar; Donahue, Greg; Waterhouse, Robert M.; Helmkampf, Martin; Roux, Julien; Nygaard, Sanne; Glastad, Karl M.; Hagen, Darren E.; Viljakainen, Lumi; Reese, Justin T.; Hunt, Brendan G.; Graur, Dan; Elhaik, Eran; Kriventseva, Evgenia V.; Wen, Jiayu; Parker, Brian J.; Cash, Elizabeth; Privman, Eyal; Childers, Christopher P.; Muñoz-Torres, Monica C.; Boomsma, Jacobus J.; Bornberg-Bauer, Erich; Currie, Cameron R.; Elsik, Christine G.; Suen, Garret; Goodisman, Michael A.D.; Keller, Laurent; Liebig, Jürgen; Rawls, Alan; Reinberg, Danny; Smith, Chris D.; Smith, Chris R.; Tsutsui, Neil; Wurm, Yannick; Zdobnov, Evgeny M.; Berger, Shelley L.; Gadau, Jürgen

    2013-01-01

    Genomes of eusocial insects code for dramatic examples of phenotypic plasticity and social organization. We compared the genomes of seven ants, the honeybee, and various solitary insects to examine whether eusocial lineages share distinct features of genomic organization. Each ant lineage contains ∼4000 novel genes, but only 64 of these genes are conserved among all seven ants. Many gene families have been expanded in ants, notably those involved in chemical communication (e.g., desaturases and odorant receptors). Alignment of the ant genomes revealed reduced purifying selection compared with Drosophila without significantly reduced synteny. Correspondingly, ant genomes exhibit dramatic divergence of noncoding regulatory elements; however, extant conserved regions are enriched for novel noncoding RNAs and transcription factor–binding sites. Comparison of orthologous gene promoters between eusocial and solitary species revealed significant regulatory evolution in both cis (e.g., Creb) and trans (e.g., fork head) for nearly 2000 genes, many of which exhibit phenotypic plasticity. Our results emphasize that genomic changes can occur remarkably fast in ants, because two recently diverged leaf-cutter ant species exhibit faster accumulation of species-specific genes and greater divergence in regulatory elements compared with other ants or Drosophila. Thus, while the “socio-genomes” of ants and the honeybee are broadly characterized by a pervasive pattern of divergence in gene composition and regulation, they preserve lineage-specific regulatory features linked to eusociality. We propose that changes in gene regulation played a key role in the origins of insect eusociality, whereas changes in gene composition were more relevant for lineage-specific eusocial adaptations. PMID:23636946

  4. A ddRAD-based genetic map and its integration with the genome assembly of Japanese eel (Anguilla japonica) provides insights into genome evolution after the teleost-specific genome duplication

    PubMed Central

    2014-01-01

    Background Recent advancements in next-generation sequencing technology have enabled cost-effective sequencing of whole or partial genomes, permitting the discovery and characterization of molecular polymorphisms. Double-digest restriction-site associated DNA sequencing (ddRAD-seq) is a powerful and inexpensive approach to developing numerous single nucleotide polymorphism (SNP) markers and constructing a high-density genetic map. To enrich genomic resources for Japanese eel (Anguilla japonica), we constructed a ddRAD-based genetic map using an Ion Torrent Personal Genome Machine and anchored scaffolds of the current genome assembly to 19 linkage groups of the Japanese eel. Furthermore, we compared the Japanese eel genome with genomes of model fishes to infer the history of genome evolution after the teleost-specific genome duplication. Results We generated the ddRAD-based linkage map of the Japanese eel, where the maps for female and male spanned 1748.8 cM and 1294.5 cM, respectively, and were arranged into 19 linkage groups. A total of 2,672 SNP markers and 115 Simple Sequence Repeat markers provide anchor points to 1,252 scaffolds covering 151 Mb (13%) of the current genome assembly of the Japanese eel. Comparisons among the Japanese eel, medaka, zebrafish and spotted gar genomes showed highly conserved synteny among teleosts and revealed part of the eight major chromosomal rearrangement events that occurred soon after the teleost-specific genome duplication. Conclusions The ddRAD-seq approach combined with the Ion Torrent Personal Genome Machine sequencing allowed us to conduct efficient and flexible SNP genotyping. The integration of the genetic map and the assembled sequence provides a valuable resource for fine mapping and positional cloning of quantitative trait loci associated with economically important traits and for investigating comparative genomics of the Japanese eel. PMID:24669946

  5. A ddRAD-based genetic map and its integration with the genome assembly of Japanese eel (Anguilla japonica) provides insights into genome evolution after the teleost-specific genome duplication.

    PubMed

    Kai, Wataru; Nomura, Kazuharu; Fujiwara, Atushi; Nakamura, Yoji; Yasuike, Motoshige; Ojima, Nobuhiko; Masaoka, Tetsuji; Ozaki, Akiyuki; Kazeto, Yukinori; Gen, Koichiro; Nagao, Jiro; Tanaka, Hideki; Kobayashi, Takanori; Ototake, Mitsuru

    2014-03-26

    Recent advancements in next-generation sequencing technology have enabled cost-effective sequencing of whole or partial genomes, permitting the discovery and characterization of molecular polymorphisms. Double-digest restriction-site associated DNA sequencing (ddRAD-seq) is a powerful and inexpensive approach to developing numerous single nucleotide polymorphism (SNP) markers and constructing a high-density genetic map. To enrich genomic resources for Japanese eel (Anguilla japonica), we constructed a ddRAD-based genetic map using an Ion Torrent Personal Genome Machine and anchored scaffolds of the current genome assembly to 19 linkage groups of the Japanese eel. Furthermore, we compared the Japanese eel genome with genomes of model fishes to infer the history of genome evolution after the teleost-specific genome duplication. We generated the ddRAD-based linkage map of the Japanese eel, where the maps for female and male spanned 1748.8 cM and 1294.5 cM, respectively, and were arranged into 19 linkage groups. A total of 2,672 SNP markers and 115 Simple Sequence Repeat markers provide anchor points to 1,252 scaffolds covering 151 Mb (13%) of the current genome assembly of the Japanese eel. Comparisons among the Japanese eel, medaka, zebrafish and spotted gar genomes showed highly conserved synteny among teleosts and revealed part of the eight major chromosomal rearrangement events that occurred soon after the teleost-specific genome duplication. The ddRAD-seq approach combined with the Ion Torrent Personal Genome Machine sequencing allowed us to conduct efficient and flexible SNP genotyping. The integration of the genetic map and the assembled sequence provides a valuable resource for fine mapping and positional cloning of quantitative trait loci associated with economically important traits and for investigating comparative genomics of the Japanese eel.

  6. Pseudomonas caspiana sp. nov., a citrus pathogen in the Pseudomonas syringae phylogenetic group.

    PubMed

    Busquets, Antonio; Gomila, Margarita; Beiki, Farid; Mulet, Magdalena; Rahimian, Heshmat; García-Valdés, Elena; Lalucat, Jorge

    2017-07-01

    In a screening by multilocus sequence analysis of Pseudomonas strains isolated from diverse origins, 4 phylogenetically closely related strains (FBF58, FBF102T, FBF103, and FBF122) formed a well-defined cluster in the Pseudomonas syringae phylogenetic group. The strains were isolated from citrus orchards in northern Iran showing disease symptoms in the leaves and stems, and their pathogenicity against citrus plants was demonstrated. The whole genome of the type strain of the proposed new species (FBF102T = CECT 9164T = CCUG 69273T) was sequenced and characterized. Comparative genomics with the 14 known Pseudomonas species type strains of the P. syringae phylogenetic group demonstrated that this strain belonged to a new genomic species, different from the species described thus far. Genome analysis detected genes predicted to be involved in pathogenesis, such as an atypical type 3 secretion system and two type 6 secretion systems, together with effectors and virulence factors. A polyphasic taxonomic characterization demonstrated that the 4 plant pathogenic strains represented a new species, for which the name Pseudomonas caspiana sp. nov. is proposed. Copyright © 2017 Elsevier GmbH. All rights reserved.

  7. Genome characterization of a novel binary toxin-positive strain of Clostridium difficile and comparison with the epidemic 027 and 078 strains.

    PubMed

    Peng, Zhong; Liu, Sidi; Meng, Xiujuan; Liang, Wan; Xu, Zhuofei; Tang, Biao; Wang, Yuanguo; Duan, Juping; Fu, Chenchao; Wu, Bin; Wu, Anhua; Li, Chunhui

    2017-01-01

    Clostridium difficile is an anaerobic Gram-positive spore-forming gut pathogen that causes antibiotic-associated diarrhea worldwide. A small number of C. difficile strains express the binary toxin (CDT), which is generally found in C. difficile 027 (ST1) and/or 078 (ST11) in clinical settings. However, we isolated a binary toxin-positive non-027, non-078 C. difficile LC693 that is associated with severe diarrhea in China. The genotype of this strain was determined as ST201. To understand the basis of pathogenesis of C. difficile ST201, the strain LC693 was chosen for whole genome sequencing, and its genome sequence was analyzed together with the other two ST201 strains VL-0104 and VL-0391 and compared to the epidemic 027/ST1 and 078/ST11 strains. The project finally generated an estimated genome size of approximately 4.07 Mbp for strain LC693. Genome size of the three ST201 strains ranged from 4.07 to 4.16 Mb, with an average GC content between 28.5 and 28.9%. Phylogenetic analysis demonstrated that the ST201 strains belonged to clade 3. The ST201 genomes contained more than 40 antibiotic resistance genes and 15 of them were predicted to be associated with vancomycin-resistance. The ST201 strains contained a larger PaLoc with a Tn6218 element inserted than the 027/ST1 and 078/ST11 strains, and encoded a truncated TcdC. In addition, the ST201 strains contained intact binary toxin coding and regulation genes which are highly homologous to the 027/ST1 strain. Genome comparison of the ST201 strains with the epidemic 027 and 078 strains identified 641 genes specific for C. difficile ST201, and a number of them were predicted as fitness and virulence associated genes. The presence of those genes also contributes to the pathogenesis of the ST201 strains. In this study, the genomic characterization of three binary toxin-positive C. difficile ST201 strains in clade 3 was discussed and compared to the genomes of the epidemic 027 and the 078 strains. Our analysis identified a number of fitness and virulence associated genes/loci in the ST201 genomes that contribute to the pathogenesis of C. difficile ST201.

  8. Analysis of infant isolates of Bifidobacterium breve by comparative genome hybridization indicates the existence of new subspecies with marked infant specificity.

    PubMed

    Boesten, Rolf; Schuren, Frank; Wind, Richèle D; Knol, Jan; de Vos, Willem M

    2011-09-01

    A total of 20 Bifidobacterium strains were isolated from fecal samples of 4 breast- and bottle-fed infants and all were characterized as Bifidobacterium breve based on 16S rRNA gene sequence and metabolic analysis. These isolates were further characterized and compared to the type strains of B. breve and 7 other Bifidobacterium spp. by comparative genome hybridization. For this purpose, we constructed and used a DNA-based microarray containing over 2000 randomly cloned DNA fragments from B. breve type strain LMG13208. This molecular analysis revealed a high degree of genomic variation between the isolated strains and allowed the vast majority to be grouped into 4 clusters. One cluster contained a single isolate that was virtually indistinguishable from the B. breve type strain. The 3 other clusters included 19 B. breve strains that differed considerably from all type strains. Remarkably, each of the 4 clusters included strains that were isolated from a single infant, indicating that a niche adaptation may contribute to variation within the B. breve species. Based on genomic hybridization data, the new B. breve isolates were estimated to contain approximately 60-90% of the genes of the B. breve type strain, attesting to the existence of various subspecies within the species B. breve. Further bioinformatic analysis identified several hundred diagnostic clones specific to the genomic clustering of the B. breve isolates. Molecular analysis of representatives of these revealed that annotated genes from the conserved B. breve core encoded mainly housekeeping functions, while the strain-specific genes were predicted to code for functions related to life style, such as carbohydrate metabolism and transport. This is compatible with genetic adaptation of the strains to their niche, a combination of infants and diet. Copyright © 2011 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  9. Sequencing and comparative analyses of the genomes of zoysiagrasses

    PubMed Central

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-01-01

    Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession ‘Nagirizaki’ (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella ‘Wakaba’ and Z. pacifica ‘Zanpa’ were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica ‘Kyoto’, Z. japonica ‘Miyagi’ and Z. matrella ‘Chiba Fair Green’, were accumulated, and aligned against the reference genome of ‘Nagirizaki’ along with those from ‘Wakaba’ and ‘Zanpa’. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the ‘Zoysia Genome Database’ at http://zoysia.kazusa.or.jp. PMID:26975196

  10. Sequencing and comparative analyses of the genomes of zoysiagrasses.

    PubMed

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-04-01

    Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession 'Nagirizaki' (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella 'Wakaba' and Z. pacifica 'Zanpa' were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica 'Kyoto', Z. japonica 'Miyagi' and Z. matrella 'Chiba Fair Green', were accumulated, and aligned against the reference genome of 'Nagirizaki' along with those from 'Wakaba' and 'Zanpa'. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the 'Zoysia Genome Database' at http://zoysia.kazusa.or.jp. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  11. Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia

    PubMed Central

    Hou, Shaobin; Makarova, Kira S; Saw, Jimmy HW; Senin, Pavel; Ly, Benjamin V; Zhou, Zhemin; Ren, Yan; Wang, Jianmei; Galperin, Michael Y; Omelchenko, Marina V; Wolf, Yuri I; Yutin, Natalya; Koonin, Eugene V; Stott, Matthew B; Mountain, Bruce W; Crowe, Michelle A; Smirnova, Angela V; Dunfield, Peter F; Feng, Lu; Wang, Lei; Alam, Maqsudul

    2008-01-01

    Background The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and human gastrointestinal tract, only a few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. Results We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum) is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely acidic conditions including a major upward shift in the isoelectric points of proteins. Conclusion The results of genome analysis of M. infernorum support the monophyly of the PVC superphylum. M. infernorum possesses a streamlined genome but seems to have acquired numerous genes including those for enzymes of methylotrophic pathways via horizontal gene transfer, in particular, from Proteobacteria. Reviewers This article was reviewed by John A. Fuerst, Ludmila Chistoserdova, and Radhey S. Gupta. PMID:18593465

  12. Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia.

    PubMed

    Hou, Shaobin; Makarova, Kira S; Saw, Jimmy H W; Senin, Pavel; Ly, Benjamin V; Zhou, Zhemin; Ren, Yan; Wang, Jianmei; Galperin, Michael Y; Omelchenko, Marina V; Wolf, Yuri I; Yutin, Natalya; Koonin, Eugene V; Stott, Matthew B; Mountain, Bruce W; Crowe, Michelle A; Smirnova, Angela V; Dunfield, Peter F; Feng, Lu; Wang, Lei; Alam, Maqsudul

    2008-07-01

    The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and human gastrointestinal tract, only a few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum) is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely acidic conditions including a major upward shift in the isoelectric points of proteins. The results of genome analysis of M. infernorum support the monophyly of the PVC superphylum. M. infernorum possesses a streamlined genome but seems to have acquired numerous genes including those for enzymes of methylotrophic pathways via horizontal gene transfer, in particular, from Proteobacteria. This article was reviewed by John A. Fuerst, Ludmila Chistoserdova, and Radhey S. Gupta.

  13. Characterization of the complete chloroplast genome of Platycarya strobilacea (Juglandaceae)

    Treesearch

    Jing Yan; Kai Han; Shuyun Zeng; Peng Zhao; Keith Woeste; Jianfang Li; Zhan-Lin Liu

    2017-01-01

    The whole chloroplast genome (cp genome) sequence of Platycarya strobilacea was characterized from Illumina pair-end sequencing data. The complete cp genome was 160,994 bp in length and contained a large single copy region (LSC) of 90,225 bp and a small single copy region (SSC) of 18,371 bp, which were separated by a pair of inverted repeat regions...

  14. The Essential Genome of Escherichia coli K-12

    PubMed Central

    2018-01-01

    Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657

  15. Pan-genome analysis of human gastric pathogen H. pylori: comparative genomics and pathogenomics approaches to identify regions associated with pathogenicity and prediction of potential core therapeutic targets.

    PubMed

    Ali, Amjad; Naz, Anam; Soares, Siomar C; Bakhtiar, Marriam; Tiwari, Sandeep; Hassan, Syed S; Hanan, Fazal; Ramos, Rommel; Pereira, Ulisses; Barh, Debmalya; Figueiredo, Henrique César Pereira; Ussery, David W; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2015-01-01

    Helicobacter pylori is a human gastric pathogen implicated as the major cause of peptic ulcer and second leading cause of gastric cancer (~70%) around the world. Conversely, an increased resistance to antibiotics and hindrances in the development of vaccines against H. pylori are observed. Pan-genome analyses of the global representative H. pylori isolates consisting of 39 complete genomes are presented in this paper. Phylogenetic analyses have revealed close relationships among geographically diverse strains of H. pylori. The conservation among these genomes was further analyzed by a pan-genome approach; the predicted conserved gene families (1,193) constitute ~77% of the average H. pylori genome and 45% of the global gene repertoire of the species. Reverse vaccinology strategies have been adopted to identify and narrow down the potential core-immunogenic candidates. A total of 28 nonhost homolog proteins were characterized as universal therapeutic targets against H. pylori based on their functional annotation and protein-protein interaction. Finally, pathogenomics and genome plasticity analysis revealed 3 highly conserved and 2 highly variable putative pathogenicity islands in all of the H. pylori genomes analyzed.

  16. Draft genome sequence of the silver pomfret fish, Pampus argenteus.

    PubMed

    AlMomin, Sabah; Kumar, Vinod; Al-Amad, Sami; Al-Hussaini, Mohsen; Dashti, Talal; Al-Enezi, Khaznah; Akbar, Abrar

    2016-01-01

    Silver pomfret, Pampus argenteus, is a fish species from coastal waters. Despite its high commercial value, this edible fish has not been sequenced. Hence, its genetic and genomic studies have been limited. We report the first draft genome sequence of the silver pomfret obtained using a Next Generation Sequencing (NGS) technology. We assembled 38.7 Gb of nucleotides into scaffolds of 350 Mb with N50 of about 1.5 kb, using high quality paired end reads. These scaffolds represent 63.7% of the estimated silver pomfret genome length. The newly sequenced and assembled genome has 11.06% repetitive DNA regions, and this percentage is comparable to that of the tilapia genome. The genome analysis predicted 16 322 genes. About 91% of these genes showed homology with known proteins. Many gene clusters were annotated to protein and fatty-acid metabolism pathways that may be important in the context of the meat texture and immune system developmental processes. The reference genome can pave the way for the identification of many other genomic features that could improve breeding and population-management strategies, and it can also help characterize the genetic diversity of P. argenteus.

  17. Metagenomic Analysis of Therapeutic PYO Phage Cocktails from 1997 to 2014

    PubMed Central

    Larsen, Mette Voldby

    2017-01-01

    Phage therapy has regained interest in recent years due to the alarming spread of antibiotic resistance. Whilst phage cocktails are commonly sold in pharmacies in countries such as Georgia and Russia, this is not the case in western countries due to western regulatory agencies requiring a thorough characterization of the drug. Here, DNA sequencing of constituent biological entities constitutes a first step. The pyophage (PYO) cocktail is one of the main commercial products of the Georgian Eliava Institute of Bacteriophage, Microbiology and Virology and is used to cure skin infections. Since its first production in the 1930s, the composition of the cocktail has been periodically modified to add phages effective against emerging pathogenic strains. In this paper, we compared the composition of three PYO cocktails from 1997 (PYO97), 2000 (PYO2000) and 2014 (PYO2014). Based on next generation sequencing, de novo assembly and binning of contigs into draft genomes based on tetranucleotide distance, thirty and twenty-nine phage draft genomes were predicted in PYO97 and PYO2014, respectively. Of these, thirteen and fifteen shared high similarity to known phages. Eleven draft genomes were found to be common in the two cocktails. One of these showed no similarity to publicly available phage genomes. Representatives of phages targeting E. faecalis, E. faecium, E. coli, Proteus, P. aeruginosa and S. aureus were found in both cocktails. Finally, we estimated larger overlap of the PYO2000 cocktail to PYO97 compared to PYO2014. Using next generation sequencing and metagenomics analysis, we were able to characterize and compare the content of PYO cocktails separated by 17 years in time. Even though the cocktail composition is upgraded every six months, we found it to remain relatively stable over the years. PMID:29099783

  18. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes.

    PubMed

    Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou

    2011-11-01

    Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.

  19. A Trichosporonales genome tree based on 27 haploid and three evolutionarily conserved 'natural' hybrid genomes.

    PubMed

    Takashima, Masako; Sriswasdi, Sira; Manabe, Ri-Ichiroh; Ohkuma, Moriya; Sugita, Takashi; Iwasaki, Wataru

    2018-01-01

    To construct a backbone tree consisting of basidiomycetous yeasts, draft genome sequences from 25 species of Trichosporonales (Tremellomycetes, Basidiomycota) were generated. In addition to the hybrid genomes of Trichosporon coremiiforme and Trichosporon ovoides that we described previously, we identified an interspecies hybrid genome in Cutaneotrichosporon mucoides (formerly Trichosporon mucoides). This hybrid genome had a gene retention rate of ~55%, and its closest haploid relative was Cutaneotrichosporon dermatis. After constructing the C. mucoides subgenomes, we generated a phylogenetic tree using genome data from the 27 haploid species and the subgenome data from the three hybrid genome species. It was a high-quality tree with 100% bootstrap support for all of the branches. The genome-based tree provided superior resolution compared with previous multi-gene analyses. Although our backbone tree does not include all Trichosporonales genera (e.g. Cryptotrichosporon), it will be valuable for future analyses of genome data. Interest in interspecies hybrid fungal genomes has recently increased because they may provide a basis for new technologies. The three Trichosporonales hybrid genomes described in this study are different from well-characterized hybrid genomes (e.g. those of Saccharomyces pastorianus and Saccharomyces bayanus) because these hybridization events probably occurred in the distant evolutionary past. Hence, they will be useful for studying genome stability following hybridization and speciation events. Copyright © 2017 John Wiley & Sons, Ltd.

  20. Characterization of the rainbow trout transcriptome using Sanger and 454-Pyrosequencing approaches

    USDA-ARS's Scientific Manuscript database

    BACKGROUND: Rainbow trout is an important fish species for aquaculture and a model species for research investigations associated with carcinogenesis, comparative immunology, toxicology and evolutionary biology. However, to date there is no genome reference sequence to facilitate the development...

  1. Characterization of the rainbow trout transcriptome using Sanger and 454-pyrosequencing approaches

    USDA-ARS's Scientific Manuscript database

    Background: Rainbow trout is an important fish for aquaculture and recreational fisheries and serves as a model species for research investigations associated with carcinogenesis, comparative immunology, toxicology and evolutionary biology. However, to date there is no genome reference sequence...

  2. Physiological and molecular characterization of Si uptake in wild rice species.

    PubMed

    Mitani-Ueno, Namiki; Ogai, Hisao; Yamaji, Naoki; Ma, Jian Feng

    2014-07-01

    Cultivated rice (Oryza sativa) accumulates a high concentration of silicon (Si), which is required for its high and sustainable production. High Si accumulation in cultivated rice is achieved by high expression of both influx (Lsi1) and efflux (Lsi2) Si transporters in roots. Herein, we physiologically investigated Si uptake and isolated and functionally characterized Si transporters in six wild rice species with different genome types. Si uptake by the roots was lower in Oryza rufipogon, Oryza barthii (AA genome), Oryza australiensis (EE genome) and Oryza punctata (BB genome), but similar in Oryza glumaepatula and Oryza meridionalis (AA genome), compared with the cultivated rice (cv. Nipponbare). However, all wild rice species and the cultivated rice showed similar concentrations of Si in the shoots when grown in a field. All species with the AA genome showed the same amino acid sequence of both Lsi1 and Lsi2 as O. sativa, whereas species with the EE and BB genomes showed several nucleotide differences in both Lsi1 and Lsi2. However, proteins encoded by these genes also showed transport activity for Si in Xenopus oocytes. The mRNA expression of Lsi1 in all wild rice species was lower than that in the cultivated rice, whereas the expression of Lsi2 was lower in O. rufipogon and O. barthii but similar in other species. Cellular localization of Lsi1 and Lsi2 in all wild rice species was similar to that in the cultivated rice. These results indicate that superior Si uptake, an important trait for rice growth, is basically conserved in wild and cultivated rice species. © 2013 Scandinavian Plant Physiology Society.

  3. Comparative Genomics of Listeria Sensu Lato: Genus-Wide Differences in Evolutionary Dynamics and the Progressive Gain of Complex, Potentially Pathogenicity-Related Traits through Lateral Gene Transfer

    PubMed Central

    Chiara, Matteo; Caruso, Marta; D’Erchia, Anna Maria; Manzari, Caterina; Fraccalvieri, Rosa; Goffredo, Elisa; Latorre, Laura; Miccolupo, Angela; Padalino, Iolanda; Santagada, Gianfranco; Chiocco, Doriano; Pesole, Graziano; Horner, David S.; Parisi, Antonio

    2015-01-01

    Historically, genome-wide and molecular characterization of the genus Listeria has concentrated on the important human pathogen Listeria monocytogenes and a small number of closely related species, together termed Listeria sensu strictu. More recently, a number of genome sequences for more basal, and nonpathogenic, members of the Listeria genus have become available, facilitating a wider perspective on the evolution of pathogenicity and genome level evolutionary dynamics within the entire genus (termed Listeria sensu lato). Here, we have sequenced the genomes of additional Listeria fleischmannii and Listeria newyorkensis isolates and explored the dynamics of genome evolution in Listeria sensu lato. Our analyses suggest that acquisition of genetic material through gene duplication and divergence as well as through lateral gene transfer (mostly from outside Listeria) is widespread throughout the genus. Novel genetic material is apparently subject to rapid turnover. Multiple lines of evidence point to significant differences in evolutionary dynamics between the most basal Listeria subclade and all other congeners, including both sensu strictu and other sensu lato isolates. Strikingly, these differences are likely attributable to stochastic, population-level processes and contribute to observed variation in genome size across the genus. Notably, our analyses indicate that the common ancestor of Listeria sensu lato lacked flagella, which were acquired by lateral gene transfer by a common ancestor of Listeria grayi and Listeria sensu strictu, whereas a recently functionally characterized pathogenicity island, responsible for the capacity to produce cobalamin and utilize ethanolamine/propane-2-diol, was acquired in an ancestor of Listeria sensu strictu. PMID:26185097

  4. Isolation, characterization and comparative genomics of bacteriophage SfIV: a novel serotype converting phage from Shigella flexneri

    PubMed Central

    2013-01-01

    Background: Shigella flexneri is the major cause of shigellosis in developing countries. The O-antigen component of the lipopolysaccharide is one of the key virulence determinants required for the pathogenesis of S. flexneri. The glucosyltransferase and/or acetyltransferase genes responsible for the modification of the O-antigen are encoded by temperate serotype converting bacteriophages present in the S. flexneri genome. Several serotype converting phages have previously been isolated and characterized; however, attempts to isolate a serotype converting phage which encodes the modification genes of the serotype 4a strain have not been successful. Results: In this study, a novel temperate serotype converting bacteriophage, SfIV, was isolated. Lysogenisation by phage SfIV converted a serotype Y strain to serotype 4a. Electron microscopy indicated that SfIV belongs to the Myoviridae family. The 39,758 bp genome of phage SfIV encompasses 54 open reading frames (orfs). Protein level comparison of SfIV with other serotype converting phages of S. flexneri revealed that SfIV is similar to phages SfII and SfV. The comparative analysis also revealed that the SfIV phage contained five proteins which were not found in any other phages of S. flexneri. These proteins were: a tail fiber assembly protein, two hypothetical proteins with no clear function, and two other unknown proteins encoded by orfs present on a moron that was presumably introduced into the SfIV genome from another species via a transposon. These unique proteins of SfIV may play a role in the pathogenesis of the host. Conclusions: This study reports the isolation and complete genome sequence analysis of bacteriophage SfIV. The SfIV phage has a host range significantly different from the other phages of Shigella. Comparative genome analysis identified several proteins unique to SfIV, which may potentially be involved in the survival and pathogenesis of its host. These findings will further our understanding of the evolution of these phages, and will also facilitate studies on the development of new phage vectors and therapeutic agents to control infections caused by S. flexneri. PMID:24090466

  5. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.

    2005-02-01

    We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.

  6. Conserved Gene Order and Expanded Inverted Repeats Characterize Plastid Genomes of Thalassiosirales

    PubMed Central

    Ashworth, Matt P.; Baeshen, Nabih A.; Baeshen, Mohammad N.; Bahieldin, Ahmed; Theriot, Edward C.; Jansen, Robert K.

    2014-01-01

    Diatoms are mostly photosynthetic eukaryotes within the heterokont lineage. Variable plastid genome sizes and extensive genome rearrangements have been observed across the diatom phylogeny, but little is known about plastid genome evolution within order- or family-level clades. The Thalassiosirales is one of the more comprehensively studied orders in terms of both genetics and morphology. Seven complete diatom plastid genomes are reported here including four Thalassiosirales: Thalassiosira weissflogii, Roundia cardiophora, Cyclotella sp. WC03_2, Cyclotella sp. L04_2, and three additional non-Thalassiosirales species Chaetoceros simplex, Cerataulina daemon, and Rhizosolenia imbricata. The sizes of the seven genomes vary from 116,459 to 129,498 bp, and their genomes are compact and lack introns. The larger size of the plastid genomes of Thalassiosirales compared to other diatoms is due primarily to expansion of the inverted repeat. Gene content within Thalassiosirales is more conserved compared to other diatom lineages. Gene order within Thalassiosirales is highly conserved except for the extensive genome rearrangement in Thalassiosira oceanica. Cyclotella nana, Thalassiosira weissflogii and Roundia cardiophora share an identical gene order, which is inferred to be the ancestral order for the Thalassiosirales, differing from that of the other two Cyclotella species by a single inversion. The genes ilvB and ilvH are missing in all six diatom plastid genomes except for Cerataulina daemon, suggesting an independent gain of these genes in this species. The acpP1 gene is missing in all Thalassiosirales, suggesting that its loss may be a synapomorphy for the order and this gene may have been functionally transferred to the nucleus. Three genes involved in photosynthesis, psaE, psaI, psaM, are missing in Rhizosolenia imbricata, which represents the first documented instance of the loss of photosynthetic genes in diatom plastid genomes. PMID:25233465

  7. Sequencing and Comparative Genome Analysis of Two Pathogenic Streptococcus gallolyticus Subspecies: Genome Plasticity, Adaptation and Virulence

    PubMed Central

    Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations differ depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS, respectively, were conserved in UCN34. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes making it suited to a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity between the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops. PMID:21633709

  8. Complete genome analysis of highly pathogenic bovine ephemeral fever virus isolated in Turkey in 2012.

    PubMed

    Abayli, Hasan; Tonbak, Sukru; Azkur, Ahmet Kursat; Bulut, Hakan

    2017-10-01

    Relatively high prevalence and mortality rates of bovine ephemeral fever (BEF) have been reported in recent epidemics in some countries, including Turkey, when compared with previous outbreaks. A limited number of complete genome sequences of BEF virus (BEFV) are available in the GenBank Database. In this study, the complete genome of highly pathogenic BEFV isolated during an outbreak in Turkey in 2012 was analyzed for genetic characterization. The complete genome of the Turkish BEFV isolate was amplified by reverse transcription-polymerase chain reaction (RT-PCR) and sequenced. It was found that the complete genome of the Turkish BEFV isolate was 14,901 nt in length. The complete genome sequence obtained from the study showed 91-92% identity at nucleotide level to Australian (BB7721) and Chinese (Bovine/China/Henan1/2012) BEFV isolates. Phylogenetic analysis of the glycoprotein gene of the Turkish BEFV isolate also showed that Turkish isolates were closely related to Israeli isolates. Because of the limited number of complete BEFV genome sequences, the results from this study will be useful for understanding the global molecular epidemiology and geodynamics of BEF.

  9. Automated array-based genomic profiling in chronic lymphocytic leukemia: Development of a clinical tool and discovery of recurrent genomic alterations

    PubMed Central

    Schwaenen, Carsten; Nessling, Michelle; Wessendorf, Swen; Salvi, Tatjana; Wrobel, Gunnar; Radlwimmer, Bernhard; Kestler, Hans A.; Haslinger, Christian; Stilgenbauer, Stephan; Döhner, Hartmut; Bentz, Martin; Lichter, Peter

    2004-01-01

    B cell chronic lymphocytic leukemia (B-CLL) is characterized by a highly variable clinical course. Recurrent chromosomal imbalances provide significant prognostic markers. Risk-adapted therapy based on genomic alterations has become an option that is currently being tested in clinical trials. To supply a robust tool for such large scale studies, we developed a comprehensive DNA microarray dedicated to the automated analysis of recurrent genomic imbalances in B-CLL by array-based comparative genomic hybridization (matrix–CGH). Validation of this chip in a series of 106 B-CLL cases revealed a high specificity and sensitivity that fulfils the criteria for application in clinical oncology. This chip is immediately applicable within clinical B-CLL treatment trials that evaluate whether B-CLL cases with distinct chromosomal abnormalities should be treated with chemotherapy of different intensities and/or stem cell transplantation. Through the control set of DNA fragments equally distributed over the genome, recurrent genomic imbalances were discovered: trisomy of chromosome 19 and gain of the MYCN oncogene correlating with an elevation of MYCN mRNA expression. PMID:14730057

  10. The Genome of Ganoderma lucidum Provides Insights into Triterpene Biosynthesis and Wood Degradation

    PubMed Central

    Huang, Zhuo; Zhang, Hong-Mei; Liu, Wei; Liu, Le; Ma, Junping; Xia, Zhilan; Chen, Yuxin; Chen, Yuewen; Wang, Depeng; Ni, Peixiang; Guo, An-Yuan; Xiong, Xingyao

    2012-01-01

    Background: Ganoderma lucidum (Reishi or Ling Zhi) is one of the most famous Traditional Chinese Medicines and has been widely used in the treatment of various human diseases in Asian countries. It is also a fungus with strong wood degradation ability and potential in bioenergy production. However, the genes, pathways and mechanisms underlying these functions are still unknown. Methodology/Principal Findings: The genome of G. lucidum was sequenced and assembled into a 39.9 megabase (Mb) draft genome, which encoded 12,080 protein-coding genes, ∼83% of which were similar to public sequences. We performed comprehensive annotation of G. lucidum genes and made comparisons with genes in other fungal genomes. Genes in the biosynthesis of the main G. lucidum active ingredients, ganoderic acids (GAs), were characterized. Among the GA synthases, we identified a fusion gene, the N and C termini of which are homologous to two different enzymes. Moreover, the fusion gene was found only in basidiomycetes. As expected for a white rot fungus with wood degradation ability, abundant carbohydrate-active enzymes and ligninolytic enzymes were identified in the G. lucidum genome and were compared with those of other fungi. Conclusions/Significance: The genome sequence and thorough annotation of G. lucidum will provide new insights for functional analyses, including of its medicinal mechanism. The characterization of genes in triterpene biosynthesis and wood degradation will facilitate bio-engineering research in the production of its active ingredients and bioenergy. PMID:22567134

  11. Comparative genome-wide polymorphic microsatellite markers in Antarctic penguins through next generation sequencing

    PubMed Central

    Vianna, Juliana A.; Noll, Daly; Mura-Jornet, Isidora; Valenzuela-Guerra, Paulina; González-Acuña, Daniel; Navarro, Cristell; Loyola, David E.; Dantas, Gisele P. M.

    2017-01-01

    Microsatellites are valuable molecular markers for evolutionary and ecological studies. Next generation sequencing is responsible for the increasing number of microsatellites for non-model species. The penguin genus Pygoscelis comprises three species: Adélie (P. adeliae), Chinstrap (P. antarcticus) and Gentoo penguin (P. papua), all distributed around Antarctica and the sub-Antarctic. The species have been affected differently by climate change, and the use of microsatellite markers will be crucial to monitor population dynamics. We characterized a large set of genome-wide microsatellites and evaluated polymorphisms in all three species. SOLiD reads were generated from the libraries of each species, identifying a large number of microsatellite loci: 33,677, 35,265 and 42,057 for P. adeliae, P. antarcticus and P. papua, respectively. A large number of dinucleotide (66,139), trinucleotide (29,490) and tetranucleotide (11,849) microsatellites are described. Microsatellite abundance, diversity and orthology were characterized in penguin genomes. We evaluated polymorphisms in 170 tetranucleotide loci, obtaining 34 loci polymorphic in at least one species and 15 loci polymorphic in all three species, which allows comparative studies. The polymorphic markers presented here enable a number of ecological, population, individual identification, parentage and evolutionary studies of Pygoscelis, with potential use in other penguin species. PMID:28898354

  12. Comparative molecular dynamics studies of heterozygous open reading frames of DNA polymerase eta (η) in pathogenic yeast Candida albicans

    NASA Astrophysics Data System (ADS)

    Satpati, Suresh; Manohar, Kodavati; Acharya, Narottam; Dixit, Anshuman

    2017-01-01

    Genomic instability in Candida albicans is believed to play a crucial role in fungal pathogenesis. DNA polymerases contribute significantly to stability of any genome. Although Candida Genome database predicts presence of S. cerevisiae DNA polymerase orthologs; functional and structural characterizations of Candida DNA polymerases are still unexplored. DNA polymerase eta (Polη) is unique as it promotes efficient bypass of cyclobutane pyrimidine dimers. Interestingly, C. albicans is heterozygous in carrying two Polη genes and the nucleotide substitutions were found only in the ORFs. As allelic differences often result in functional differences of the encoded proteins, comparative analyses of structural models and molecular dynamic simulations were performed to characterize these orthologs of DNA Polη. Overall structures of both the ORFs remain conserved except subtle differences in the palm and PAD domains. The complementation analysis showed that both the ORFs equally suppressed UV sensitivity of yeast rad30 deletion strain. Our study has predicted two novel molecular interactions, a highly conserved molecular tetrad of salt bridges and a series of π-π interactions spanning from thumb to PAD. This study suggests these ORFs as the homologues of yeast Polη, and due to its heterogeneity in C. albicans they may play a significant role in pathogenicity.

  13. Molecular characterization of the virulent infectious hematopoietic necrosis virus (IHNV) strain 220-90

    PubMed Central

    2010-01-01

    Background: Infectious hematopoietic necrosis virus (IHNV) is the type species of the genus Novirhabdovirus, within the family Rhabdoviridae, infecting several species of wild and hatchery-reared salmonids. Similar to other rhabdoviruses, IHNV has a linear single-stranded, negative-sense RNA genome of approximately 11,000 nucleotides. The IHNV genome encodes six genes: the nucleocapsid, phosphoprotein, matrix protein, glycoprotein, non-virion protein and polymerase protein genes, respectively. This study describes the molecular characterization of the virulent IHNV strain 220-90, belonging to the M genogroup, and its phylogenetic relationships with available sequences of IHNV isolates worldwide. Results: The complete genomic sequence of IHNV strain 220-90 was determined from the DNA of six overlapping clones obtained by RT-PCR amplification of genomic RNA. The complete genome sequence of 220-90 comprises 11,133 nucleotides (GenBank GQ413939) with the gene order of 3'-N-P-M-G-NV-L-5'. These genes are separated by conserved gene junctions, with di-nucleotide gene spacers. An additional uracil nucleotide was found at the end of the 5'-trailer region, which was not reported before in other IHNV strains. The first 15 of the 16 nucleotides at the 3'- and 5'-termini of the genome are complementary, and the first 4 nucleotides at the 3'-end of IHNV are identical to those of other novirhabdoviruses. Sequence homology and phylogenetic analysis of the glycoprotein genes show that the 220-90 strain is 97% identical to most of the IHNV strains. Comparison of the virulent 220-90 genomic sequence with that of the less virulent WRAC isolate shows more than 300 nucleotide changes in the genome, which does not allow one to speculate on putative residues involved in the virulence of IHNV. Conclusion: We have molecularly characterized one of the well-studied IHNV isolates, 220-90 of genogroup M, which is virulent for rainbow trout, and compared its phylogenetic relationship with North American and other strains. Determination of the complete nucleotide sequence is essential for future studies on the pathogenesis of IHNV using a reverse genetics approach and for developing efficient control strategies. PMID:20085652

  14. Characterization, sequencing and comparative genomic analysis of vB_AbaM-IME-AB2, a novel lytic bacteriophage that infects multidrug-resistant Acinetobacter baumannii clinical isolates.

    PubMed

    Peng, Fan; Mi, Zhiqiang; Huang, Yong; Yuan, Xin; Niu, Wenkai; Wang, Yahui; Hua, Yuhui; Fan, Huahao; Bai, Changqing; Tong, Yigang

    2014-07-05

    With the use of broad-spectrum antibiotics, immunosuppressive drugs, and glucocorticoids, multidrug-resistant Acinetobacter baumannii (MDR-AB) has become a major nosocomial pathogen species. The recent renaissance of bacteriophage therapy may provide new treatment strategies for combatting drug-resistant bacterial infections. In this study, a lytic bacteriophage, vB_AbaM-IME-AB2, which has a short latent period and a small burst size and clears its host's suspension quickly, was isolated and selected for characterization and a complete genomic comparative study. The isolated bacteriophage vB_AbaM-IME-AB2 has an icosahedral head and displays morphology resembling that of the Myoviridae family. Gel separation assays showed that the phage particle contains at least nine protein bands with molecular weights ranging from 15 to 100 kDa. vB_AbaM-IME-AB2 could adsorb to its host cells in 9 min with an adsorption rate of more than 99% and showed a short latent period (20 min) and a small burst size (62 pfu/cell). It could form clear plaques in the double-layer assay and clear its host's suspension in just 4 hours. The whole genome of vB_AbaM-IME-AB2 was sequenced and annotated, and the results showed that its genome is a double-stranded DNA molecule consisting of 43,665 nucleotides. The genome has a G + C content of 37.5% and 82 putative coding sequences (CDSs). We compared the characteristics and complete genome sequences of all known Acinetobacter baumannii bacteriophages. Only three Acinetobacter baumannii phages have been sequenced: AB1, AP22, and phiAC-1, which have relatively high similarity, with coverage of 65%, 50%, and 8%, respectively, when compared with our phage vB_AbaM-IME-AB2. A nucleotide alignment of the four Acinetobacter baumannii phages showed that some CDSs are similar, with no significant rearrangements observed. Yet some sections of these phages are nonhomologous. vB_AbaM-IME-AB2 was a novel and unique A. baumannii bacteriophage. These findings suggest a common ancestry and microbial diversity and evolution. A clear understanding of its characteristics and genes is conducive to the treatment of multidrug-resistant A. baumannii in the future.
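
    The G + C content reported for the phage genome is the percentage of bases that are guanine or cytosine. As a minimal sketch (the function name `gc_content` and the toy sequences are illustrative assumptions, not data from the study), it could be computed from a nucleotide string like this:

```python
def gc_content(seq):
    """Return the G + C content of seq as a percentage of its unambiguous bases.

    Hypothetical helper for illustration; ambiguous symbols such as N are ignored.
    """
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / unambiguous


# Toy sequence: one G among eight bases.
print(gc_content("AATTAATG"))
```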

  15. QTLomics in Soybean: A Way Forward for Translational Genomics and Breeding

    PubMed Central

    Kumawat, Giriraj; Gupta, Sanjay; Ratnaparkhe, Milind B.; Maranna, Shivakumar; Satpute, Gyanesh K.

    2016-01-01

    Food legumes play an important role in attaining both food and nutritional security along with sustainable agricultural production for the well-being of humans globally. The various traits of economic importance in legume crops are complex and quantitative in nature, and are governed by quantitative trait loci (QTLs). Mapping of quantitative traits is a tedious and costly process; however, a large number of QTLs have been mapped in soybean for various traits, although their utilization in breeding programmes is poorly reported. For their effective use in breeding programmes it is imperative to narrow down the confidence interval of QTLs, to identify the underlying genes, and, most importantly, to characterize the alleles of these genes in order to identify superior variants. In the field of functional genomics, especially in the identification and characterization of genes responsible for quantitative traits, soybean is far ahead of other legume crops. The availability of genic information about quantitative traits is more significant because it is easier and more effective to identify homologs than to identify shared syntenic regions in other crop species. In soybean, genes underlying QTLs have been identified and functionally characterized for phosphorus efficiency, flowering and maturity, pod dehiscence, hard-seededness, α-tocopherol content, soybean cyst nematode, sudden death syndrome, and salt tolerance. Candidate genes have also been identified for many other quantitative traits for which functional validation is required. Using the sequence information of identified genes from soybean, comparative genomic analysis of homologs in other legume crops could discover novel structural variants and useful alleles for functional marker development. The functional markers may be very useful for molecular breeding in soybean and for harnessing the benefit of translational research from soybean to other leguminous crops. Thus, soybean can act as a model crop for translational genomics and breeding of quantitative traits in legume crops. In this review, we summarize the current status of identification and characterization of genes underlying QTLs for various quantitative traits in soybean and their significance in translational genomics and breeding of other legume crops. PMID:28066449

  16. Cold adaptive traits revealed by comparative genomic analysis of the eurypsychrophile Rhodococcus sp. JG3 isolated from high elevation McMurdo Dry Valley permafrost, Antarctica.

    PubMed

    Goordial, Jacqueline; Raymond-Bouchard, Isabelle; Zolotarov, Yevgen; de Bethencourt, Luis; Ronholm, Jennifer; Shapiro, Nicole; Woyke, Tanja; Stromvik, Martina; Greer, Charles W; Bakermans, Corien; Whyte, Lyle

    2016-02-01

    The permafrost soils of the high elevation McMurdo Dry Valleys are the coldest, most desiccating and most oligotrophic on Earth. Rhodococcus sp. JG3 is one of very few bacterial isolates from Antarctic Dry Valley permafrost, and displays subzero growth down to -5°C. To understand how Rhodococcus sp. JG3 is able to survive extreme permafrost conditions and be metabolically active at subzero temperatures, we sequenced its genome and compared it to the genomes of 14 mesophilic rhodococci. Rhodococcus sp. JG3 possessed a higher copy number of genes for general stress response, UV protection and protection from cold shock, osmotic stress and oxidative stress. We characterized genome-wide molecular adaptations to cold, and identified genes that had amino acid compositions favourable for increased flexibility and functionality at low temperatures. Rhodococcus sp. JG3 possesses multiple complementary strategies which may enable its survival in some of the harshest permafrost on Earth. © FEMS 2015.

  17. The Genome Sequence of the psychrophilic archaeon, Methanococcoides burtonii: the Role of Genome Evolution in Cold-adaptation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Allen, Michelle A.; Lauro, Federico M.; Williams, Timothy J.

    2009-04-01

    Psychrophilic archaea are abundant and perform critical roles throughout the Earth's expansive cold biosphere. Here we report the first complete genome sequence for a psychrophilic methanogenic archaeon, Methanococcoides burtonii. The genome sequence was manually annotated including the use of a five-tiered Evidence Rating system that ranked annotations from Evidence Rating (ER) 1 (gene product experimentally characterized from the parent organism) to ER5 (hypothetical gene product) to provide a rapid means of assessing the certainty of gene function predictions. The genome is characterized by a higher level of aberrant sequence composition (51%) than any other archaeon. In comparison to hyper/thermophilic archaea which are subject to selection of synonymous codon usage, M. burtonii has evolved cold adaptation through a genomic capacity to accommodate highly skewed amino acid content, while retaining codon usage in common with its mesophilic Methanosarcina cousins. Polysaccharide biosynthesis genes comprise at least 3.3% of protein coding genes in the genome, and Cell wall/membrane/envelope biogenesis COG genes are over-represented. Likewise, signal transduction (COG category T) genes are over-represented and M. burtonii has a high 'IQ' (a measure of adaptive potential) compared to many methanogens. Numerous genes in these two over-represented COG categories appear to have been acquired from ε- and δ-proteobacteria, as do specific genes involved in central metabolism such as a novel B form of aconitase. Transposases also distinguish M. burtonii from other archaea, and their genomic characteristics indicate they play an important role in evolving the M. burtonii genome. Our study reveals a capacity for this model psychrophile to evolve through genome plasticity (including nucleotide skew, horizontal gene transfer and transposase activity) that enables adaptation to the cold, and to the biological and physical changes that have occurred over the last several thousand years as it adapted from a marine to an Antarctic lake environment.

  18. The genome sequence of the psychrophilic archaeon, Methanococcoides burtonii: the role of genome evolution in cold adaptation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Allen, Michele A; Lauro, Federico M; Williams, Timothy J

    2009-01-01

    Psychrophilic archaea are abundant and perform critical roles throughout the Earth's expansive cold biosphere. Here we report the first complete genome sequence for a psychrophilic methanogenic archaeon, Methanococcoides burtonii. The genome sequence was manually annotated including the use of a five-tiered evidence rating (ER) system that ranked annotations from ER1 (gene product experimentally characterized from the parent organism) to ER5 (hypothetical gene product) to provide a rapid means of assessing the certainty of gene function predictions. The genome is characterized by a higher level of aberrant sequence composition (51%) than any other archaeon. In comparison to hyper/thermophilic archaea, which are subject to selection of synonymous codon usage, M. burtonii has evolved cold adaptation through a genomic capacity to accommodate highly skewed amino-acid content, while retaining codon usage in common with its mesophilic Methanosarcina cousins. Polysaccharide biosynthesis genes comprise at least 3.3% of protein coding genes in the genome, and Cell wall, membrane, envelope biogenesis COG genes are overrepresented. Likewise, signal transduction (COG category T) genes are overrepresented and M. burtonii has a high 'IQ' (a measure of adaptive potential) compared to many methanogens. Numerous genes in these two overrepresented COG categories appear to have been acquired from ε- and δ-Proteobacteria, as do specific genes involved in central metabolism such as a novel B form of aconitase. Transposases also distinguish M. burtonii from other archaea, and their genomic characteristics indicate they have an important role in evolving the M. burtonii genome. Our study reveals a capacity for this model psychrophile to evolve through genome plasticity (including nucleotide skew, horizontal gene transfer and transposase activity) that enables adaptation to the cold, and to the biological and physical changes that have occurred over the last several thousand years as it adapted from a marine to an Antarctic lake environment.

  19. Comparative virulence and genomic analysis of 10 strains of Haemophilus parasuis

    USDA-ARS's Scientific Manuscript database

    Haemophilus parasuis is the cause of Glasser's disease in swine, which is characterized by systemic infection resulting in polyserositis, meningitis, and arthritis. An enormous difference exists in the severity of disease caused by H. parasuis strains, ranging from lethal systemic disease to asympto...

  20. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    PubMed

    Macas, Jiří; Novák, Petr; Pellicer, Jaume; Čížková, Jana; Koblížková, Andrea; Neumann, Pavel; Fuková, Iva; Doležel, Jaroslav; Kelly, Laura J; Leitch, Ilia J

    2015-01-01

    The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  1. Parents' Experience with Pediatric Microarray: Transferrable Lessons in the Era of Genomic Counseling.

    PubMed

    Hayeems, R Z; Babul-Hirji, R; Hoang, N; Weksberg, R; Shuman, C

    2016-04-01

    Advances in genome-based microarray and sequencing technologies hold tremendous promise for understanding, better-managing and/or preventing disease and disease-related risk. Chromosome microarray technology (array based comparative genomic hybridization [aCGH]) is widely utilized in pediatric care to inform diagnostic etiology and medical management. Less clear is how parents experience and perceive the value of this technology. This study explored parents' experiences with aCGH in the pediatric setting, focusing on how they make meaning of various types of test results. We conducted in-person or telephone-based semi-structured interviews with parents of 21 children who underwent aCGH testing in 2010. Transcripts were coded and analyzed thematically according to the principles of interpretive description. We learned that parents expect genomic tests to be of personal use; their experiences with aCGH results characterize this use as intrinsic in the test's ability to provide a much sought-after answer for their child's condition, and instrumental in its ability to guide care, access to services, and family planning. In addition, parents experience uncertainty regardless of whether aCGH results are of pathogenic, uncertain, or benign significance; this triggers frustration, fear, and hope. Findings reported herein better characterize the notion of personal utility and highlight the pervasive nature of uncertainty in the context of genomic testing. Empiric research that links pre-test counseling content and psychosocial outcomes is warranted to optimize patient care.

  2. Complete Taiwanese Macaque (Macaca cyclopis) Mitochondrial Genome: Reference-Assisted de novo Assembly with Multiple k-mer Strategy.

    PubMed

    Huang, Yu-Feng; Midha, Mohit; Chen, Tzu-Han; Wang, Yu-Tai; Smith, David Glenn; Pei, Kurtis Jai-Chyi; Chiu, Kuo Ping

    2015-01-01

    The Taiwanese (Formosan) macaque (Macaca cyclopis) is the only nonhuman primate endemic to Taiwan. This primate species is valuable for evolutionary studies and as a subject in medical research. However, only partial fragments of the mitochondrial genome (mitogenome) of this primate species have been sequenced, let alone its nuclear genome. We employed next-generation sequencing to generate 2 x 90 bp paired-end reads, followed by reference-assisted de novo assembly with a multiple k-mer strategy to characterize the M. cyclopis mitogenome. We compared the assembled mitogenome with that of other macaque species for phylogenetic analysis. Our results show that the M. cyclopis mitogenome consists of 16,563 nucleotides encoding 13 protein-coding genes, 2 ribosomal RNAs and 22 transfer RNAs. Phylogenetic analysis indicates that M. cyclopis is most closely related to M. mulatta lasiota (Chinese rhesus macaque), supporting the notion of an Asia-continental origin of M. cyclopis proposed in previous studies based on partial mitochondrial sequences. Our work presents a novel approach for assembling a mitogenome that utilizes the capabilities of de novo genome assembly with the assistance of a reference genome. The availability of the complete Taiwanese macaque mitogenome will facilitate the study of primate evolution and the characterization of genetic variations for the potential usage of this species as a non-human primate model for medical research.

  3. Genetic resources offer efficient tools for rice functional genomics research.

    PubMed

    Lo, Shuen-Fang; Fan, Ming-Jen; Hsing, Yue-Ie; Chen, Liang-Jwu; Chen, Shu; Wen, Ien-Chie; Liu, Yi-Lun; Chen, Ku-Ting; Jiang, Mirng-Jier; Lin, Ming-Kuang; Rao, Meng-Yen; Yu, Lin-Chih; Ho, Tuan-Hua David; Yu, Su-May

    2016-05-01

    Rice is an important crop and major model plant for monocot functional genomics studies. With the establishment of various genetic resources for rice genomics, the next challenge is to systematically assign functions to predicted genes in the rice genome. Compared with the robustness of genome sequencing and bioinformatics techniques, progress in understanding the function of rice genes has lagged, hampering the utilization of rice genes for cereal crop improvement. The use of transfer DNA (T-DNA) insertional mutagenesis offers the advantage of uniform distribution throughout the rice genome, but preferentially in gene-rich regions, resulting in direct gene knockout or activation of genes within 20-30 kb up- and downstream of the T-DNA insertion site and high gene tagging efficiency. Here, we summarize the recent progress in functional genomics using the T-DNA-tagged rice mutant population. We also discuss important features of T-DNA activation- and knockout-tagging and promoter-trapping of the rice genome in relation to mutant and candidate gene characterizations and how to more efficiently utilize rice mutant populations and datasets for high-throughput functional genomics and phenomics studies by forward and reverse genetics approaches. These studies may facilitate the translation of rice functional genomics research to improvements of rice and other cereal crops. © 2015 John Wiley & Sons Ltd.

  4. Genomic landscape of ovarian clear cell carcinoma via whole exome sequencing.

    PubMed

    Kim, Se Ik; Lee, Ji Won; Lee, Maria; Kim, Hee Seung; Chung, Hyun Hoon; Kim, Jae-Weon; Park, Noh Hyun; Song, Yong-Sang; Seo, Jeong-Sun

    2018-02-01

    To analyze whole exome sequencing (WES) data on ovarian clear cell carcinoma (OCCC) in Korean patients via the technique of next generation sequencing (NGS). Genomic profiles were compared between endometriosis-associated OCCC (EMS-OCCC) and Non-EMS-OCCC. We used serum samples and cancer tissues, stored at the Seoul National University Hospital Human Biobank, that were initially collected from women diagnosed with OCCC between 2012 and 2016. In total, 15 patients were enrolled: 5 with pathologically confirmed EMS-OCCC and 10 with Non-EMS-OCCC. We performed NGS WES on 15 fresh frozen OCCC tissues and matched serum samples, enabling comprehensive genomic characterization of OCCC. OCCC was characterized by complex genomic alterations, with a median of 178 exonic mutations (range, 111-25,798) and a median of 343 somatic copy number variations (range, 43-1,820) per tumor sample. In all, 54 somatic mutations were discovered across 14 genes, including PIK3CA (40%), ARID1A (40%), and KRAS (20%) in the 15 Korean OCCCs. Copy number gains in NTRK1 (33%), MYC (40%), and GNAS (47%) and copy number losses in TET2 (73%), TSC1 (67%), BRCA2 (60%), and SMAD4 (47%) were frequent. The significantly altered pathways were associated with proliferation and survival (including the PI3K/AKT, TP53, and ERBB2 pathways) in 87% of OCCCs and with chromatin remodeling in 47% of OCCCs. No significant differences in frequencies of genetic alterations were detected between EMS-OCCC and Non-EMS-OCCC groups. We successfully characterized the genomic landscape of 15 Korean patients with OCCC. We identified potential therapeutic targets for the treatment of this malignancy. Copyright © 2017. Published by Elsevier Inc.

  5. An intronic open reading frame was released from one of group II introns in the mitochondrial genome of the haptophyte Chrysochromulina sp. NIES-1333

    PubMed Central

    Nishimura, Yuki; Kamikawa, Ryoma; Hashimoto, Tetsuo; Inagaki, Yuji

    2014-01-01

    Mitochondrial (mt) genome sequences, which often bear introns, have been sampled from phylogenetically diverse eukaryotes. Thus, we can anticipate novel insights into intron evolution from previously unstudied mt genomes. We here investigated the origins and evolution of three introns in the mt genome of the haptophyte Chrysochromulina sp. NIES-1333, which was sequenced completely in this study. All three introns were characterized as group II on the basis of predicted secondary structure and the conserved sequence motifs at the 5′ and 3′ termini. Our comparative studies on diverse mt genomes prompt us to propose that the Chrysochromulina mt genome laterally acquired the introns from mt genomes in distantly related eukaryotes. Many group II introns harbor intronic open reading frames for proteins (intron-encoded proteins or IEPs), which likely facilitate the splicing of their host introns. However, we propose that a “free-standing,” IEP-like protein, which is not encoded within any intron in the Chrysochromulina mt genome, is involved in the splicing of the first cox1 intron, which lacks any open reading frames. PMID:25054084

  6. Genomics of Methylotrophy in Gram-Positive Methylamine-Utilizing Bacteria

    PubMed Central

    McTaggart, Tami L.; Beck, David A. C.; Setboonsarng, Usanisa; Shapiro, Nicole; Woyke, Tanja; Lidstrom, Mary E.; Kalyuzhnaya, Marina G.; Chistoserdova, Ludmila

    2015-01-01

    Gram-positive methylotrophic bacteria have been known for a long period of time, some serving as model organisms for characterizing the specific details of methylotrophy pathways/enzymes within this group. However, genome-based knowledge of methylotrophy within this group has been so far limited to a single species, Bacillus methanolicus (Firmicutes). The paucity of whole-genome data for Gram-positive methylotrophs limits our global understanding of methylotrophy within this group, including their roles in specific biogeochemical cycles, as well as their biotechnological potential. Here, we describe the isolation of seven novel strains of Gram-positive methylotrophs that include two strains of Bacillus and five representatives of Actinobacteria classified within two genera, Arthrobacter and Mycobacterium. We report whole-genome sequences for these isolates and present comparative analysis of the methylotrophy functional modules within these genomes. The genomic sequences of these seven novel organisms, all capable of growth on methylated amines, present an important reference dataset for understanding the genomic basis of methylotrophy in Gram-positive methylotrophic bacteria. This study is a major contribution to the field of methylotrophy, aimed at closing the gap in the genomic knowledge of methylotrophy within this diverse group of bacteria. PMID:27682081

  7. Phylogenetic relationship and virulence inference of Streptococcus Anginosus Group: curated annotation and whole-genome comparative analysis support distinct species designation

    PubMed Central

    2013-01-01

    Background: The Streptococcus Anginosus Group (SAG) represents three closely related species of the viridans group streptococci recognized as commensal bacteria of the oral, gastrointestinal and urogenital tracts. The SAG also cause severe invasive infections, and are pathogens during cystic fibrosis (CF) pulmonary exacerbation. Little genomic information or description of virulence mechanisms is currently available for SAG. We conducted intra- and inter-species whole-genome comparative analyses with 59 publicly available Streptococcus genomes and seven in-house closed high quality finished SAG genomes: S. constellatus (3), S. intermedius (2), and S. anginosus (2). For each SAG species, we sequenced at least one numerically dominant strain from CF airways recovered during acute exacerbation and an invasive, non-lung isolate. We also evaluated microevolution that occurred within two isolates that were cultured from one individual one year apart. Results: The SAG genomes were most closely related to S. gordonii and S. sanguinis, based on shared orthologs, and harbor a similar number of proteins within each COG category as other Streptococcus species. Numerous characterized Streptococcus virulence factor homologs were identified within the SAG genomes, including adherence, invasion, spreading factors, LPxTG cell wall proteins, and two component histidine kinases known to be involved in virulence gene regulation. Mobile elements, primarily integrative conjugative elements and bacteriophage, account for greater than 10% of the SAG genomes. S. anginosus was the most variable species sequenced in this study, yielding both the smallest and the largest SAG genomes containing multiple genomic rearrangements, insertions and deletions. In contrast, within the S. constellatus and S. intermedius species, there was extensive continuous synteny, with only slight differences in genome size between strains. Within S. constellatus we were able to determine important SNPs and changes in VNTR numbers that occurred over the course of one year. Conclusions: The comparative genomic analysis of the SAG clarifies the phylogenetics of these bacteria and supports the distinct species classification. Numerous potential virulence determinants were identified and provide a foundation for further studies into SAG pathogenesis. Furthermore, the data may be used to enable the development of rapid diagnostic assays and therapeutics for these pathogens. PMID:24341328

  8. Genomic, Proteomic and Morphological Characterization of Two Novel Broad Host Lytic Bacteriophages ΦPD10.3 and ΦPD23.1 Infecting Pectinolytic Pectobacterium spp. and Dickeya spp.

    PubMed Central

    Czajkowski, Robert; Ozymko, Zofia; de Jager, Victor; Siwinska, Joanna; Smolarska, Anna; Ossowicki, Adam; Narajczyk, Magdalena; Lojkowska, Ewa

    2015-01-01

    Pectinolytic Pectobacterium spp. and Dickeya spp. are necrotrophic bacterial pathogens of many important crops, including potato, worldwide. This study reports on the isolation and characterization of broad host lytic bacteriophages able to infect the dominant Pectobacterium spp. and Dickeya spp. affecting potato in Europe, viz. Pectobacterium carotovorum subsp. carotovorum (Pcc), P. wasabiae (Pwa) and Dickeya solani (Dso), with the objective of assessing their potential as biological disease control agents. Two lytic bacteriophages infecting strains of Pcc, Pwa and Dso were isolated from potato samples collected from two potato fields in central Poland. The ΦPD10.3 and ΦPD23.1 phages have morphology similar to other members of the Myoviridae family and the Caudovirales order, with a head diameter of 85 and 86 nm and length of tails of 117 and 121 nm, respectively. They were characterized for optimal multiplicity of infection, the rate of adsorption to the Pcc, Pwa and Dso cells, the latent period and the burst size. The phages were genotypically characterized with RAPD-PCR and RFLP techniques. The structural proteomes of both phages were obtained by fractionation of phage proteins by SDS-PAGE. Phage protein identification was performed by liquid chromatography-mass spectrometry (LC-MS) analysis. Pulsed-field gel electrophoresis (PFGE), genome sequencing and comparative genome analysis were used to gain knowledge of the length, organization and function of the ΦPD10.3 and ΦPD23.1 genomes. The potential use of ΦPD10.3 and ΦPD23.1 phages for the biocontrol of Pectobacterium spp. and Dickeya spp. infections in potato is discussed. PMID:25803051

  9. Deorphanizing the human transmembrane genome: A landscape of uncharacterized membrane proteins.

    PubMed

    Babcock, Joseph J; Li, Min

    2014-01-01

    The sequencing of the human genome has fueled the last decade of work to functionally characterize genome content. An important subset of genes encodes membrane proteins, which are the targets of many drugs. They reside in lipid bilayers, restricting their endogenous activity to a relatively specialized biochemical environment. Without a reference phenotype, the application of systematic screens to profile candidate membrane proteins is not immediately possible. Bioinformatics has begun to show its effectiveness in focusing the functional characterization of orphan proteins of a particular functional class, such as channels or receptors. Here we discuss integration of experimental and bioinformatics approaches for characterizing the orphan membrane proteome. By analyzing the human genome, a landscape reference for the human transmembrane genome is provided.

  10. Genotypic and phenotypic characterization of multidrug resistant Salmonella Typhimurium and Salmonella Kentucky strains recovered from chicken carcasses

    PubMed Central

    Grant, Ar’Quette; Choi, Seon Young; Alam, M. Samiul; Bell, Rebecca; Cavanaugh, Christopher; Balan, Kannan V.; Babu, Uma S.

    2017-01-01

    Salmonella Typhimurium is the leading cause of human non-typhoidal gastroenteritis in the US. S. Kentucky is one of the most commonly recovered serovars from commercially processed poultry carcasses. This study compared the genotypic and phenotypic properties of two Salmonella enterica strains, Typhimurium (ST221_31B) and Kentucky (SK222_32B), recovered from commercially processed chicken carcasses using whole genome sequencing, phenotype characterizations and an intracellular killing assay. The Illumina MiSeq platform was used for sequencing of the two Salmonella genomes. Phylogenetic analysis employing homologous alignment of 1,185 non-duplicated protein-coding genes in the Salmonella core genome demonstrated fully resolved bifurcating patterns with varying levels of diversity that separated ST221_31B and SK222_32B genomes into distinct monophyletic serovar clades. Single nucleotide polymorphism (SNP) analysis identified 2,432 (ST19) SNPs within 13 Typhimurium genomes including ST221_31B representing Sequence Type ST19, and 650 (ST152) SNPs were detected within 13 Kentucky genomes including SK222_32B representing Sequence Type ST152. In addition to serovar-specific conserved coding sequences, the genomes of ST221_31B and SK222_32B harbor several genomic regions with significant genetic differences. These included phage and phage-like elements, carbon utilization or transport operons, fimbriae operons, putative membrane associated protein-encoding genes, antibiotic resistance genes, siderophore operons, and numerous hypothetical protein-encoding genes. Phenotype microarray results demonstrated that ST221_31B is capable of utilizing certain carbon compounds more efficiently than SK222_32B; namely, 1,2-propanediol, M-inositol, L-threonine, α-D-lactose, D-tagatose, adonitol, formic acid, acetoacetic acid, and L-tartaric acid. ST221_31B survived for 48 h in macrophages, while SK222_32B was mostly eliminated. Further, a 3-fold growth of ST221_31B was observed at 24 hours post-infection in chicken granulosa cells while SK222_32B was unable to replicate in these cells. These results suggest that Salmonella Typhimurium can survive host defenses better and could be more invasive than Salmonella Kentucky and provide some insights into the genomic determinants responsible for these differences. PMID:28481935

  11. Genotypic and phenotypic characterization of multidrug resistant Salmonella Typhimurium and Salmonella Kentucky strains recovered from chicken carcasses.

    PubMed

    Tasmin, Rizwana; Hasan, Nur A; Grim, Christopher J; Grant, Ar'Quette; Choi, Seon Young; Alam, M Samiul; Bell, Rebecca; Cavanaugh, Christopher; Balan, Kannan V; Babu, Uma S; Parveen, Salina

    2017-01-01

    Salmonella Typhimurium is the leading cause of human non-typhoidal gastroenteritis in the US. S. Kentucky is one of the most commonly recovered serovars from commercially processed poultry carcasses. This study compared the genotypic and phenotypic properties of two Salmonella enterica strains, Typhimurium (ST221_31B) and Kentucky (SK222_32B), recovered from commercially processed chicken carcasses using whole genome sequencing, phenotype characterizations and an intracellular killing assay. The Illumina MiSeq platform was used for sequencing of the two Salmonella genomes. Phylogenetic analysis employing homologous alignment of 1,185 non-duplicated protein-coding genes in the Salmonella core genome demonstrated fully resolved bifurcating patterns with varying levels of diversity that separated ST221_31B and SK222_32B genomes into distinct monophyletic serovar clades. Single nucleotide polymorphism (SNP) analysis identified 2,432 (ST19) SNPs within 13 Typhimurium genomes including ST221_31B representing Sequence Type ST19, and 650 (ST152) SNPs were detected within 13 Kentucky genomes including SK222_32B representing Sequence Type ST152. In addition to serovar-specific conserved coding sequences, the genomes of ST221_31B and SK222_32B harbor several genomic regions with significant genetic differences. These included phage and phage-like elements, carbon utilization or transport operons, fimbriae operons, putative membrane associated protein-encoding genes, antibiotic resistance genes, siderophore operons, and numerous hypothetical protein-encoding genes. Phenotype microarray results demonstrated that ST221_31B is capable of utilizing certain carbon compounds more efficiently than SK222_32B; namely, 1,2-propanediol, M-inositol, L-threonine, α-D-lactose, D-tagatose, adonitol, formic acid, acetoacetic acid, and L-tartaric acid. ST221_31B survived for 48 h in macrophages, while SK222_32B was mostly eliminated. Further, a 3-fold growth of ST221_31B was observed at 24 hours post-infection in chicken granulosa cells while SK222_32B was unable to replicate in these cells. These results suggest that Salmonella Typhimurium can survive host defenses better and could be more invasive than Salmonella Kentucky and provide some insights into the genomic determinants responsible for these differences.

  12. Comparative genomics of two super-shedder isolates of Escherichia coli O157:H7

    PubMed Central

    Katani, Robab; Cote, Rebecca; Kudva, Indira T.; DebRoy, Chitrita; Arthur, Terrance M.

    2017-01-01

    Shiga toxin-producing Escherichia coli O157:H7 (O157) are zoonotic foodborne pathogens of major public health concern that cause considerable intestinal and extra-intestinal illnesses in humans. O157 colonize the recto-anal junction (RAJ) of asymptomatic cattle, which shed the bacterium into the environment through fecal matter. A small subset of cattle, termed super-shedders (SS), excrete O157 at a rate (≥ 10⁴ CFU/g of feces) that is several orders of magnitude greater than other colonized cattle and play a major role in the prevalence and transmission of O157. To better understand microbial factors contributing to super-shedding we have recently sequenced two SS isolates, SS17 (GenBank accession no. CP008805) and SS52 (GenBank accession no. CP010304) and shown that SS isolates display a distinctive strongly adherent phenotype on bovine rectal squamous epithelial cells. Here we present a detailed comparative genomics analysis of SS17 and SS52 with other previously characterized O157 strains (EC4115, EDL933, Sakai, TW14359). The results highlight specific polymorphisms and genomic features shared amongst SS strains, and reveal several SNPs that are shared amongst SS isolates, including in genes involved in motility, adherence, and metabolism. Finally, our analyses reveal distinctive patterns of distribution of phage-associated genes amongst the two SS and other isolates. Together, the results of our comparative genomics studies suggest that while SS17 and SS52 share genomic features with other lineage I/II isolates, they likely have distinct recent evolutionary histories. Future comparative and functional genomic studies are needed to decipher the precise molecular basis for super shedding in O157. PMID:28797098

  13. Comparative genomics of two super-shedder isolates of Escherichia coli O157:H7.

    PubMed

    Katani, Robab; Cote, Rebecca; Kudva, Indira T; DebRoy, Chitrita; Arthur, Terrance M; Kapur, Vivek

    2017-01-01

    Shiga toxin-producing Escherichia coli O157:H7 (O157) are zoonotic foodborne pathogens and of major public health concern that cause considerable intestinal and extra-intestinal illnesses in humans. O157 colonize the recto-anal junction (RAJ) of asymptomatic cattle who shed the bacterium into the environment through fecal matter. A small subset of cattle, termed super-shedders (SS), excrete O157 at a rate (≥ 10^4 CFU/g of feces) that is several orders of magnitude greater than other colonized cattle and play a major role in the prevalence and transmission of O157. To better understand microbial factors contributing to super-shedding we have recently sequenced two SS isolates, SS17 (GenBank accession no. CP008805) and SS52 (GenBank accession no. CP010304) and shown that SS isolates display a distinctive strongly adherent phenotype on bovine rectal squamous epithelial cells. Here we present a detailed comparative genomics analysis of SS17 and SS52 with other previously characterized O157 strains (EC4115, EDL933, Sakai, TW14359). The results highlight specific polymorphisms and genomic features shared amongst SS strains, and reveal several SNPs that are shared amongst SS isolates, including in genes involved in motility, adherence, and metabolism. Finally, our analyses reveal distinctive patterns of distribution of phage-associated genes amongst the two SS and other isolates. Together, the results of our comparative genomics studies suggest that while SS17 and SS52 share genomic features with other lineage I/II isolates, they likely have distinct recent evolutionary histories. Future comparative and functional genomic studies are needed to decipher the precise molecular basis for super shedding in O157.

  14. Sequence and functional characterization of MIRNA164 promoters from Brassica shows copy number dependent regulatory diversification among homeologs.

    PubMed

    Jain, Aditi; Anand, Saurabh; Singh, Neer K; Das, Sandip

    2018-03-12

    The impact of polyploidy on functional diversification of cis-regulatory elements is poorly understood. This is primarily on account of the lack of a well-defined structure of cis-elements and a universal regulatory code. To the best of our knowledge, this is the first report on characterization of sequence and functional diversification of paralogous and homeologous promoter elements associated with MIR164 from Brassica. The availability of whole genome sequence allowed us to identify and isolate a total of 42 homologous copies of MIR164 from the diploid species Brassica rapa (A-genome), Brassica nigra (B-genome), and Brassica oleracea (C-genome), and the allopolyploids Brassica juncea (AB-genome), Brassica carinata (BC-genome), and Brassica napus (AC-genome). Additionally, we retrieved homologous sequences based on comparative genomics from Arabidopsis lyrata, Capsella rubella, and Thellungiella halophila, spanning ca. 45 million years of evolutionary history of Brassicaceae. Sequence comparison across Brassicaceae revealed lineage-, karyotype-, species-, and sub-genome-specific changes, providing a snapshot of the evolutionary dynamics of miRNA promoters in polyploids. Tree topology of cis-elements associated with MIR164 was found to recapitulate the species and family evolutionary history. Phylogenetic shadowing identified transcription factor binding sites (TFBS) conserved across Brassicaceae, some of which are already known as regulators of MIR164 expression. Some of the TFBS were found to be distributed in a sub-genome-specific (e.g., SOX specific to promoter of MIR164c from MF2 sub-genome), lineage-specific (YABBY binding motif, specific to C. rubella in MIR164b), or species-specific (e.g., VOZ in A. thaliana MIR164a) manner, which might contribute towards genetic and adaptive variation. Reporter activity driven by promoters associated with MIR164 paralogs and homeologs was largely in agreement with the known role of miR164 in leaf shaping, regulation of lateral root development, and senescence, and revealed one previously undescribed role in trichomes. The impact of polyploidy was most profound when reporter activity across three MIR164c homeologs was compared, revealing negligible overlap, whereas reporter activity among two homeologs of MIR164a displayed significant overlap. A copy number dependent cis-regulatory divergence thus exists in MIR164 genes in Brassica juncea. The full extent of regulatory diversification towards adaptive strategies will only be known when future endeavors analyze promoter function under stress and hormonal regimes.

  15. Characterization of genome in tetraploid StY species of Elymus (Triticeae: Poaceae) using sequential FISH and GISH.

    PubMed

    Liu, Ruijuan; Wang, Richard R-C; Yu, Feng; Lu, Xingwang; Dou, Quanwen

    2017-08-01

    Genomes of ten species of Elymus, either presumed or known as tetraploid StY, were characterized using fluorescence in situ hybridization (FISH) and genomic in situ hybridization (GISH). These tetraploid species could be grouped into three categories. Type I included species with a reported StY genome (Roegneria pendulina, R. nutans, R. glaberrima, R. ciliaris, and Elymus nevskii) and species with a presumed StY genome (R. sinica, R. breviglumis, and R. dura), whose genomes could be separated into two sets based on different GISH intensities. Type I genome constitution was deemed as putative StY. The St genome was mainly characterized by intense hybridization with pAs1, fewer AAG sites, and linked distribution of 5S rDNA and 18S-26S rDNA, while the Y genome was characterized by less intense hybridization with pAs1, more varied AAG sites, and isolated distribution of 5S rDNA and 18S-26S rDNA. Nevertheless, further genomic variations were detected among the different StY species. Type II included E. alashanicus, whose genome could be easily separated based on GISH pattern. FISH and GISH patterns suggested that E. alashanicus comprised a modified St genome and an unknown genome. Type III included E. longearistatus, whose genome could not be separated by GISH and was designated as St^l Y^l. Notably, a close relationship between S^l and Y^l genomes was observed.

  16. Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis

    PubMed Central

    Facey, Paul D.; Méric, Guillaume; Hitchings, Matthew D.; Pachebat, Justin A.; Hegarty, Matt J.; Chen, Xiaorui; Morgan, Laura V.A.; Hoeppner, James E.; Whitten, Miranda M.A.; Kirk, William D.J.; Dyson, Paul J.; Sheppard, Sam K.; Sol, Ricardo Del

    2015-01-01

    Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. PMID:26185096

  17. Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis.

    PubMed

    Facey, Paul D; Méric, Guillaume; Hitchings, Matthew D; Pachebat, Justin A; Hegarty, Matt J; Chen, Xiaorui; Morgan, Laura V A; Hoeppner, James E; Whitten, Miranda M A; Kirk, William D J; Dyson, Paul J; Sheppard, Sam K; Del Sol, Ricardo

    2015-07-15

    Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  18. Transposable Element Genomic Fissuring in Pyrenophora teres Is Associated With Genome Expansion and Dynamics of Host–Pathogen Genetic Interactions

    PubMed Central

    Syme, Robert A.; Martin, Anke; Wyatt, Nathan A.; Lawrence, Julie A.; Muria-Gonzalez, Mariano J.; Friesen, Timothy L.; Ellwood, Simon R.

    2018-01-01

    Pyrenophora teres, P. teres f. teres (PTT) and P. teres f. maculata (PTM) cause significant diseases in barley, but little is known about the large-scale genomic differences that may distinguish the two forms. Comprehensive genome assemblies were constructed from long DNA reads, optical and genetic maps. As repeat masking in fungal genomes influences the final gene annotations, an accurate and reproducible pipeline was developed to ensure comparability between isolates. The genomes of the two forms are highly collinear, each composed of 12 chromosomes. Genome evolution in P. teres is characterized by genome fissuring through the insertion and expansion of transposable elements (TEs), a process that isolates blocks of genic sequence. The phenomenon is particularly pronounced in PTT, which has a larger, more repetitive genome than PTM and more recent transposon activity measured by the frequency and size of genome fissures. PTT has a longer cultivated host association and, notably, a greater range of host–pathogen genetic interactions compared to other Pyrenophora spp., a property which associates better with genome size than pathogen lifestyle. The two forms possess similar complements of TE families with Tc1/Mariner and LINE-like Tad-1 elements more abundant in PTT. Tad-1 was only detectable as vestigial fragments in PTM and, within the forms, differences in genome sizes and the presence and absence of several TE families indicated recent lineage invasions. Gene differences between P. teres forms are mainly associated with gene-sparse regions near or within TE-rich regions, with many genes possessing characteristics of fungal effectors. Instances of gene interruption by transposons resulting in pseudogenization were detected in PTT. In addition, both forms have a large complement of secondary metabolite gene clusters indicating significant capacity to produce an array of different molecules. This study provides genomic resources for functional genetics to help dissect factors underlying the host–pathogen interactions. PMID:29720997

  19. Whole genome sequencing of multidrug-resistant Salmonella enterica serovar Typhimurium isolated from humans and poultry in Burkina Faso

    USDA-ARS's Scientific Manuscript database

    Background. Multidrug-resistant Salmonella is an important cause of morbidity and mortality in developing countries. The aim of this study was to characterize and compare multidrug-resistant Salmonella enterica serovar Typhimurium isolates from patients and poultry feces. Methods. Salmonella strains...

  20. Comparative studies of the genome, virulence, and protection of 10 Haemophilus parasuis strains

    USDA-ARS's Scientific Manuscript database

    Haemophilus parasuis is the cause of Glässer’s disease in swine, which is characterized by systemic infection resulting in polyserositis, meningitis, and arthritis. An enormous difference exists in the severity of disease caused by H. parasuis strains, ranging from lethal systemic disease to asympto...

  1. Differential analysis between somatic mutation and germline variation profiles reveals cancer-related genes.

    PubMed

    Przytycki, Pawel F; Singh, Mona

    2017-08-25

    A major aim of cancer genomics is to pinpoint which somatically mutated genes are involved in tumor initiation and progression. We introduce a new framework for uncovering cancer genes, differential mutation analysis, which compares the mutational profiles of genes across cancer genomes with their natural germline variation across healthy individuals. We present DiffMut, a fast and simple approach for differential mutational analysis, and demonstrate that it is more effective in discovering cancer genes than considerably more sophisticated approaches. We conclude that germline variation across healthy human genomes provides a powerful means for characterizing somatic mutation frequency and identifying cancer driver genes. DiffMut is available at https://github.com/Singh-Lab/Differential-Mutation-Analysis .

  2. A comparative in silico linear B-cell epitope prediction and characterization for South American and African Trypanosoma vivax strains.

    PubMed

    Guedes, Rafael Lucas Muniz; Rodrigues, Carla Monadeli Filgueira; Coatnoan, Nicolas; Cosson, Alain; Cadioli, Fabiano Antonio; Garcia, Herakles Antonio; Gerber, Alexandra Lehmkuhl; Machado, Rosangela Zacarias; Minoprio, Paola Marcella Camargo; Teixeira, Marta Maria Geraldes; de Vasconcelos, Ana Tereza Ribeiro

    2018-02-27

    Trypanosoma vivax is a parasite widespread across Africa and South America. Immunological methods using recombinant antigens have been developed aiming at specific and sensitive detection of infections caused by T. vivax. Here, we sequenced for the first time the transcriptome of a virulent T. vivax strain (Lins), isolated from an outbreak of severe disease in South America (Brazil) and performed a computational integrated analysis of genome, transcriptome and in silico predictions to identify and characterize putative linear B-cell epitopes from African and South American T. vivax. A total of 2278, 3936 and 4062 linear B-cell epitopes were respectively characterized for the transcriptomes of T. vivax LIEM-176 (Venezuela), T. vivax IL1392 (Nigeria) and T. vivax Lins (Brazil) and 4684 for the genome of T. vivax Y486 (Nigeria). The results presented are a valuable theoretical source that may pave the way for highly sensitive and specific diagnostic tools. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  3. Comparative genomic analysis of two isolates of Vibrio cholerae O1 Ogawa El Tor isolated during outbreak in Mariupol in 2011.

    PubMed

    Kuleshov, Konstantin V; Kostikova, Anna; Pisarenko, Sergey V; Kovalev, Dmitry A; Tikhonov, Sergey N; Savelieva, Irina V; Saveliev, Vilory N; Vasilieva, Oksana V; Zinich, Liliia S; Pidchenko, Nadiia N; Kulichenko, Alexander N; Shipulin, German A

    2016-10-01

    Cholera is a water-borne, severe enteric infection essentially caused by toxigenic strains of Vibrio cholerae O1 and O139 serogroups. An outbreak of cholera was registered during May-July 2011 in Mariupol, Ukraine, with 33 cholera cases and 25 carriers of cholera. Following this outbreak, the toxigenic strain of V. cholerae 2011EL-301 was isolated from seawater in the recreation area of Taganrog city on the territory of Russia. The aim of our study was to understand the genomic features of the Mariupol isolates and to evaluate the hypothesis of a possible connection between the outbreak of cholera in Mariupol and the single case of isolation of V. cholerae from the Sea of Azov in Russia. Mariupol isolates were phenotypically characterized and subsequently subjected to whole genome sequencing. Phylogenetic analysis based on high-quality SNPs (hqSNPs) of V. cholerae O1 El Tor isolates of the 7th pandemic clade from different regions showed that clinical and environmental isolates from the Mariupol outbreak were attributable to a unique phylogenetic clade within wave 3 of V. cholerae O1 El Tor isolates and characterized by six clade-specific SNPs. The Taganrog isolate, in contrast, belonged to a distantly related clade, which allows us to reject the hypothesis of transmission of the outbreak strain of V. cholerae O1 from Ukraine to Russia in 2011. Mariupol isolates shared a common ancestor with the Haiti/Nepal-4/India clade, indicating that the outbreak progenitor strain most likely originated in the South Asia region and was later introduced to Ukraine. Moreover, genomic data based both on hqSNPs and on the similarity of virulence-associated mobile genomic elements of Mariupol isolates suggest that environmental and clinical isolates are part of a joint outbreak, which confirms the role of contaminated domestic sewage as an element of the complex chain of infection spread during a cholera outbreak. In general, the genome-wide comparative analysis of both genes and genomic regions of epidemiological importance indicates that these isolates belong to the 'new' clone of the toxigenic, multidrug-resistant atypical variant of V. cholerae O1 El Tor. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Comparative analysis and supragenome modeling of twelve Moraxella catarrhalis clinical isolates

    PubMed Central

    2011-01-01

    Background: M. catarrhalis is a gram-negative, gamma-proteobacterium and an opportunistic human pathogen associated with otitis media (OM) and exacerbations of chronic obstructive pulmonary disease (COPD). With direct and indirect costs for treating these conditions annually exceeding $33 billion in the United States alone, and nearly ubiquitous resistance to beta-lactam antibiotics among M. catarrhalis clinical isolates, a greater understanding of this pathogen's genome and its variability among isolates is needed. Results: The genomic sequences of ten geographically and phenotypically diverse clinical isolates of M. catarrhalis were determined and analyzed together with two publicly available genomes. These twelve genomes were subjected to detailed comparative and predictive analyses aimed at characterizing the supragenome and understanding the metabolic and pathogenic potential of this species. A total of 2383 gene clusters were identified, of which 1755 are core with the remaining 628 clusters unevenly distributed among the twelve isolates. These findings are consistent with the distributed genome hypothesis (DGH), which posits that the species genome possesses a far greater number of genes than any single isolate. Multiple and pair-wise whole genome alignments highlight limited chromosomal re-arrangement. Conclusions: M. catarrhalis gene content and chromosomal organization data, although supportive of the DGH, show modest overall genic diversity. These findings are in stark contrast with the reported heterogeneity of the species as a whole, as well as to other bacterial pathogens mediating OM and COPD, providing important insight into M. catarrhalis pathogenesis that will aid in the development of novel therapeutic regimens. PMID:21269504
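
    The core/distributed split reported in this abstract can be illustrated with a minimal sketch. It assumes a cluster is "core" when it is present in every isolate and "distributed" when it is present in only some; the function name, the toy presence table, and the isolate labels are hypothetical and are not taken from the paper. Only the counts 2383, 1755, and 628 come from the abstract.

```python
from typing import Dict, List, Set, Tuple


def classify_clusters(
    presence: Dict[str, Set[str]], n_isolates: int
) -> Tuple[List[str], List[str]]:
    """Split gene clusters into core and distributed sets.

    presence maps a cluster name to the set of isolates that carry it.
    Assumption (not from the paper): core = present in all isolates.
    """
    core = sorted(c for c, iso in presence.items() if len(iso) == n_isolates)
    distributed = sorted(
        c for c, iso in presence.items() if 0 < len(iso) < n_isolates
    )
    return core, distributed


# Hypothetical toy data: three isolates, three clusters.
toy_presence = {
    "clusterA": {"iso1", "iso2", "iso3"},  # in every isolate -> core
    "clusterB": {"iso1", "iso3"},          # in some isolates -> distributed
    "clusterC": {"iso2"},                  # in one isolate   -> distributed
}
core, distributed = classify_clusters(toy_presence, n_isolates=3)

# Counts reported in the abstract: 2383 clusters in total, 1755 of them core.
total_clusters, n_core = 2383, 1755
n_distributed = total_clusters - n_core   # the remaining clusters
core_fraction = n_core / total_clusters   # share of the supragenome that is core
```

    With the reported counts, the distributed set comes to 628 clusters and the core share is roughly 74% of the supragenome, which matches the abstract's description of modest overall genic diversity.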

  5. Tips and tricks for the assembly of a Corynebacterium pseudotuberculosis genome using a semiconductor sequencer.

    PubMed

    Ramos, Rommel Thiago Jucá; Carneiro, Adriana Ribeiro; Soares, Siomar de Castro; dos Santos, Anderson Rodrigues; Almeida, Sintia; Guimarães, Luis; Figueira, Flávia; Barbosa, Eudes; Tauch, Andreas; Azevedo, Vasco; Silva, Artur

    2013-03-01

    New sequencing platforms have enabled rapid decoding of complete prokaryotic genomes at relatively low cost. The Ion Torrent platform is an example of these technologies, characterized by lower coverage, generating challenges for the genome assembly. One particular problem is the lack of genomes that enable reference-based assembly, such as the one used in the present study, Corynebacterium pseudotuberculosis biovar equi, which causes high economic losses in the US equine industry. The quality treatment strategy incorporated into the assembly pipeline enabled a 16-fold greater use of the sequencing data obtained compared with traditional quality filter approaches. Data preprocessing prior to the de novo assembly enabled the use of known methodologies in the next-generation sequencing data assembly. Moreover, manual curation was proved to be essential for ensuring a quality assembly, which was validated by comparative genomics with other species of the genus Corynebacterium. The present study presents a modus operandi that enables a greater and better use of data obtained from semiconductor sequencing for obtaining the complete genome from a prokaryotic microorganism, C. pseudotuberculosis, which is not a traditional biological model such as Escherichia coli. © 2012 The Authors. Published by Society for Applied Microbiology and Blackwell Publishing Ltd. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

  6. Horizontal transfer of potential mobile units in phytoplasmas

    PubMed Central

    Ku, Chuan; Lo, Wen-Sui; Kuo, Chih-Horng

    2013-01-01

    Phytoplasmas are uncultivated phytopathogenic bacteria that cause diseases in a wide range of economically important plants. Through secretion of effector proteins, they are able to manipulate their plant hosts to facilitate their multiplication and dispersal by insect vectors. The genome sequences of several phytoplasmas have been characterized to date and a group of putative composite transposons called potential mobile units (PMUs) are found in these highly reduced genomes. Recently, our team reported the genome sequence and comparative analysis of a peanut witches’ broom (PnWB) phytoplasma, the first representative of the phytoplasma 16SrII group. Comparisons between the species phylogeny and the phylogenies of the PMU genes revealed that the PnWB PMU is likely to have been transferred from the 16SrI group. This indicates that PMUs are not only the DNA unit for transposition within a genome, but also for horizontal transfer among divergent phytoplasma lineages. Given the association of PMUs with effector genes, the mobility of PMUs across genomes has important implications for phytoplasma ecology and evolution. PMID:24251068

  7. Elucidating the genomic architecture of Asian EGFR-mutant lung adenocarcinoma through multi-region exome sequencing.

    PubMed

    Nahar, Rahul; Zhai, Weiwei; Zhang, Tong; Takano, Angela; Khng, Alexis J; Lee, Yin Yeng; Liu, Xingliang; Lim, Chong Hee; Koh, Tina P T; Aung, Zaw Win; Lim, Tony Kiat Hon; Veeravalli, Lavanya; Yuan, Ju; Teo, Audrey S M; Chan, Cheryl X; Poh, Huay Mei; Chua, Ivan M L; Liew, Audrey Ann; Lau, Dawn Ping Xi; Kwang, Xue Lin; Toh, Chee Keong; Lim, Wan-Teck; Lim, Bing; Tam, Wai Leong; Tan, Eng-Huat; Hillmer, Axel M; Tan, Daniel S W

    2018-01-15

    EGFR-mutant lung adenocarcinomas (LUAD) display diverse clinical trajectories and are characterized by rapid but short-lived responses to EGFR tyrosine kinase inhibitors (TKIs). Through sequencing of 79 spatially distinct regions from 16 early stage tumors, we show that despite low mutation burdens, EGFR-mutant Asian LUADs unexpectedly exhibit a complex genomic landscape with frequent and early whole-genome doubling, aneuploidy, and high clonal diversity. Multiple truncal alterations, including TP53 mutations and loss of CDKN2A and RB1, converge on cell cycle dysregulation, with late sector-specific high-amplitude amplifications and deletions that potentially beget drug resistant clones. We highlight the association between genomic architecture and clinical phenotypes, such as co-occurring truncal drivers and primary TKI resistance. Through comparative analysis with published smoking-related LUAD, we postulate that the high intra-tumor heterogeneity observed in Asian EGFR-mutant LUAD may be contributed by an early dominant driver, genomic instability, and low background mutation rates.

  8. Evolutionary dynamics of 3D genome architecture following polyploidization in cotton.

    PubMed

    Wang, Maojun; Wang, Pengcheng; Lin, Min; Ye, Zhengxiu; Li, Guoliang; Tu, Lili; Shen, Chao; Li, Jianying; Yang, Qingyong; Zhang, Xianlong

    2018-02-01

    The formation of polyploids significantly increases the complexity of transcriptional regulation, which is expected to be reflected in sophisticated higher-order chromatin structures. However, knowledge of three-dimensional (3D) genome structure and its dynamics during polyploidization remains poor. Here, we characterize 3D genome architectures for diploid and tetraploid cotton, and find the existence of A/B compartments and topologically associated domains (TADs). By comparing each subgenome in tetraploids with its extant diploid progenitor, we find that genome allopolyploidization has contributed to the switching of A/B compartments and the reorganization of TADs in both subgenomes. We also show that the formation of TAD boundaries during polyploidization preferentially occurs in open chromatin, coinciding with the deposition of active chromatin modification. Furthermore, analysis of inter-subgenomic chromatin interactions has revealed the spatial proximity of homoeologous genes, possibly associated with their coordinated expression. This study advances our understanding of chromatin organization in plants and sheds new light on the relationship between 3D genome evolution and transcriptional regulation.

  9. The annotation of repetitive elements in the genome of channel catfish (Ictalurus punctatus).

    PubMed

    Yuan, Zihao; Zhou, Tao; Bao, Lisui; Liu, Shikai; Shi, Huitong; Yang, Yujia; Gao, Dongya; Dunham, Rex; Waldbieser, Geoff; Liu, Zhanjiang

    2018-01-01

    Channel catfish (Ictalurus punctatus) is a highly adaptive species and has been used as a research model for comparative immunology, physiology, and toxicology among ectothermic vertebrates. It is also economically important for aquaculture. As such, its reference genome was generated and annotated with protein coding genes. However, the repetitive elements in the catfish genome are less well understood. In this study, over 417.8 megabases (Mb) of repetitive elements were identified and characterized in the channel catfish genome. Among them, the DNA/TcMar-Tc1 transposons are the most abundant type, making up ~20% of the total repetitive elements, followed by the microsatellites (14%). The prevalence of repetitive elements, especially the mobile elements, may have provided a driving force for the evolution of the catfish genome. A number of catfish-specific repetitive elements were identified, including the previously reported Xba elements, whose divergence rate was relatively low, slower than that in untranslated regions of genes but faster than that in protein coding sequences, suggesting evolutionary constraints on these elements.
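
    As a rough worked example of the proportions quoted in this abstract, the sketch below converts the stated percentages into approximate sizes. It treats 417.8 as the total (the abstract says "over 417.8") and ~20% as exactly 0.20, so the results are lower-bound approximations; the variable names are illustrative only.

```python
# Total repetitive content reported in the abstract, in megabases (a lower bound).
total_repeats_mb = 417.8

# Shares of the total repetitive content quoted in the abstract.
shares = {
    "DNA/TcMar-Tc1 transposons": 0.20,  # "~20%" in the abstract
    "microsatellites": 0.14,
}

# Approximate size contributed by each repeat class.
sizes_mb = {name: total_repeats_mb * share for name, share in shares.items()}

for name, size in sizes_mb.items():
    print(f"{name}: ~{size:.1f} Mb")
```

    Under these assumptions, the Tc1 transposons account for roughly 83.6 Mb and the microsatellites for roughly 58.5 Mb of the repetitive content.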

  10. The annotation of repetitive elements in the genome of channel catfish (Ictalurus punctatus)

    PubMed Central

    Yuan, Zihao; Zhou, Tao; Bao, Lisui; Liu, Shikai; Shi, Huitong; Yang, Yujia; Gao, Dongya; Dunham, Rex; Waldbieser, Geoff

    2018-01-01

    Channel catfish (Ictalurus punctatus) is a highly adaptive species and has been used as a research model for comparative immunology, physiology, and toxicology among ectothermic vertebrates. It is also economically important for aquaculture. As such, its reference genome was generated and annotated with protein coding genes. However, the repetitive elements in the catfish genome are less well understood. In this study, over 417.8 megabases (Mb) of repetitive elements were identified and characterized in the channel catfish genome. Among them, the DNA/TcMar-Tc1 transposons are the most abundant type, making up ~20% of the total repetitive elements, followed by the microsatellites (14%). The prevalence of repetitive elements, especially the mobile elements, may have provided a driving force for the evolution of the catfish genome. A number of catfish-specific repetitive elements were identified, including the previously reported Xba elements, whose divergence rate was relatively low, slower than that in untranslated regions of genes but faster than that in protein coding sequences, suggesting evolutionary constraints on these elements. PMID:29763462

  11. Bacterial genomes in epidemiology—present and future

    PubMed Central

    Croucher, Nicholas J.; Harris, Simon R.; Grad, Yonatan H.; Hanage, William P.

    2013-01-01

    Sequence data are well established in the reconstruction of the phylogenetic and demographic scenarios that have given rise to outbreaks of viral pathogens. The application of similar methods to bacteria has been hindered in the main by the lack of high-resolution nucleotide sequence data from quality samples. Developing and already available genomic methods have greatly increased the amount of data that can be used to characterize an isolate and its relationship to others. However, differences in sequencing platforms and data analysis mean that these enhanced data come with a cost in terms of portability: results from one laboratory may not be directly comparable with those from another. Moreover, genomic data for many bacteria bear the mark of a history including extensive recombination, which has the potential to greatly confound phylogenetic and coalescent analyses. Here, we discuss the exacting requirements of genomic epidemiology, and means by which the distorting signal of recombination can be minimized to permit the leverage of growing datasets of genomic data from bacterial pathogens. PMID:23382424

  12. Horizontal transfer of potential mobile units in phytoplasmas.

    PubMed

    Ku, Chuan; Lo, Wen-Sui; Kuo, Chih-Horng

    2013-09-01

    Phytoplasmas are uncultivated phytopathogenic bacteria that cause diseases in a wide range of economically important plants. Through secretion of effector proteins, they are able to manipulate their plant hosts to facilitate their multiplication and dispersal by insect vectors. The genome sequences of several phytoplasmas have been characterized to date and a group of putative composite transposons called potential mobile units (PMUs) are found in these highly reduced genomes. Recently, our team reported the genome sequence and comparative analysis of a peanut witches' broom (PnWB) phytoplasma, the first representative of the phytoplasma 16SrII group. Comparisons between the species phylogeny and the phylogenies of the PMU genes revealed that the PnWB PMU is likely to have been transferred from the 16SrI group. This indicates that PMUs are not only the DNA unit for transposition within a genome, but also for horizontal transfer among divergent phytoplasma lineages. Given the association of PMUs with effector genes, the mobility of PMUs across genomes has important implications for phytoplasma ecology and evolution.

  13. Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle.

    PubMed

    Bárcenas-Reyes, I; Loza-Rubio, E; Cantó-Alarcón, G J; Luna-Cozar, J; Enríquez-Vázquez, A; Barrón-Rodríguez, R J; Milián-Suazo, F

    2017-08-01

    Phylogenetic analysis of the rabies virus in molecular epidemiology has been traditionally performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about the discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European and Asian continents. Four cattle brain samples positive for rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced on an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil and French Guiana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting a common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Comparative genome analysis of entomopathogenic fungi reveals a complex set of secreted proteins.

    PubMed

    Staats, Charley Christian; Junges, Angela; Guedes, Rafael Lucas Muniz; Thompson, Claudia Elizabeth; de Morais, Guilherme Loss; Boldo, Juliano Tomazzoni; de Almeida, Luiz Gonzaga Paula; Andreis, Fábio Carrer; Gerber, Alexandra Lehmkuhl; Sbaraini, Nicolau; da Paixão, Rana Louise de Andrade; Broetto, Leonardo; Landell, Melissa; Santi, Lucélia; Beys-da-Silva, Walter Orlando; Silveira, Carolina Pereira; Serrano, Thaiane Rispoli; de Oliveira, Eder Silva; Kmetzsch, Lívia; Vainstein, Marilene Henning; de Vasconcelos, Ana Tereza Ribeiro; Schrank, Augusto

    2014-09-29

    Metarhizium anisopliae is an entomopathogenic fungus used in the biological control of some agricultural insect pests, and efforts are underway to use this fungus in the control of insect-borne human diseases. A large repertoire of proteins must be secreted by M. anisopliae to cope with the various available nutrients as this fungus switches through different lifestyles, i.e., from a saprophytic, to an infectious, to a plant endophytic stage. To further evaluate the predicted secretome of M. anisopliae, we employed genomic and transcriptomic analyses, coupled with phylogenomic analysis, focusing on the identification and characterization of secreted proteins. We determined the M. anisopliae E6 genome sequence and compared this sequence to other entomopathogenic fungi genomes. A robust pipeline was generated to evaluate the predicted secretomes of M. anisopliae and 15 other filamentous fungi, leading to the identification of a core of secreted proteins. Transcriptomic analysis using the tick Rhipicephalus microplus cuticle as an infection model during two periods of infection (48 and 144 h) allowed the identification of several differentially expressed genes. This analysis concluded that a large proportion of the predicted secretome coding genes contained altered transcript levels in the conditions analyzed in this study. In addition, some specific secreted proteins from Metarhizium have an evolutionary history similar to orthologs found in Beauveria/Cordyceps. This similarity suggests that a set of secreted proteins has evolved to participate in entomopathogenicity. The data presented represents an important step to the characterization of the role of secreted proteins in the virulence and pathogenicity of M. anisopliae.

  15. Comparative genomics of 9 novel Paenibacillus larvae bacteriophages

    PubMed Central

    Stamereilers, Casey; LeBlanc, Lucy; Yost, Diane; Amy, Penny S.; Tsourkas, Philippos K.

    2016-01-01

    American Foulbrood Disease, caused by the bacterium Paenibacillus larvae, is one of the most destructive diseases of the honeybee, Apis mellifera. Our group recently published the sequences of 9 new phages with the ability to infect and lyse P. larvae. Here, we characterize the genomes of these P. larvae phages, compare them to each other and to other sequenced P. larvae phages, and putatively identify protein function. The phage genomes are 38–45 kb in size and contain 68–86 genes, most of which appear to be unique to P. larvae phages. We classify P. larvae phages into 2 main clusters and one singleton based on nucleotide sequence identity. Three of the new phages show sequence similarity to other sequenced P. larvae phages, while the remaining 6 do not. We identified functions for roughly half of the P. larvae phage proteins, including structural, assembly, host lysis, DNA replication/metabolism, regulatory, and host-related functions. Structural and assembly proteins are highly conserved among our phages and are located at the start of the genome. DNA replication/metabolism, regulatory, and host-related proteins are located in the middle and end of the genome, and are not conserved, with many of these genes found in some of our phages but not others. All nine phages code for a conserved N-acetylmuramoyl-L-alanine amidase. Comparative analysis showed the phages use the “cohesive ends with 3′ overhang” DNA packaging strategy. This work is the first in-depth study of P. larvae phage genomics, and serves as a marker for future work in this area. PMID:27738559

  16. Comparative Genomics of Listeria Sensu Lato: Genus-Wide Differences in Evolutionary Dynamics and the Progressive Gain of Complex, Potentially Pathogenicity-Related Traits through Lateral Gene Transfer.

    PubMed

    Chiara, Matteo; Caruso, Marta; D'Erchia, Anna Maria; Manzari, Caterina; Fraccalvieri, Rosa; Goffredo, Elisa; Latorre, Laura; Miccolupo, Angela; Padalino, Iolanda; Santagada, Gianfranco; Chiocco, Doriano; Pesole, Graziano; Horner, David S; Parisi, Antonio

    2015-07-15

    Historically, genome-wide and molecular characterization of the genus Listeria has concentrated on the important human pathogen Listeria monocytogenes and a small number of closely related species, together termed Listeria sensu strictu. More recently, a number of genome sequences for more basal, and nonpathogenic, members of the Listeria genus have become available, facilitating a wider perspective on the evolution of pathogenicity and genome level evolutionary dynamics within the entire genus (termed Listeria sensu lato). Here, we have sequenced the genomes of additional Listeria fleischmannii and Listeria newyorkensis isolates and explored the dynamics of genome evolution in Listeria sensu lato. Our analyses suggest that acquisition of genetic material through gene duplication and divergence as well as through lateral gene transfer (mostly from outside Listeria) is widespread throughout the genus. Novel genetic material is apparently subject to rapid turnover. Multiple lines of evidence point to significant differences in evolutionary dynamics between the most basal Listeria subclade and all other congeners, including both sensu strictu and other sensu lato isolates. Strikingly, these differences are likely attributable to stochastic, population-level processes and contribute to observed variation in genome size across the genus. Notably, our analyses indicate that the common ancestor of Listeria sensu lato lacked flagella, which were acquired by lateral gene transfer by a common ancestor of Listeria grayi and Listeria sensu strictu, whereas a recently functionally characterized pathogenicity island, responsible for the capacity to produce cobalamin and utilize ethanolamine/propane-2-diol, was acquired in an ancestor of Listeria sensu strictu. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. Biology and Genomics of an Historic Therapeutic Escherichia coli Bacteriophage Collection.

    PubMed

    Baig, Abiyad; Colom, Joan; Barrow, Paul; Schouler, Catherine; Moodley, Arshnee; Lavigne, Rob; Atterbury, Robert

    2017-01-01

    We have performed microbiological and genomic characterization of an historic collection of nine bacteriophages, specifically infecting a K1 E. coli O18:K1:H7 ColV+ strain. These phages were isolated from sewage and tested for their efficacy in vivo for the treatment of systemic E. coli infection in a mouse infection model by Smith and Huggins (1982). The aim of the study was to identify common microbiological and genomic characteristics that correlate with the performance of these phages in the in vivo study. These features will allow an informed selection of phages for use as therapeutic agents. Transmission electron microscopy showed that six of the nine phages were Podoviridae and the remaining three were Siphoviridae. The four best performing phages in vivo belonged to the Podoviridae family. In vitro, these phages exhibited very short latent and rise periods in our study. In agreement with their microbiological profiles, characterization by genome sequencing showed that all six podoviruses belong to the Autographivirinae subfamily. Of these, four were isolates of the same species (99% identity), whereas two had divergent genomes compared to other podoviruses. The Siphoviridae phages, which were moderate to poor performers in vivo, exhibited longer latent and rise periods in vitro. Two of the three siphoviruses were closely related to each other (99% identity), but all can be associated with the Guernseyvirinae subfamily. Genome sequence comparison of both types of phages showed that a gene encoding a DNA-dependent RNA polymerase was only present in phages with a faster replication cycle, which may account for their better performance in vivo. These data define a combination of microbiological, genomic and in vivo characteristics which allow a more rational evaluation of the original in vivo data and pave the way for the selection of phages for future phage therapy trials.

  18. A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

    PubMed

    Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

    2006-04-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.

  19. A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

    PubMed Central

    Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

    2006-01-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031

  20. The draft genome of a socially polymorphic halictid bee, Lasioglossum albipes

    PubMed Central

    2013-01-01

    Background Taxa that harbor natural phenotypic variation are ideal for ecological genomic approaches aimed at understanding how the interplay between genetic and environmental factors can lead to the evolution of complex traits. Lasioglossum albipes is a polymorphic halictid bee that expresses variation in social behavior among populations, and common-garden experiments have suggested that this variation is likely to have a genetic component. Results We present the L. albipes genome assembly to characterize the genetic and ecological factors associated with the evolution of social behavior. The de novo assembly is comparable to other published social insect genomes, with an N50 scaffold length of 602 kb. Gene families unique to L. albipes are associated with integrin-mediated signaling and DNA-binding domains, and several appear to be expanded in this species, including the glutathione-s-transferases and the inositol monophosphatases. L. albipes has an intact DNA methylation system, and in silico analyses suggest that methylation occurs primarily in exons. Comparisons to other insect genomes indicate that genes associated with metabolism and nucleotide binding undergo accelerated evolution in the halictid lineage. Whole-genome resequencing data from one solitary and one social L. albipes female identify six genes that appear to be rapidly diverging between social forms, including a putative odorant receptor and a cuticular protein. Conclusions L. albipes represents a novel genetic model system for understanding the evolution of social behavior. It represents the first published genome sequence of a primitively social insect, thereby facilitating comparative genomic studies across the Hymenoptera as a whole. PMID:24359881

  1. Comparative Genomics of Flatworms (Platyhelminthes) Reveals Shared Genomic Features of Ecto- and Endoparasitic Neodermata

    PubMed Central

    Hahn, Christoph; Fromm, Bastian; Bachmann, Lutz

    2014-01-01

    The ectoparasitic Monogenea comprise a major part of the obligate parasitic flatworm diversity. Although genomic adaptations to parasitism have been studied in the endoparasitic tapeworms (Cestoda) and flukes (Trematoda), no representative of the Monogenea has been investigated yet. We present the high-quality draft genome of Gyrodactylus salaris, an economically important monogenean ectoparasite of wild Atlantic salmon (Salmo salar). A total of 15,488 gene models were identified, of which 7,102 were functionally annotated. The controversial phylogenetic relationships within the obligate parasitic Neodermata were resolved in a phylogenomic analysis using 1,719 gene models (alignment length of >500,000 amino acids) for a set of 16 metazoan taxa. The Monogenea were found basal to the Cestoda and Trematoda, which implies ectoparasitism being plesiomorphic within the Neodermata and strongly supports a common origin of complex life cycles. Comparative analysis of seven parasitic flatworm genomes identified shared genomic features for the ecto- and endoparasitic lineages, such as a substantial reduction of the core bilaterian gene complement, including the homeodomain-containing genes, and a loss of the piwi and vasa genes, which are considered essential for animal development. Furthermore, the shared loss of functional fatty acid biosynthesis pathways and the absence of peroxisomes, the latter organelles presumed ubiquitous in eukaryotes except for parasitic protozoans, were inferred. The draft genome of G. salaris opens for future in-depth analyses of pathogenicity and host specificity of poorly characterized G. salaris strains, and will enhance studies addressing the genomics of host–parasite interactions and speciation in the highly diverse monogenean flatworms. PMID:24732282

  2. Comparative genome-wide analysis reveals that Burkholderia contaminans MS14 possesses multiple antimicrobial biosynthesis genes but not major genetic loci required for pathogenesis.

    PubMed

    Deng, Peng; Wang, Xiaoqiang; Baird, Sonya M; Showmaker, Kurt C; Smith, Leif; Peterson, Daniel G; Lu, Shien

    2016-06-01

    Burkholderia contaminans MS14 shows significant antimicrobial activities against plant and animal pathogenic fungi and bacteria. The antifungal agent occidiofungin produced by MS14 has great potential for development of biopesticides and pharmaceutical drugs. However, the use of Burkholderia species as biocontrol agent in agriculture is restricted due to the difficulties in distinguishing between plant growth-promoting bacteria and the pathogenic bacteria. The complete MS14 genome was sequenced and analyzed to find what beneficial and virulence-related genes it harbors. The phylogenetic relatedness of B. contaminans MS14 and other 17 Burkholderia species was also analyzed. To research MS14's potential virulence, the gene regions related to the antibiotic production, antibiotic resistance, and virulence were compared between MS14 and other Burkholderia genomes. The genome of B. contaminans MS14 was sequenced and annotated. The genomic analyses reveal the presence of multiple gene sets for antimicrobial biosynthesis, which contribute to its antimicrobial activities. BLAST results indicate that the MS14 genome harbors a large number of unique regions. MS14 is closely related to another plant growth-promoting Burkholderia strain B. lata 383 according to the average nucleotide identity data. Moreover, according to the phylogenetic analysis, plant growth-promoting species isolated from soils and mammalian pathogenic species are clustered together, respectively. MS14 has multiple antimicrobial activity-related genes identified from the genome, but it lacks key virulence-related gene loci found in the pathogenic strains. Additionally, plant growth-promoting Burkholderia species have one or more antimicrobial biosynthesis genes in their genomes as compared with nonplant growth-promoting soil-isolated Burkholderia species. On the other hand, pathogenic species harbor multiple virulence-associated gene loci that are not present in nonpathogenic Burkholderia species. 
    The MS14 genome as well as the genomes of other Burkholderia species show considerable diversity. Multiple antimicrobial agent biosynthesis genes were identified in the genomes of plant growth-promoting species of Burkholderia. In addition, compared to nonpathogenic Burkholderia species, pathogenic Burkholderia species have more characterized homologs of the gene loci known to contribute to pathogenicity and virulence to plants and animals. © 2016 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.

  3. Plastome Sequence Determination and Comparative Analysis for Members of the Lolium-Festuca Grass Species Complex

    PubMed Central

    Hand, Melanie L.; Spangenberg, German C.; Forster, John W.; Cogan, Noel O. I.

    2013-01-01

    Chloroplast genome sequences are of broad significance in plant biology, due to frequent use in molecular phylogenetics, comparative genomics, population genetics, and genetic modification studies. The present study used a second-generation sequencing approach to determine and assemble the plastid genomes (plastomes) of four representatives from the agriculturally important Lolium-Festuca species complex of pasture grasses (Lolium multiflorum, Festuca pratensis, Festuca altissima, and Festuca ovina). Total cellular DNA was extracted from either roots or leaves, was sequenced, and the output was filtered for plastome-related reads. A comparison between sources revealed fewer plastome-related reads from root-derived template but an increase in incidental bacterium-derived sequences. Plastome assembly and annotation indicated high levels of sequence identity and a conserved organization and gene content between species. However, frequent deletions within the F. ovina plastome appeared to contribute to a smaller plastid genome size. Comparative analysis with complete plastome sequences from other members of the Poaceae confirmed conservation of most grass-specific features. Detailed analysis of the rbcL–psaI intergenic region, however, revealed a “hot-spot” of variation characterized by independent deletion events. The evolutionary implications of this observation are discussed. The complete plastome sequences are anticipated to provide the basis for potential organelle-specific genetic modification of pasture grasses. PMID:23550121

  4. Transposable element evolution in Heliconius suggests genome diversity within Lepidoptera

    PubMed Central

    2013-01-01

    Background Transposable elements (TEs) have the potential to impact genome structure, function and evolution in profound ways. In order to understand the contribution of TEs to Heliconius melpomene, we queried the H. melpomene draft sequence to identify repetitive sequences. Results We determined that TEs comprise ~25% of the genome. The predominant class of TEs (~12% of the genome) was the non-long terminal repeat (non-LTR) retrotransposons, including a novel SINE family. However, this was only slightly higher than content derived from DNA transposons, which are diverse, with several families having mobilized in the recent past. Compared to the only other well-studied lepidopteran genome, Bombyx mori, H. melpomene exhibits a higher DNA transposon content and a distinct repertoire of retrotransposons. We also found that H. melpomene exhibits a high rate of TE turnover with few older elements accumulating in the genome. Conclusions Our analysis represents the first complete, de novo characterization of TE content in a butterfly genome and suggests that, while TEs are able to invade and multiply, TEs have an overall deleterious effect and/or that maintaining a small genome is advantageous. Our results also hint that analysis of additional lepidopteran genomes will reveal substantial TE diversity within the group. PMID:24088337

  5. Insights into the Evolution of Mitochondrial Genome Size from Complete Sequences of Citrullus lanatus and Cucurbita pepo (Cucurbitaceae)

    PubMed Central

    Alverson, Andrew J.; Wei, XiaoXin; Rice, Danny W.; Stern, David B.; Barry, Kerrie; Palmer, Jeffrey D.

    2010-01-01

    The mitochondrial genomes of seed plants are unusually large and vary in size by at least an order of magnitude. Much of this variation occurs within a single family, the Cucurbitaceae, whose genomes range from an estimated 390 to 2,900 kb in size. We sequenced the mitochondrial genomes of Citrullus lanatus (watermelon: 379,236 nt) and Cucurbita pepo (zucchini: 982,833 nt)—the two smallest characterized cucurbit mitochondrial genomes—and determined their RNA editing content. The relatively compact Citrullus mitochondrial genome actually contains more and longer genes and introns, longer segmental duplications, and more discernibly nuclear-derived DNA. The large size of the Cucurbita mitochondrial genome reflects the accumulation of unprecedented amounts of both chloroplast sequences (>113 kb) and short repeated sequences (>370 kb). A low mutation rate has been hypothesized to underlie increases in both genome size and RNA editing frequency in plant mitochondria. However, despite its much larger genome, Cucurbita has a significantly higher synonymous substitution rate (and presumably mutation rate) than Citrullus but comparable levels of RNA editing. The evolution of mutation rate, genome size, and RNA editing are apparently decoupled in Cucurbitaceae, reflecting either simple stochastic variation or governance by different factors. PMID:20118192

  6. Complete mitochondrial genome sequence of a phytophagous ladybird beetle, Henosepilachna pusillanima (Mulsant) (Coleoptera: Coccinellidae).

    PubMed

    Behere, G T; Firake, D M; Tay, W T; Azad Thakur, N S; Ngachan, S V

    2016-01-01

    Ladybird beetles are generally considered agriculturally beneficial insects, but the ladybird beetles in the coleopteran subfamily Epilachninae are phytophagous and include major plant-feeding pest species that cause severe economic losses to cucurbitaceous and solanaceous crops. Henosepilachna pusillanima (Mulsant) is one of the important pest species of ladybird beetle. In this report, we sequenced and characterized the complete mitochondrial genome of H. pusillanima. For sequencing of the complete mitochondrial genome, we used the Ion Torrent sequencing platform. The complete circular mitochondrial genome of H. pusillanima was determined to be 16,216 bp long. There were 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and a control (A + T-rich) region estimated to be 1690 bp. The gene arrangement and orientations of the assembled mitogenome were identical to those of the reported predatory ladybird beetle Coccinella septempunctata L. This is the first completely sequenced coleopteran mitochondrial genome from the beetle subfamily Epilachninae from India. Data generated in this study will benefit future comparative genomics studies for understanding the evolutionary relationships between predatory and phytophagous coccinellid beetles.

  7. Genome Content and Phylogenomics Reveal both Ancestral and Lateral Evolutionary Pathways in Plant-Pathogenic Streptomyces Species

    PubMed Central

    Huguet-Tapia, Jose C.; Lefebure, Tristan; Badger, Jonathan H.; Guan, Dongli; Stanhope, Michael J.

    2016-01-01

    Streptomyces spp. are highly differentiated actinomycetes with large, linear chromosomes that encode an arsenal of biologically active molecules and catabolic enzymes. Members of this genus are well equipped for life in nutrient-limited environments and are common soil saprophytes. Out of the hundreds of species in the genus Streptomyces, a small group has evolved the ability to infect plants. The recent availability of Streptomyces genome sequences, including four genomes of pathogenic species, provided an opportunity to characterize the gene content specific to these pathogens and to study phylogenetic relationships among them. Genome sequencing, comparative genomics, and phylogenetic analysis enabled us to discriminate pathogenic from saprophytic Streptomyces strains; moreover, we calculated that the pathogen-specific genome contains 4,662 orthologs. Phylogenetic reconstruction suggested that Streptomyces scabies and S. ipomoeae share an ancestor but that their biosynthetic clusters encoding the required virulence factor thaxtomin have diverged. In contrast, S. turgidiscabies and S. acidiscabies, two relatively unrelated pathogens, possess highly similar thaxtomin biosynthesis clusters, which suggests that the acquisition of these genes was through lateral gene transfer. PMID:26826232

  8. Genome analysis of medicinal Ganoderma spp. with plant-pathogenic and saprotrophic life-styles.

    PubMed

    Kües, Ursula; Nelson, David R; Liu, Chang; Yu, Guo-Jun; Zhang, Jianhui; Li, Jianqin; Wang, Xin-Cun; Sun, Hui

    2015-06-01

    Ganoderma is a fungal genus belonging to the Ganodermataceae family and Polyporales order. Plant-pathogenic species in this genus can cause severe diseases (stem, butt, and root rot) in economically important trees and perennial crops, especially in tropical countries. Ganoderma species are white rot fungi and have ecological importance in the breakdown of woody plants for nutrient mobilization. They possess effective machineries of lignocellulose-decomposing enzymes useful for bioenergy production and bioremediation. In addition, the genus contains many important species that produce pharmacologically active compounds used in health food and medicine. With the rapid adoption of next-generation DNA sequencing technologies, whole genome sequencing and systematic transcriptome analyses become affordable approaches to identify an organism's genes. In the last few years, numerous projects have been initiated to identify the genetic contents of several Ganoderma species, particularly in different strains of Ganoderma lucidum. In November 2013, eleven whole genome sequencing projects for Ganoderma species were registered in international databases, three of which were already completed with genomes being assembled to high quality. In addition to the nuclear genome, two mitochondrial genomes for Ganoderma species have also been reported. Complementing genome analysis, four transcriptome studies on various developmental stages of Ganoderma species have been performed. Information obtained from these studies has laid the foundation for the identification of genes involved in biological pathways that are critical for understanding the biology of Ganoderma, such as the mechanism of pathogenesis, the biosynthesis of active components, life cycle and cellular development, etc. 
    With abundant genetic information becoming available, a few centralized resources have been established to disseminate the knowledge and integrate relevant data to support comparative genomic analyses of Ganoderma species. The current review carries out a detailed comparison of the nuclear genomes, mitochondrial genomes and transcriptomes from several Ganoderma species. Genes involved in biosynthetic pathways such as CYP450 genes and in cellular development such as matA and matB genes are characterized and compared in detail, as examples to demonstrate the usefulness of comparative genomic analyses for the identification of critical genes. Resources needed for future data integration and exploitation are also discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Local admixture of amplified and diversified secreted pathogenesis determinants shapes mosaic Toxoplasma gondii genomes

    PubMed Central

    Lorenzi, Hernan; Khan, Asis; Behnke, Michael S.; Namasivayam, Sivaranjani; Swapna, Lakshmipuram S.; Hadjithomas, Michalis; Karamycheva, Svetlana; Pinney, Deborah; Brunk, Brian P.; Ajioka, James W.; Ajzenberg, Daniel; Boothroyd, John C.; Boyle, Jon P.; Dardé, Marie L.; Diaz-Miranda, Maria A.; Dubey, Jitender P.; Fritz, Heather M.; Gennari, Solange M.; Gregory, Brian D.; Kim, Kami; Saeij, Jeroen P. J.; Su, Chunlei; White, Michael W.; Zhu, Xing-Quan; Howe, Daniel K.; Rosenthal, Benjamin M.; Grigg, Michael E.; Parkinson, John; Liu, Liang; Kissinger, Jessica C.; Roos, David S.; Sibley, L. David

    2016-01-01

    Toxoplasma gondii is among the most prevalent parasites worldwide, infecting many wild and domestic animals and causing zoonotic infections in humans. T. gondii differs substantially in its broad distribution from closely related parasites that typically have narrow, specialized host ranges. To elucidate the genetic basis for these differences, we compared the genomes of 62 globally distributed T. gondii isolates to several closely related coccidian parasites. Our findings reveal that tandem amplification and diversification of secretory pathogenesis determinants is the primary feature that distinguishes the closely related genomes of these biologically diverse parasites. We further show that the unusual population structure of T. gondii is characterized by clade-specific inheritance of large conserved haploblocks that are significantly enriched in tandemly clustered secretory pathogenesis determinants. The shared inheritance of these conserved haploblocks, which show a different ancestry than the genome as a whole, may thus influence transmission, host range and pathogenicity. PMID:26738725

  10. RatMap--rat genome tools and data.

    PubMed

    Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M; Ståhl, Fredrik

    2005-01-01

    The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB-Genetics at Göteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided.

  11. RatMap—rat genome tools and data

    PubMed Central

    Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M.; Ståhl, Fredrik

    2005-01-01

    The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB–Genetics at Göteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided. PMID:15608244

  12. An Overlooked Paleotetraploidization in Cucurbitaceae

    PubMed Central

    Wang, Jinpeng; Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yang, Nanshan; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Xia, Ruiyan; Liu, Xiaojian; Ge, Dongcen; Luo, Sainan; Liu, Yinmeng; Kong, Youting; Cui, Xiaobo; Lei, Tianyu; Wang, Li; Wang, Zhenyi; Ge, Weina; Zhang, Lan; Song, Xiaoming; Yuan, Min; Guo, Di; Jin, Dianchuan; Chen, Wei; Pan, Yuxin; Liu, Tao; Yang, Guixian; Xiao, Yue; Sun, Jinshuai; Zhang, Cong; Li, Zhibo; Xu, Haiqing; Duan, Xueqian; Shen, Shaoqi; Zhang, Zhonghua; Huang, Sanwen; Wang, Xiyin

    2018-01-01

    Cucurbitaceae plants are of considerable biological and economic importance, and genomes of cucumber, watermelon, and melon have been sequenced. However, a comparative genomics exploration of their genome structures and evolution has not been available. Here, we aimed at performing a hierarchical inference of genomic homology resulting from recursive paleopolyploidizations. Unexpectedly, we found that, shortly after a core-eudicot-common hexaploidy, a cucurbit-common tetraploidization (CCT) occurred, overlooked by previous reports. Moreover, we characterized gene loss (and retention) after these respective events, which were significantly unbalanced between inferred subgenomes, and between plants after their split. The inference of a dominant subgenome and a sensitive one suggested an allotetraploid nature of the CCT. Besides, we found divergent evolutionary rates among cucurbits, and after doing rate correction, we dated the CCT to be 90–102 Ma, likely common to all Cucurbitaceae plants, showing its important role in the establishment of the plant family. PMID:29029269

  13. The Draft Genome Sequence of a Novel High-Efficient Butanol-Producing Bacterium Clostridium Diolis Strain WST.

    PubMed

    Chen, Chaoyang; Sun, Chongran; Wu, Yi-Rui

    2018-03-21

    A wild-type solventogenic strain, Clostridium diolis WST, isolated from mangrove sediments, was found to produce high amounts of butanol and acetone with negligible levels of ethanol and acids from glucose via a unique acetone-butanol (AB) fermentation pathway. Genomic sequencing yielded an assembled draft genome of 5.85 Mb with a GC content of 29.69%, containing 5263 genes, of which 5049 were annotated as protein-coding sequences. Among these annotated genes, the butanol dehydrogenase gene (bdh) was found in greater abundance in strain WST than in other clostridial strains, which is positively related to its highly efficient production of butanol. We therefore present a draft genome sequence analysis of strain WST that should facilitate further understanding of the solventogenic mechanism of this unusual microorganism.

  14. Final Technical Report for Award # ER64999

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Metcalf, William W.

    2014-10-08

    This report provides a summary of activities for Award # ER64999, a Genomes to Life Project funded by the Office of Science, Basic Energy Research. The project was entitled "Methanogenic archaea and the global carbon cycle: a systems biology approach to the study of Methanosarcina species". The long-term goal of this multi-investigator project was the creation of integrated, multiscale models that accurately and quantitatively predict the role of Methanosarcina species in the global carbon cycle under dynamic environmental conditions. To achieve these goals we pursued four specific aims: (1) genome sequencing of numerous members of the Order Methanosarcinales, (2) identification of genomic sources of phenotypic variation through in silico comparative genomics, (3) elucidation of the transcriptional networks of two Methanosarcina species, and (4) development of comprehensive metabolic network models for characterized strains to address the question of how metabolic models scale with genetic distance.

  15. Novel phage group infecting Lactobacillus delbrueckii subsp. lactis, as revealed by genomic and proteomic analysis of bacteriophage Ldl1.

    PubMed

    Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio; van Sinderen, Douwe

    2015-02-01

    Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 +/- 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species.

  16. The Genome of a Tortoise Herpesvirus (Testudinid Herpesvirus 3) Has a Novel Structure and Contains a Large Region That Is Not Required for Replication In Vitro or Virulence In Vivo

    PubMed Central

    Gandar, Frédéric; Wilkie, Gavin S.; Gatherer, Derek; Kerr, Karen; Marlier, Didier; Diez, Marianne; Marschang, Rachel E.; Mast, Jan; Dewals, Benjamin G.

    2015-01-01

    Testudinid herpesvirus 3 (TeHV-3) is the causative agent of a lethal disease affecting several tortoise species. The threat that this virus poses to endangered animals is focusing efforts on characterizing its properties, in order to enable the development of prophylactic methods. We have sequenced the genomes of the two most studied TeHV-3 strains (1976 and 4295). TeHV-3 strain 1976 has a novel genome structure and is most closely related to a turtle herpesvirus, thus supporting its classification into genus Scutavirus, subfamily Alphaherpesvirinae, family Herpesviridae. The sequence of strain 1976 also revealed viral counterparts of cellular interleukin-10 and semaphorin, which have not been described previously in members of subfamily Alphaherpesvirinae. TeHV-3 strain 4295 is a mixture of three forms (m1, m2, and M), in which, in comparison to strain 1976, the genomes exhibit large, partially overlapping deletions of 12.5 to 22.4 kb. Viral subclones representing these forms were isolated by limiting dilution assays, and each replicated in cell culture comparably to strain 1976. With the goal of testing the potential of the three forms as attenuated vaccine candidates, strain 4295 was inoculated intranasally into Hermann's tortoises (Testudo hermanni). All inoculated subjects died, and PCR analyses demonstrated the ability of the m2 and M forms to spread and invade the brain. In contrast, the m1 form was detected in none of the organs tested, suggesting its potential as the basis of an attenuated vaccine candidate. Our findings represent a major step toward characterizing TeHV-3 and developing prophylactic methods against it.
    IMPORTANCE Testudinid herpesvirus 3 (TeHV-3) causes a lethal disease in tortoises, several species of which are endangered. We have characterized the viral genome and used this information to take steps toward developing an attenuated vaccine. We have sequenced the genomes of two strains (1976 and 4295), compared their growth in vitro, and investigated the pathogenesis of strain 4295, which consists of three deletion mutants. The major findings are that (i) TeHV-3 has a novel genome structure, (ii) its closest relative is a turtle herpesvirus, (iii) it contains interleukin-10 and semaphorin genes (the first time these have been reported in an alphaherpesvirus), (iv) a sizeable region of the genome is not required for viral replication in vitro or virulence in vivo, and (v) one of the components of strain 4295, which has a deletion of 22.4 kb, exhibits properties indicating that it may serve as the starting point for an attenuated vaccine. PMID:26339050

  17. Isolation and characterization of vB_ArS-ArV2 - first Arthrobacter sp. infecting bacteriophage with completely sequenced genome.

    PubMed

    Šimoliūnas, Eugenijus; Kaliniene, Laura; Stasilo, Miroslav; Truncaitė, Lidija; Zajančkauskaitė, Aurelija; Staniulis, Juozas; Nainys, Juozas; Kaupinis, Algirdas; Valius, Mindaugas; Meškys, Rolandas

    2014-01-01

    This is the first report on a complete genome sequence and biological characterization of the phage that infects Arthrobacter. A novel virus vB_ArS-ArV2 (ArV2) was isolated from soil using Arthrobacter sp. 68b strain for phage propagation. Based on transmission electron microscopy, ArV2 belongs to the family Siphoviridae and has an isometric head (∼63 nm in diameter) with a non-contractile flexible tail (∼194×10 nm) and six short tail fibers. ArV2 possesses a linear, double-stranded DNA genome (37,372 bp) with a G+C content of 62.73%. The genome contains 68 ORFs yet encodes no tRNA genes. A total of 28 ArV2 ORFs have no known functions and lack any reliable database matches. Proteomic analysis led to the experimental identification of 14 virion proteins, including 9 that were predicted by bioinformatics approaches. Comparative phylogenetic analysis, based on the amino acid sequence alignment of conserved proteins, set ArV2 apart from other siphoviruses. The data presented here will help to advance our understanding of Arthrobacter phage population and will extend our knowledge about the interaction between this particular host and its phages.

  18. Genome-wide mapping reveals single-origin chromosome replication in Leishmania, a eukaryotic microbe.

    PubMed

    Marques, Catarina A; Dickens, Nicholas J; Paape, Daniel; Campbell, Samantha J; McCulloch, Richard

    2015-10-19

    DNA replication initiates on defined genome sites, termed origins. Origin usage appears to follow common rules in the eukaryotic organisms examined to date: all chromosomes are replicated from multiple origins, which display variations in firing efficiency and are selected from a larger pool of potential origins. To ask if these features of DNA replication are true of all eukaryotes, we describe genome-wide origin mapping in the parasite Leishmania. Origin mapping in Leishmania suggests a striking divergence in origin usage relative to characterized eukaryotes, since each chromosome appears to be replicated from a single origin. By comparing two species of Leishmania, we find evidence that such origin singularity is maintained in the face of chromosome fusion or fission events during evolution. Mapping Leishmania origins suggests that all origins fire with equal efficiency, and that the genomic sites occupied by origins differ from related non-origin sites. Finally, we provide evidence that origin location in Leishmania displays striking conservation with Trypanosoma brucei, despite the latter parasite replicating its chromosomes from multiple, variable strength origins. The demonstration of chromosome replication from a single origin in Leishmania, a microbial eukaryote, has implications for the evolution of origin multiplicity and associated controls, and may explain the pervasive aneuploidy that characterizes Leishmania chromosome architecture.

  19. Ninety-nine de novo assembled genomes from the moose (Alces alces) rumen microbiome provide new insights into microbial plant biomass degradation.

    PubMed

    Svartström, Olov; Alneberg, Johannes; Terrapon, Nicolas; Lombard, Vincent; de Bruijn, Ino; Malmsten, Jonas; Dalin, Ann-Marie; El Muller, Emilie; Shah, Pranjul; Wilmes, Paul; Henrissat, Bernard; Aspeborg, Henrik; Andersson, Anders F

    2017-11-01

    The moose (Alces alces) is a ruminant that harvests energy from fiber-rich lignocellulose material through carbohydrate-active enzymes (CAZymes) produced by its rumen microbes. We applied shotgun metagenomics to rumen contents from six moose to obtain insights into this microbiome. Following binning, 99 metagenome-assembled genomes (MAGs) belonging to 11 prokaryotic phyla were reconstructed and characterized based on phylogeny and CAZyme profile. The taxonomy of these MAGs reflected the overall composition of the metagenome, with dominance of the phyla Bacteroidetes and Firmicutes. Unlike in other ruminants, Spirochaetes constituted a significant proportion of the community and our analyses indicate that the corresponding strains are primarily pectin digesters. Pectin-degrading genes were also common in MAGs of Ruminococcus, Fibrobacteres and Bacteroidetes and were overall overrepresented in the moose microbiome compared with other ruminants. Phylogenomic analyses revealed several clades within the Bacteroidetes without previously characterized genomes. Several of these MAGs encoded large numbers of dockerins, a module usually associated with cellulosomes. The Bacteroidetes dockerins were often linked to CAZymes and sometimes encoded inside polysaccharide utilization loci, which has never been reported before. The almost 100 CAZyme-annotated genomes reconstructed in this study provide an in-depth view of an efficient lignocellulose-degrading microbiome and prospects for developing enzyme technology for biorefineries.

  20. Clone and genomic repositories at the American Type Culture Collection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maglott, D.R.; Nierman, W.C.

    1990-01-01

    The American Type Culture Collection (ATCC) has a long history of characterizing, preserving, and distributing biological resource materials for the scientific community. Starting in 1925 as a repository for standard bacterial and fungal strains, its collections have diversified with technologic advances and in response to the requirements of its users. To serve the needs of the human genetics community, the National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH), established an international Repository of Human DNA Probes and Libraries at the ATCC in 1985. This repository expanded the existing collections of recombinant clones and libraries at the ATCC, with the specific purposes of (1) obtaining, amplifying, and distributing probes detecting restriction fragment length polymorphisms (RFLPs); (2) obtaining, amplifying, and distributing genomic and cDNA clones from known genes independent of RFLP detection; (3) distributing the chromosome-specific libraries generated by the National Laboratory Gene Library Project at the Lawrence Livermore and Los Alamos National Laboratories; and (4) maintaining a public, online database describing the repository materials. Because it was recognized that animal models and comparative mapping can be crucial to genomic characterization, the scope of the repository was broadened in February 1989 to include probes from the mouse genome.

  1. A First Generation Comparative Chromosome Map between Guinea Pig (Cavia porcellus) and Humans.

    PubMed

    Romanenko, Svetlana A; Perelman, Polina L; Trifonov, Vladimir A; Serdyukova, Natalia A; Li, Tangliang; Fu, Beiyuan; O'Brien, Patricia C M; Ng, Bee L; Nie, Wenhui; Liehr, Thomas; Stanyon, Roscoe; Graphodatsky, Alexander S; Yang, Fengtang

    2015-01-01

    The domesticated guinea pig, Cavia porcellus (Hystricomorpha, Rodentia), is an important laboratory species and a model for a number of human diseases. Nevertheless, genomic tools for this species are lacking; even its karyotype is poorly characterized. The guinea pig belongs to Hystricomorpha, a widespread and important group of rodents; so far the chromosomes of guinea pigs have not been compared with those of other hystricomorph species or with any other mammals. We generated full sets of chromosome-specific painting probes for the guinea pig by flow sorting and microdissection, and for the first time, mapped the chromosomal homologies between guinea pig and human by reciprocal chromosome painting. Our data demonstrate that the guinea pig karyotype has undergone extensive rearrangements: 78 synteny-conserved human autosomal segments were delimited in the guinea pig genome. The high rate of genome evolution in the guinea pig may explain why the HSA7/16 and HSA16/19 associations presumed ancestral for eutherians and the three syntenic associations (HSA1/10, 3/19, and 9/11) considered ancestral for rodents were not found in C. porcellus. The comparative chromosome map presented here is a starting point for further development of physical and genetic maps of the guinea pig as well as an aid for genome assembly assignment to specific chromosomes. Furthermore, the comparative mapping will allow a transfer of gene map data from other species. The probes developed here provide a genomic toolkit, which will make the guinea pig a key species to unravel the evolutionary biology of the Hystricomorph rodents.

  2. A First Generation Comparative Chromosome Map between Guinea Pig (Cavia porcellus) and Humans

    PubMed Central

    Romanenko, Svetlana A.; Perelman, Polina L.; Trifonov, Vladimir A.; Serdyukova, Natalia A.; Li, Tangliang; Fu, Beiyuan; O’Brien, Patricia C. M.; Ng, Bee L.; Nie, Wenhui; Liehr, Thomas; Stanyon, Roscoe; Graphodatsky, Alexander S.; Yang, Fengtang

    2015-01-01

    The domesticated guinea pig, Cavia porcellus (Hystricomorpha, Rodentia), is an important laboratory species and a model for a number of human diseases. Nevertheless, genomic tools for this species are lacking; even its karyotype is poorly characterized. The guinea pig belongs to Hystricomorpha, a widespread and important group of rodents; so far the chromosomes of guinea pigs have not been compared with those of other hystricomorph species or with any other mammals. We generated full sets of chromosome-specific painting probes for the guinea pig by flow sorting and microdissection, and for the first time, mapped the chromosomal homologies between guinea pig and human by reciprocal chromosome painting. Our data demonstrate that the guinea pig karyotype has undergone extensive rearrangements: 78 synteny-conserved human autosomal segments were delimited in the guinea pig genome. The high rate of genome evolution in the guinea pig may explain why the HSA7/16 and HSA16/19 associations presumed ancestral for eutherians and the three syntenic associations (HSA1/10, 3/19, and 9/11) considered ancestral for rodents were not found in C. porcellus. The comparative chromosome map presented here is a starting point for further development of physical and genetic maps of the guinea pig as well as an aid for genome assembly assignment to specific chromosomes. Furthermore, the comparative mapping will allow a transfer of gene map data from other species. The probes developed here provide a genomic toolkit, which will make the guinea pig a key species to unravel the evolutionary biology of the Hystricomorph rodents. PMID:26010445

  3. Cyber infrastructure for Fusarium: three integrated platforms supporting strain identification, phylogenetics, comparative genomics and knowledge sharing.

    PubMed

    Park, Bongsoo; Park, Jongsun; Cheong, Kyeong-Chae; Choi, Jaeyoung; Jung, Kyongyong; Kim, Donghan; Lee, Yong-Hwan; Ward, Todd J; O'Donnell, Kerry; Geiser, David M; Kang, Seogchan

    2011-01-01

    The fungal genus Fusarium includes many plant and/or animal pathogenic species and produces diverse toxins. Although accurate species identification is critical for managing such threats, it is difficult to identify Fusarium morphologically. Fortunately, extensive molecular phylogenetic studies, founded on well-preserved culture collections, have established a robust foundation for Fusarium classification. Genomes of four Fusarium species have been published with more being currently sequenced. The Cyber infrastructure for Fusarium (CiF; http://www.fusariumdb.org/) was built to support archiving and utilization of rapidly increasing data and knowledge and consists of Fusarium-ID, Fusarium Comparative Genomics Platform (FCGP) and Fusarium Community Platform (FCP). The Fusarium-ID archives phylogenetic marker sequences from most known species along with information associated with characterized isolates and supports strain identification and phylogenetic analyses. The FCGP currently archives five genomes from four species. Besides supporting genome browsing and analysis, the FCGP presents computed characteristics of multiple gene families and functional groups. The Cart/Favorite function allows users to collect sequences from Fusarium-ID and the FCGP and analyze them later using multiple tools without requiring repeated copying-and-pasting of sequences. The FCP is designed to serve as an online community forum for sharing and preserving accumulated experience and knowledge to support future research and education.

  4. Comparative genomics of Fructobacillus spp. and Leuconostoc spp. reveals niche-specific evolution of Fructobacillus spp.

    DOE PAGES

    Endo, Akihito; Tanizawa, Yasuhiro; Tanaka, Naoto; ...

    2015-12-29

    Fructobacillus spp. in fructose-rich niches belong to the family Leuconostocaceae. They were originally classified as Leuconostoc spp., but were later grouped into a novel genus, Fructobacillus, based on their phylogenetic position, morphology and specific biochemical characteristics. The unique characters, so-called fructophilic characteristics, had not been reported in the group of lactic acid bacteria, suggesting unique evolution at the genome level. Here we studied four draft genome sequences of Fructobacillus spp. and compared their metabolic properties against those of Leuconostoc spp. As a result, Fructobacillus species possess significantly fewer protein coding sequences in their small genomes. The number of genes was significantly smaller in carbohydrate transport and metabolism. Several other metabolic pathways, including TCA cycle, ubiquinone and other terpenoid-quinone biosynthesis and phosphotransferase systems, were characterized as discriminative pathways between the two genera. The adhE gene for bifunctional acetaldehyde/alcohol dehydrogenase, and genes for subunits of the pyruvate dehydrogenase complex were absent in Fructobacillus spp. The two genera also show different levels of GC contents, which are mainly due to the different GC contents at the third codon position. In conclusion, the present genome characteristics in Fructobacillus spp. suggest reductive evolution that took place to adapt to specific niches.
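
    The GC content at the third codon position (GC3) that the abstract cites can be illustrated with a minimal sketch. It assumes an in-frame coding sequence given as a plain string; the function names gc_fraction and gc3_fraction are hypothetical and are not taken from the study.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide string (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds: str) -> float:
    """GC fraction at the third position of each complete codon.

    Assumes `cds` starts in frame; trailing bases that do not form a
    complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3      # drop an incomplete final codon
    third_positions = cds[2:usable:3]     # every third base, starting at index 2
    return gc_fraction(third_positions)


if __name__ == "__main__":
    example = "ATGGCCAAATAA"  # toy sequence: ATG GCC AAA TAA
    print(round(gc_fraction(example), 3), gc3_fraction(example))
```

    Comparing the two values per gene across genomes is one simple way to see whether a genome-wide GC difference is concentrated at third codon positions.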

  5. Comparative genomics of Fructobacillus spp. and Leuconostoc spp. reveals niche-specific evolution of Fructobacillus spp.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Endo, Akihito; Tanizawa, Yasuhiro; Tanaka, Naoto

    Fructobacillus spp. in fructose-rich niches belong to the family Leuconostocaceae. They were originally classified as Leuconostoc spp., but were later grouped into a novel genus, Fructobacillus, based on their phylogenetic position, morphology and specific biochemical characteristics. The unique characters, so-called fructophilic characteristics, had not been reported in the group of lactic acid bacteria, suggesting unique evolution at the genome level. Here we studied four draft genome sequences of Fructobacillus spp. and compared their metabolic properties against those of Leuconostoc spp. As a result, Fructobacillus species possess significantly fewer protein coding sequences in their small genomes. The number of genes was significantly smaller in carbohydrate transport and metabolism. Several other metabolic pathways, including TCA cycle, ubiquinone and other terpenoid-quinone biosynthesis and phosphotransferase systems, were characterized as discriminative pathways between the two genera. The adhE gene for bifunctional acetaldehyde/alcohol dehydrogenase, and genes for subunits of the pyruvate dehydrogenase complex were absent in Fructobacillus spp. The two genera also show different levels of GC contents, which are mainly due to the different GC contents at the third codon position. In conclusion, the present genome characteristics in Fructobacillus spp. suggest reductive evolution that took place to adapt to specific niches.

  6. The First Genomic and Proteomic Characterization of a Deep-Sea Sulfate Reducer: Insights into the Piezophilic Lifestyle of Desulfovibrio piezophilus

    PubMed Central

    Pradel, Nathalie; Ji, Boyang; Gimenez, Grégory; Talla, Emmanuel; Lenoble, Patricia; Garel, Marc; Tamburini, Christian; Fourquet, Patrick; Lebrun, Régine; Bertin, Philippe; Denis, Yann; Pophillat, Matthieu; Barbe, Valérie; Ollivier, Bernard; Dolla, Alain

    2013-01-01

    Desulfovibrio piezophilus strain C1TLV30T is a piezophilic anaerobe that was isolated from wood falls in the Mediterranean deep-sea. D. piezophilus represents a unique model for studying the adaptation of sulfate-reducing bacteria to hydrostatic pressure. Here, we report the 3.6 Mbp genome sequence of this piezophilic bacterium. An analysis of the genome revealed the presence of seven genomic islands as well as gene clusters that are most likely linked to life at a high hydrostatic pressure. Comparative genomics and differential proteomics identified the transport of solutes and amino acids as well as amino acid metabolism as major cellular processes for the adaptation of this bacterium to hydrostatic pressure. In addition, the proteome profiles showed that the abundance of key enzymes that are involved in sulfate reduction was dependent on hydrostatic pressure. A comparative analysis of orthologs from the non-piezophilic marine bacterium D. salexigens and D. piezophilus identified aspartic acid, glutamic acid, lysine, asparagine, serine and tyrosine as the amino acids preferentially replaced by arginine, histidine, alanine and threonine in the piezophilic strain. This work reveals the adaptation strategies developed by a sulfate reducer to a deep-sea lifestyle. PMID:23383081

  7. A Novel Method for Accurate Operon Predictions in All Sequenced Prokaryotes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Price, Morgan N.; Huang, Katherine H.; Alm, Eric J.

    2004-12-01

    We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85 and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E. coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E. coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes, and find several surprises: H. pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.
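
    The distance-only idea in this abstract can be illustrated with a minimal sketch: adjacent genes on the same strand that are separated by a short intergenic gap are called as belonging to one operon. This is not the authors' method, which tailors the distance model to each genome and adds comparative genomic measures. The fixed 50 bp cutoff, the Gene record, and the function name predict_operon_pairs are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive
    strand: str   # "+" or "-"


def predict_operon_pairs(genes: List[Gene], max_gap: int = 50) -> List[Tuple[str, str]]:
    """Call adjacent same-strand genes an operon pair when the gap is small.

    `max_gap` is an illustrative fixed threshold in base pairs; a real
    predictor would estimate a genome-specific distance distribution.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right.start - left.end - 1   # negative when the genes overlap
        if left.strand == right.strand and gap <= max_gap:
            pairs.append((left.name, right.name))
    return pairs


if __name__ == "__main__":
    toy_genes = [
        Gene("a", 1, 100, "+"),
        Gene("b", 120, 300, "+"),    # 19 bp after a, same strand
        Gene("c", 600, 900, "+"),    # 299 bp after b
        Gene("d", 910, 1000, "-"),   # close to c but on the opposite strand
    ]
    print(predict_operon_pairs(toy_genes))
```

    A single cutoff would not transfer well between genomes with different typical spacings, which is the limitation the abstract points to when it compares genome-tailored predictions with ones trained on E. coli transcripts.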

  8. Comparative genome-based identification of a cell wall-anchored protein from Lactobacillus plantarum increases adhesion of Lactococcus lactis to human epithelial cells

    PubMed Central

    Zhang, Bo; Zuo, Fanglei; Yu, Rui; Zeng, Zhu; Ma, Huiqin; Chen, Shangwu

    2015-01-01

    Adhesion to host cells is considered important for Lactobacillus plantarum as well as other lactic acid bacteria (LAB) to persist in human gut and thus exert probiotic effects. Here, we sequenced the genome of Lt. plantarum strain NL42 originating from a traditional Chinese dairy product, performed comparative genomic analysis and characterized a novel adhesion factor. The genome of NL42 was highly divergent from its closest neighbors, especially in six large genomic regions. NL42 harbors a total of 42 genes encoding adhesion-associated proteins; among them, cwaA encodes a protein containing multiple domains, including five cell wall surface anchor repeat domains and an LPxTG-like cell wall anchor motif. Expression of cwaA in Lactococcus lactis significantly increased its autoaggregation and hydrophobicity, and conferred the new ability to adhere to human colonic epithelial HT-29 cells by targeting cellular surface proteins, and not carbohydrate moieties, for CwaA adhesion. In addition, the recombinant Lc. lactis inhibited adhesion of Staphylococcus aureus and Escherichia coli to HT-29 cells, mainly by exclusion. We conclude that CwaA is a novel adhesion factor in Lt. plantarum and a potential candidate for improving the adhesion ability of probiotics or other bacteria of interest. PMID:26370773

  9. Cyber infrastructure for Fusarium: three integrated platforms supporting strain identification, phylogenetics, comparative genomics and knowledge sharing

    PubMed Central

    Park, Bongsoo; Park, Jongsun; Cheong, Kyeong-Chae; Choi, Jaeyoung; Jung, Kyongyong; Kim, Donghan; Lee, Yong-Hwan; Ward, Todd J.; O'Donnell, Kerry; Geiser, David M.; Kang, Seogchan

    2011-01-01

    The fungal genus Fusarium includes many plant and/or animal pathogenic species and produces diverse toxins. Although accurate species identification is critical for managing such threats, it is difficult to identify Fusarium morphologically. Fortunately, extensive molecular phylogenetic studies, founded on well-preserved culture collections, have established a robust foundation for Fusarium classification. Genomes of four Fusarium species have been published with more being currently sequenced. The Cyber infrastructure for Fusarium (CiF; http://www.fusariumdb.org/) was built to support archiving and utilization of rapidly increasing data and knowledge and consists of Fusarium-ID, Fusarium Comparative Genomics Platform (FCGP) and Fusarium Community Platform (FCP). The Fusarium-ID archives phylogenetic marker sequences from most known species along with information associated with characterized isolates and supports strain identification and phylogenetic analyses. The FCGP currently archives five genomes from four species. Besides supporting genome browsing and analysis, the FCGP presents computed characteristics of multiple gene families and functional groups. The Cart/Favorite function allows users to collect sequences from Fusarium-ID and the FCGP and analyze them later using multiple tools without requiring repeated copying-and-pasting of sequences. The FCP is designed to serve as an online community forum for sharing and preserving accumulated experience and knowledge to support future research and education. PMID:21087991

  10. Comparative genome-based identification of a cell wall-anchored protein from Lactobacillus plantarum increases adhesion of Lactococcus lactis to human epithelial cells.

    PubMed

    Zhang, Bo; Zuo, Fanglei; Yu, Rui; Zeng, Zhu; Ma, Huiqin; Chen, Shangwu

    2015-09-15

    Adhesion to host cells is considered important for Lactobacillus plantarum as well as other lactic acid bacteria (LAB) to persist in human gut and thus exert probiotic effects. Here, we sequenced the genome of Lt. plantarum strain NL42 originating from a traditional Chinese dairy product, performed comparative genomic analysis and characterized a novel adhesion factor. The genome of NL42 was highly divergent from its closest neighbors, especially in six large genomic regions. NL42 harbors a total of 42 genes encoding adhesion-associated proteins; among them, cwaA encodes a protein containing multiple domains, including five cell wall surface anchor repeat domains and an LPxTG-like cell wall anchor motif. Expression of cwaA in Lactococcus lactis significantly increased its autoaggregation and hydrophobicity, and conferred the new ability to adhere to human colonic epithelial HT-29 cells by targeting cellular surface proteins, and not carbohydrate moieties, for CwaA adhesion. In addition, the recombinant Lc. lactis inhibited adhesion of Staphylococcus aureus and Escherichia coli to HT-29 cells, mainly by exclusion. We conclude that CwaA is a novel adhesion factor in Lt. plantarum and a potential candidate for improving the adhesion ability of probiotics or other bacteria of interest.

  11. Metatranscriptomics of N2-fixing cyanobacteria in the Amazon River plume

    PubMed Central

    Hilton, Jason A; Satinsky, Brandon M; Doherty, Mary; Zielinski, Brian; Zehr, Jonathan P

    2015-01-01

    Biological N2 fixation is an important nitrogen source for surface ocean microbial communities. However, nearly all information on the diversity and gene expression of organisms responsible for oceanic N2 fixation in the environment has come from targeted approaches that assay only a small number of genes and organisms. Using genomes of diazotrophic cyanobacteria to extract reads from extensive meta-genomic and -transcriptomic libraries, we examined diazotroph diversity and gene expression from the Amazon River plume, an area characterized by salinity and nutrient gradients. Diazotroph genome and transcript sequences were most abundant in the transitional waters compared with lower salinity or oceanic water masses. We were able to distinguish two genetically divergent phylotypes within the Hemiaulus-associated Richelia sequences, which were the most abundant diazotroph sequences in the data set. Photosystem (PS)-II transcripts in Richelia populations were much less abundant than those in Trichodesmium, and transcripts from several Richelia PS-II genes were absent, indicating a prominent role for cyclic electron transport in Richelia. In addition, there were several abundant regulatory transcripts, including one that targets a gene involved in PS-I cyclic electron transport in Richelia. High sequence coverage of the Richelia transcripts, as well as those from Trichodesmium populations, allowed us to identify expressed regions of the genomes that had been overlooked by genome annotations. High-coverage genomic and transcription analysis enabled the characterization of distinct phylotypes within diazotrophic populations, revealed a distinction in a core process between dominant populations and provided evidence for a prominent role for noncoding RNAs in microbial communities. PMID:25514535

  12. Segmental Duplications and Copy-Number Variation in the Human Genome

    PubMed Central

    Sharp, Andrew J.; Locke, Devin P.; McGrath, Sean D.; Cheng, Ze; Bailey, Jeffrey A.; Vallente, Rhea U.; Pertz, Lisa M.; Clark, Royden A.; Schwartz, Stuart; Segraves, Rick; Oseroff, Vanessa V.; Albertson, Donna G.; Pinkel, Daniel; Eichler, Evan E.

    2005-01-01

    The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders. PMID:15918152

  13. Bacteriophages of Gordonia spp. Display a Spectrum of Diversity and Genetic Relationships.

    PubMed

    Pope, Welkin H; Mavrich, Travis N; Garlena, Rebecca A; Guerrero-Bustamante, Carlos A; Jacobs-Sera, Deborah; Montgomery, Matthew T; Russell, Daniel A; Warner, Marcie H; Hatfull, Graham F

    2017-08-15

    The global bacteriophage population is large, dynamic, old, and highly diverse genetically. Many phages are tailed and contain double-stranded DNA, but these remain poorly characterized genomically. A collection of over 1,000 phages infecting Mycobacterium smegmatis reveals the diversity of phages of a common bacterial host, but their relationships to phages of phylogenetically proximal hosts are not known. Comparative sequence analysis of 79 phages isolated on Gordonia shows these also to be diverse and that the phages can be grouped into 14 clusters of related genomes, with an additional 14 phages that are "singletons" with no closely related genomes. One group of six phages is closely related to Cluster A mycobacteriophages, but the other Gordonia phages are distant relatives and share only 10% of their genes with the mycobacteriophages. The Gordonia phage genomes vary in genome length (17.1 to 103.4 kb), percentage of GC content (47 to 68.8%), and genome architecture and contain a variety of features not seen in other phage genomes. Like the mycobacteriophages, the highly mosaic Gordonia phages demonstrate a spectrum of genetic relationships. We show this is a general property of bacteriophages and suggest that any barriers to genetic exchange are soft and readily violable. IMPORTANCE Despite the numerical dominance of bacteriophages in the biosphere, there is a dearth of complete genomic sequences. Current genomic information reveals that phages are highly diverse genomically and have mosaic architectures formed by extensive horizontal genetic exchange. Comparative analysis of 79 phages of Gordonia shows them to not only be highly diverse, but to present a spectrum of relatedness. Most are distantly related to phages of the phylogenetically proximal host Mycobacterium smegmatis, although one group of Gordonia phages is more closely related to mycobacteriophages than to the other Gordonia phages. Phage genome sequence space remains largely unexplored, but further isolation and genomic comparison of phages targeted at related groups of hosts promise to reveal pathways of bacteriophage evolution. Copyright © 2017 Pope et al.

  14. Comparative sequence analysis revealed altered chromosomal organization and a novel insertion sequence encoding DNA modification and potentially stress-related functions in an Escherichia coli O157:H7 foodborne isolate

    USDA-ARS's Scientific Manuscript database

    We recently described the complete genome of enterohemorrhagic Escherichia coli (EHEC) O157:H7 strain NADC 6564, an isolate of strain 86-24 linked to the 1986 disease outbreak. In the current study, we compared the chromosomal sequence of NADC 6564 to the well-characterized chromosomal sequences of ...

  15. Rapid CRISPR/Cas9-Mediated Cloning of Full-Length Epstein-Barr Virus Genomes from Latently Infected Cells.

    PubMed

    Yajima, Misako; Ikuta, Kazufumi; Kanda, Teru

    2018-04-03

    Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques of Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy comparatively with precedent EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine complete EBV genome sequence including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically.

  16. Genome-wide Annotation and Comparative Analysis of Long Terminal Repeat Retrotransposons between Pear Species of P. bretschneideri and P. communis

    PubMed Central

    Yin, Hao; Du, Jianchang; Wu, Jun; Wei, Shuwei; Xu, Yingxiu; Tao, Shutian; Wu, Juyou; Zhang, Shaoling

    2015-01-01

    Recent sequencing of the Oriental pear (P. bretschneideri Rehd.) genome and the availability of the draft genome sequence of Occidental pear (P. communis L.), has provided a good opportunity to characterize the abundance, distribution, timing, and evolution of long terminal repeat retrotransposons (LTR-RTs) in these two important fruit plants. Here, a total of 7247 LTR-RTs, which can be classified into 148 families, have been identified in the assembled Oriental pear genome. Unlike in other plant genomes, approximately 90% of these elements were found to be randomly distributed along the pear chromosomes. Further analysis revealed that the amplification timeframe of elements varies dramatically in different families, super-families and lineages, and the Copia-like elements have highest activity in the recent 0.5 million years (Mys). The data also showed that two genomes evolved with similar evolutionary rates after their split from the common ancestor ~0.77–1.66 million years ago (Mya). Overall, the data provided here will be a valuable resource for further investigating the impact of transposable elements on gene structure, expression, and epigenetic modification in the pear genomes. PMID:26631625

  17. Rapid CRISPR/Cas9-Mediated Cloning of Full-Length Epstein-Barr Virus Genomes from Latently Infected Cells

    PubMed Central

    Ikuta, Kazufumi; Kanda, Teru

    2018-01-01

    Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques of Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy comparatively with precedent EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine complete EBV genome sequence including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically. PMID:29614006

  18. Tales of diversity: Genomic and morphological characteristics of forty-six Arthrobacter phages

    PubMed Central

    Adair, Tamarah L.; Afram, Patricia; Allen, Katherine G.; Archambault, Megan L.; Aziz, Rahat M.; Bagnasco, Filippa G.; Ball, Sarah L.; Barrett, Natalie A.; Benjamin, Robert C.; Blasi, Christopher J.; Borst, Katherine; Braun, Mary A.; Broomell, Haley; Brown, Conner B.; Brynell, Zachary S.; Bue, Ashley B.; Burke, Sydney O.; Casazza, William; Cautela, Julia A.; Chen, Kevin; Chimalakonda, Nitish S.; Chudoff, Dylan; Connor, Jade A.; Cross, Trevor S.; Curtis, Kyra N.; Dahlke, Jessica A.; Deaton, Bethany M.; Degroote, Sarah J.; DeNigris, Danielle M.; DeRuff, Katherine C.; Dolan, Milan; Dunbar, David; Egan, Marisa S.; Evans, Daniel R.; Fahnestock, Abby K.; Farooq, Amal; Finn, Garrett; Fratus, Christopher R.; Gaffney, Bobby L.; Garlena, Rebecca A.; Garrigan, Kelly E.; Gibbon, Bryan C.; Goedde, Michael A.; Guerrero Bustamante, Carlos A.; Harrison, Melinda; Hartwell, Megan C.; Heckman, Emily L.; Huang, Jennifer; Hughes, Lee E.; Hyduchak, Kathryn M.; Jacob, Aswathi E.; Kaku, Machika; Karstens, Allen W.; Kenna, Margaret A.; Khetarpal, Susheel; King, Rodney A.; Kobokovich, Amanda L.; Kolev, Hannah; Konde, Sai A.; Kriese, Elizabeth; Lamey, Morgan E.; Lantz, Carter N.; Lapin, Jonathan S.; Lawson, Temiloluwa O.; Lee, In Young; Lee, Scott M.; Lee-Soety, Julia Y.; Lehmann, Emily M.; London, Shawn C.; Lopez, A. Javier; Lynch, Kelly C.; Mageeney, Catherine M.; Martynyuk, Tetyana; Mathew, Kevin J.; Mavrich, Travis N.; McDaniel, Christopher M.; McDonald, Hannah; McManus, C. Joel; Medrano, Jessica E.; Mele, Francis E.; Menninger, Jennifer E.; Miller, Sierra N.; Minick, Josephine E.; Nabua, Courtney T.; Napoli, Caroline K.; Nkangabwa, Martha; Oates, Elizabeth A.; Ott, Cassandra T.; Pellerino, Sarah K.; Pinamont, William J.; Pirnie, Ross T.; Pizzorno, Marie C.; Plautz, Emilee J.; Pope, Welkin H.; Pruett, Katelyn M.; Rickstrew, Gabbi; Rimple, Patrick A.; Rinehart, Claire A.; Robinson, Kayla M.; Rose, Victoria A.; Russell, Daniel A.; Schick, Amelia M.; Schlossman, Julia; Schneider, Victoria M.; Sells, Chloe A.; Sieker, Jeremy W.; Silva, Morgan P.; Silvi, Marissa M.; Simon, Stephanie E.; Staples, Amanda K.; Steed, Isabelle L.; Stowe, Emily L.; Stueven, Noah A.; Swartz, Porter T.; Sweet, Emma A.; Sweetman, Abigail T.; Tender, Corrina; Terry, Katrina; Thomas, Chrystal; Thomas, Daniel S.; Thompson, Allison R.; Vanderveen, Lorianna; Varma, Rohan; Vaught, Hannah L.; Vo, Quynh D.; Vonberg, Zachary T.; Ware, Vassie C.; Warrad, Yasmene M.; Wathen, Kaitlyn E.; Weinstein, Jonathan L.; Wyper, Jacqueline F.; Yankauskas, Jakob R.; Zhang, Christine

    2017-01-01

    The vast bacteriophage population harbors an immense reservoir of genetic information. Almost 2000 phage genomes have been sequenced from phages infecting hosts in the phylum Actinobacteria, and analysis of these genomes reveals substantial diversity, pervasive mosaicism, and novel mechanisms for phage replication and lysogeny. Here, we describe the isolation and genomic characterization of 46 phages from environmental samples at various geographic locations in the U.S. infecting a single Arthrobacter sp. strain. These phages include representatives of all three virion morphologies, and Jasmine is the first sequenced podovirus of an actinobacterial host. The phages also span considerable sequence diversity, and can be grouped into 10 clusters according to their nucleotide diversity, and two singletons each with no close relatives. However, the clusters/singletons appear to be genomically well separated from each other, and relatively few genes are shared between clusters. Genome size varies from among the smallest of siphoviral phages (15,319 bp) to over 70 kbp, and G+C contents range from 45–68%, compared to 63.4% for the host genome. Although temperate phages are common among other actinobacterial hosts, these Arthrobacter phages are primarily lytic, and only the singleton Galaxy is likely temperate. PMID:28715480
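    The G+C content figures quoted in this abstract are the percentage of G and C bases among the bases of a genome sequence. A minimal sketch of that calculation follows; the function name `gc_percent` and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not details taken from the study.

```python
def gc_percent(sequence: str) -> float:
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    # Ignore ambiguity codes such as N (an illustrative choice).
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence, not real phage data.
print(gc_percent("GGCCAATT"))
```

    For a real genome, the same function could be applied to the concatenated sequence read from a FASTA file.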

  19. Diversity, distribution and dynamics of full-length Copia and Gypsy LTR retroelements in Solanum lycopersicum.

    PubMed

    Paz, Rosalía Cristina; Kozaczek, Melisa Eliana; Rosli, Hernán Guillermo; Andino, Natalia Pilar; Sanchez-Puerta, Maria Virginia

    2017-10-01

    Transposable elements are the most abundant components of plant genomes and can dramatically induce genetic changes and impact genome evolution. In the recently sequenced genome of tomato (Solanum lycopersicum), the estimated fraction of elements corresponding to retrotransposons is nearly 62%. Given that tomato is one of the most important vegetable crops cultivated and consumed worldwide, understanding retrotransposon dynamics can provide insight into its evolution and domestication processes. In this study, we performed a genome-wide in silico search of full-length LTR retroelements in the tomato nuclear genome and annotated 736 full-length Gypsy and Copia retroelements. The dispersion level across the 12 chromosomes, the diversity and tissue-specific expression of those elements were estimated. Phylogenetic analysis based on the retrotranscriptase region revealed the presence of 12 major lineages of LTR retroelements in the tomato genome. We identified 97 families, of which 77 and 20 belong to the superfamilies Copia and Gypsy, respectively. Each retroelement family was characterized according to their element size, relative frequencies and insertion time. These analyses represent a valuable resource for comparative genomics within the Solanaceae, transposon-tagging and for the design of cultivar-specific molecular markers in tomato.

  20. Molecular characterization of dihydroneopterin aldolase and aminodeoxychorismate synthase in common bean-genes coding for enzymes in the folate synthesis pathway.

    PubMed

    Xie, Weilong; Perry, Gregory; Martin, C Joe; Shim, Youn-Seb; Navabi, Alireza; Pauls, K Peter

    2017-07-01

    Common beans (Phaseolus vulgaris) are excellent sources of dietary folates, but different varieties contain different amounts of these compounds. Genes coding for dihydroneopterin aldolase (DHNA) and aminodeoxychorismate synthase (ADCS) of the folate synthesis pathway were characterized by PCR amplification, BAC clone sequencing, and whole genome sequencing. All DHNA and ADCS genes in the Mesoamerican cultivar OAC Rex were isolated and compared with those genes in the genome of Andean genotype G19833. Both genotypes have two functional DHNA genes and one pseudo gene. PvDHNA1 and PvDHNA2 proteins have similar secondary structures and conserved residues as DHNA homologs in Staphylococcus aureus and Arabidopsis. Sequence analysis and synteny mapping indicated that PvDHNA1 might be a duplicated and transposed copy of PvDHNA2. There is only one ADCS gene (PvADCS) identified in the bean genome and it is identical in OAC Rex and G19833. PvADCS has the conserved motifs required for catalytic activity similar to other plant ADCS homologs. DHNA and ADCS gene-specific markers were developed, mapped, and compared to their physical locations on chromosomes 1 and 7, respectively. The gene-specific markers developed in this study should be useful for detection and selection of varieties with enhanced folate contents in bean breeding programs.

  1. Pan-Genomic Analysis Permits Differentiation of Virulent and Non-virulent Strains of Xanthomonas arboricola That Cohabit Prunus spp. and Elucidate Bacterial Virulence Factors

    PubMed Central

    Garita-Cambronero, Jerson; Palacio-Bielsa, Ana; López, María M.; Cubero, Jaime

    2017-01-01

    Xanthomonas arboricola is a plant-associated bacterial species that causes diseases on several plant hosts. One of the most virulent pathovars within this species is X. arboricola pv. pruni (Xap), the causal agent of bacterial spot disease of stone fruit trees and almond. Recently, a non-virulent Xap-look-a-like strain isolated from Prunus was characterized and its genome compared to pathogenic strains of Xap, revealing differences in the profile of virulence factors, such as the genes related to the type III secretion system (T3SS) and type III effectors (T3Es). The existence of this atypical strain raises several questions associated with the abundance, the pathogenicity, and the evolutionary context of X. arboricola on Prunus hosts. After an initial characterization of a collection of Xanthomonas strains isolated from Prunus bacterial spot outbreaks in Spain during the past decade, six Xap-look-a-like strains that did not cluster with the pathogenic strains of Xap according to a multilocus sequence analysis were identified. Pathogenicity of these strains was analyzed and the genome sequences of two Xap-look-a-like strains, CITA 14 and CITA 124, non-virulent to Prunus spp., were obtained and compared to the available genomes of X. arboricola associated with this host plant. Differences were found among the genomes of the virulent and the Prunus non-virulent strains in several characters related to the pathogenesis process. Additionally, a pan-genomic analysis that included the available genomes of X. arboricola revealed that the atypical strains associated with Prunus were related to a group of non-virulent or low virulent strains isolated from a wide host range. The repertoire of the genes related to T3SS and T3Es varied among the strains of this cluster and those strains related to the most virulent pathovars of the species, corylina, juglandis, and pruni. This variability provides information about the potential evolutionary process associated with the acquisition of pathogenicity and host specificity in X. arboricola. Finally, based on the genomic differences observed between the virulent and the non-virulent strains isolated from Prunus, a sensitive and specific real-time PCR protocol was designed to detect and identify Xap strains. This method avoids misidentifications due to atypical strains of X. arboricola that can cohabit Prunus. PMID:28450852

  2. Detailed Characterization of Human Induced Pluripotent Stem Cells Manufactured for Therapeutic Applications.

    PubMed

    Baghbaderani, Behnam Ahmadian; Syama, Adhikarla; Sivapatham, Renuka; Pei, Ying; Mukherjee, Odity; Fellner, Thomas; Zeng, Xianmin; Rao, Mahendra S

    2016-08-01

    We have recently described manufacturing of human induced pluripotent stem cells (iPSC) master cell banks (MCB) generated by a clinically compliant process using cord blood as a starting material (Baghbaderani et al. in Stem Cell Reports, 5(4), 647-659, 2015). In this manuscript, we describe the detailed characterization of the two iPSC clones generated using this process, including whole genome sequencing (WGS), microarray, and comparative genomic hybridization (aCGH) single nucleotide polymorphism (SNP) analysis. We compare their profiles with a proposed calibration material and with a reporter subclone and lines made by a similar process from different donors. We believe that iPSCs are likely to be used to make multiple clinical products. We further believe that the lines used as input material will be used at different sites and, given their immortal status, will be used for many years or even decades. Therefore, it will be important to develop assays to monitor the state of the cells and their drift in culture. We suggest that a detailed characterization of the initial status of the cells, a comparison with some calibration material and the development of reporter subclones will help determine which set of tests will be most useful in monitoring the cells and establishing criteria for discarding a line.

  3. Isolation and sequence characterization of DNA-A genome of a new begomovirus strain associated with severe leaf curling symptoms of Jatropha curcas L.

    PubMed

    Chauhan, Sushma; Rahman, Hifzur; Mastan, Shaik G; Pamidimarri, D V N Sudheer; Reddy, Muppala P

    2018-07-20

    Begomoviruses, which belong to the family Geminiviridae, are associated with several disease symptoms, such as mosaic and leaf curling in Jatropha curcas. The molecular characterization of these viral strains will help in developing management strategies to control the disease. In this study, J. curcas plants that were infected with begomovirus and showed acute leaf curling symptoms were identified. The DNA-A segment from the pathogenic viral strain was isolated and sequenced. The sequenced genome was assembled and characterized in detail. The full-length DNA-A sequence was covered by primer walking. The genome sequence showed the general organization of DNA-A from begomovirus by the distribution of ORFs in both viral and anti-viral strands. The genome size ranged from 2844 to 2852 bp. Three strains with minor nucleotide variations were identified, and a phylogenetic analysis was performed by comparing the DNA-A segments from other reported begomovirus isolates. The maximum sequence similarity was observed with Euphorbia yellow mosaic virus (FN435995). In the phylogenetic tree, no clustering was observed with previously reported begomovirus strains isolated from the J. curcas host. The strains isolated in this study belong to a new begomoviral strain that elicits symptoms of leaf curling in J. curcas. The results indicate that the probable origin of the strains is from Jatropha mosaic virus infecting J. gossypifolia. The strains isolated in this study are referred to as Jatropha curcas leaf curl India virus (JCLCIV) based on the major symptoms exhibited by the host J. curcas. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. Bigfoot: a new family of MITE elements characterized from the Medicago genus.

    PubMed

    Charrier, B; Foucher, F; Kondorosi, E; d'Aubenton-Carafa, Y; Thermes, C; Kondorosi, A; Ratet, P

    1999-05-01

    We have characterized from the legume plant Medicago a new family of miniature inverted-repeat transposable elements (MITE), called the Bigfoot transposable elements. Two of these insertion elements are present only in a single allele of two different M. sativa genes. Using a PCR strategy we have isolated 19 other Bigfoot elements from the M. sativa and M. truncatula genomes. They differ from the previously characterized MITEs by their sequence, a target site of 9 bp and a partially clustered genomic distribution. In addition, we show that they exhibit a significantly stable secondary structure. These elements may represent up to 0.1% of the genome of the outcrossing Medicago sativa but are present at a reduced copy number in the genome of the autogamous M. truncatula plant, revealing major differences in the genome organization of these two plants.

  5. Three tiers of genome evolution in reptiles

    PubMed Central

    Organ, Chris L.; Moreno, Ricardo Godínez; Edwards, Scott V.

    2008-01-01

    Characterization of reptilian genomes is essential for understanding the overall diversity and evolution of amniote genomes, because reptiles, which include birds, constitute a major fraction of the amniote evolutionary tree. To better understand the evolution and diversity of genomic characteristics in Reptilia, we conducted comparative analyses of online sequence data from Alligator mississippiensis (alligator) and Sphenodon punctatus (tuatara) as well as genome size and karyological data from a wide range of reptilian species. At the whole-genome and chromosomal tiers of organization, we find that reptilian genome size distribution is consistent with a model of continuous gradual evolution while genomic compartmentalization, as manifested in the number of microchromosomes and macrochromosomes, appears to have undergone early rapid change. At the sequence level, the third genomic tier, we find that exon size in Alligator is distributed in a pattern matching that of exons in Gallus (chicken), especially in the 101–200 bp size class. A small spike in the fraction of exons in the 301 bp–1 kb size class is also observed for Alligator, but more so for Sphenodon. For introns, we find that members of Reptilia have a larger fraction of introns within the 101 bp–2 kb size class and a lower fraction of introns within the 5–30 kb size class than do mammals. These findings suggest that the mode of reptilian genome evolution varies across three hierarchical levels of the genome, a pattern consistent with a mosaic model of genomic evolution. PMID:21669810

  6. Genomic Comparative Study of Bovine Mastitis Escherichia coli.

    PubMed

    Kempf, Florent; Slugocki, Cindy; Blum, Shlomo E; Leitner, Gabriel; Germon, Pierre

    2016-01-01

    Escherichia coli, one of the main causative agents of bovine mastitis, is responsible for significant losses on dairy farms. In order to better understand the pathogenicity of E. coli mastitis, an accurate characterization of E. coli strains isolated from mastitis cases is required. By using phylogenetic analyses and whole genome comparison of 5 currently available mastitis E. coli genome sequences, we searched for genotypic traits specific for mastitis isolates. Our data confirm that there is a bias in the distribution of mastitis isolates in the different phylogenetic groups of the E. coli species, with the majority of strains belonging to phylogenetic groups A and B1. An interesting feature is that clustering of strains based on their accessory genome is very similar to that obtained using the core genome. This finding illustrates the fact that phenotypic properties of strains from different phylogroups are likely to be different. As a consequence, it is possible that different strategies could be used by mastitis isolates of different phylogroups to trigger mastitis. Our results indicate that mastitis E. coli isolates analyzed in this study carry very few of the virulence genes described in other pathogenic E. coli strains. A more detailed analysis of the presence/absence of genes involved in LPS synthesis, iron acquisition and type 6 secretion systems did not uncover specific properties of mastitis isolates. Altogether, these results indicate that mastitis E. coli isolates are rather characterized by a lack of bona fide currently described virulence genes.

  7. Genomic Comparative Study of Bovine Mastitis Escherichia coli

    PubMed Central

    Kempf, Florent; Slugocki, Cindy; Blum, Shlomo E.; Leitner, Gabriel; Germon, Pierre

    2016-01-01

    Escherichia coli, one of the main causative agents of bovine mastitis, is responsible for significant losses on dairy farms. In order to better understand the pathogenicity of E. coli mastitis, an accurate characterization of E. coli strains isolated from mastitis cases is required. By using phylogenetic analyses and whole genome comparison of 5 currently available mastitis E. coli genome sequences, we searched for genotypic traits specific for mastitis isolates. Our data confirm that there is a bias in the distribution of mastitis isolates in the different phylogenetic groups of the E. coli species, with the majority of strains belonging to phylogenetic groups A and B1. An interesting feature is that clustering of strains based on their accessory genome is very similar to that obtained using the core genome. This finding illustrates the fact that phenotypic properties of strains from different phylogroups are likely to be different. As a consequence, it is possible that different strategies could be used by mastitis isolates of different phylogroups to trigger mastitis. Our results indicate that mastitis E. coli isolates analyzed in this study carry very few of the virulence genes described in other pathogenic E. coli strains. A more detailed analysis of the presence/absence of genes involved in LPS synthesis, iron acquisition and type 6 secretion systems did not uncover specific properties of mastitis isolates. Altogether, these results indicate that mastitis E. coli isolates are rather characterized by a lack of bona fide currently described virulence genes. PMID:26809117

  8. Detection of clonal evolution in hematopoietic malignancies by combining comparative genomic hybridization and single nucleotide polymorphism arrays.

    PubMed

    Hartmann, Luise; Stephenson, Christine F; Verkamp, Stephanie R; Johnson, Krystal R; Burnworth, Bettina; Hammock, Kelle; Brodersen, Lisa Eidenschink; de Baca, Monica E; Wells, Denise A; Loken, Michael R; Zehentner, Barbara K

    2014-12-01

    Array comparative genomic hybridization (aCGH) has become a powerful tool for analyzing hematopoietic neoplasms and identifying genome-wide copy number changes in a single assay. aCGH also has superior resolution compared with fluorescence in situ hybridization (FISH) or conventional cytogenetics. Integration of single nucleotide polymorphism (SNP) probes with microarray analysis allows additional identification of acquired uniparental disomy, a copy neutral aberration with known potential to contribute to tumor pathogenesis. However, a limitation of microarray analysis has been the inability to detect clonal heterogeneity in a sample. This study comprised 16 samples (acute myeloid leukemia, myelodysplastic syndrome, chronic lymphocytic leukemia, plasma cell neoplasm) with complex cytogenetic features and evidence of clonal evolution. We used an integrated manual peak reassignment approach combining analysis of aCGH and SNP microarray data for characterization of subclonal abnormalities. We compared array findings with results obtained from conventional cytogenetic and FISH studies. Clonal heterogeneity was detected in 13 of 16 samples by microarray on the basis of log2 values. Use of the manual peak reassignment analysis approach improved resolution of the sample's clonal composition and genetic heterogeneity in 10 of 13 (77%) patients. Moreover, in 3 patients, clonal disease progression was revealed by array analysis that was not evident by cytogenetic or FISH studies. Genetic abnormalities originating from separate clonal subpopulations can be identified and further characterized by combining aCGH and SNP hybridization results from 1 integrated microarray chip by use of the manual peak reassignment technique. Its clinical utility in comparison to conventional cytogenetic or FISH studies is demonstrated. © 2014 American Association for Clinical Chemistry.
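
    The abstract above reports that clonal heterogeneity was called from array log2 values. The sketch below is only an illustration of why a subclonal aberration gives an intermediate log2 ratio; it is not the authors' manual peak reassignment method. It assumes a diploid background and a single subclone carrying the aberration, and both function names are hypothetical.

```python
import math


def expected_log2_ratio(copy_number: int, clone_fraction: float) -> float:
    """Expected log2 ratio for a region present at `copy_number` copies in
    `clone_fraction` of the cells, with all other cells diploid (assumption)."""
    if not 0.0 <= clone_fraction <= 1.0:
        raise ValueError("clone_fraction must be between 0 and 1")
    # Average copies per cell across the mixed population.
    mean_copies = (1.0 - clone_fraction) * 2 + clone_fraction * copy_number
    if mean_copies <= 0:
        raise ValueError("a fully clonal homozygous deletion has no finite log2 ratio")
    return math.log2(mean_copies / 2.0)


def clone_fraction_from_log2(log2_ratio: float, copy_number: int) -> float:
    """Invert the relation above: estimate the fraction of cells carrying
    the aberration from an observed log2 ratio."""
    if copy_number == 2:
        raise ValueError("a copy-neutral region carries no information on clone size")
    return (2.0 * 2.0 ** log2_ratio - 2.0) / (copy_number - 2)


if __name__ == "__main__":
    # A single-copy loss in 40% of cells sits well above the fully clonal value of -1.
    print(expected_log2_ratio(1, 0.4))
    print(clone_fraction_from_log2(expected_log2_ratio(1, 0.4), 1))
```

    Under these assumptions, a one-copy loss present in every cell gives a log2 ratio of -1, while the same loss in a minority subclone gives a value closer to 0, which is the kind of intermediate signal that suggests more than one clone.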

  9. Comparative (Meta)genomic Analysis and Ecological Profiling of Human Gut-Specific Bacteriophage φB124-14

    PubMed Central

    Ogilvie, Lesley A.; Caplin, Jonathan; Dedi, Cinzia; Diston, David; Cheek, Elizabeth; Bowler, Lucas; Taylor, Huw; Ebdon, James; Jones, Brian V.

    2012-01-01

    Bacteriophage associated with the human gut microbiome are likely to have an important impact on community structure and function, and provide a wealth of biotechnological opportunities. Despite this, knowledge of the ecology and composition of bacteriophage in the gut bacterial community remains poor, with few well characterized gut-associated phage genomes currently available. Here we describe the identification and in-depth (meta)genomic, proteomic, and ecological analysis of a human gut-specific bacteriophage (designated φB124-14). In doing so we illuminate a fraction of the biological dark matter extant in this ecosystem and its surrounding eco-genomic landscape, identifying a novel and uncharted bacteriophage gene-space in this community. φB124-14 infects only a subset of closely related gut-associated Bacteroides fragilis strains, and the circular genome encodes functions previously found to be rare in viral genomes and human gut viral metagenome sequences, including those which potentially confer advantages upon phage and/or host bacteria. Comparative genomic analyses revealed φB124-14 is most closely related to φB40-8, the only other publicly available Bacteroides sp. phage genome, whilst comparative metagenomic analysis of both phage failed to identify any homologous sequences in 136 non-human gut metagenomic datasets searched, supporting the human gut-specific nature of this phage. Moreover, a potential geographic variation in the carriage of these and related phage was revealed by analysis of their distribution and prevalence within 151 human gut microbiomes and viromes from Europe, America and Japan. Finally, ecological profiling of φB124-14 and φB40-8, using both gene-centric alignment-driven phylogenetic analyses, as well as alignment-free gene-independent approaches was undertaken. This not only verified the human gut-specific nature of both phage, but also indicated that these phage populate a distinct and unexplored ecological landscape within the human gut microbiome. PMID:22558115

  10. Comparative genomic, phylogenetic, and functional investigation of the xenobiotic metabolizing arylamine N-acetyltransferase enzyme family among fungi

    USDA-ARS's Scientific Manuscript database

    Arylamine N-acetyltransferases (NATs) are xenobiotic metabolizing enzymes well-characterized in several bacteria and higher eukaryotes. The role of NATs in fungal biology has only recently been investigated (Glenn and Bacon, 2009; Glenn et al., 2010). The NAT1 gene of Gibberella moniliformis was the...

  11. Mutations in X-linked PORCN, a putative regulator of Wnt signaling, cause focal dermal hypoplasia

    USDA-ARS's Scientific Manuscript database

    Focal dermal hypoplasia is an X-linked dominant disorder characterized by patchy hypoplastic skin and digital, ocular, and dental malformations. We used array comparative genomic hybridization to identify a 219-kb deletion in Xp11.23 in two affected females. We sequenced genes in this region and fou...

  12. Characterization of genome-reduced Bacillus subtilis strains and their application for the production of guanosine and thymidine.

    PubMed

    Li, Yang; Zhu, Xujun; Zhang, Xueyu; Fu, Jing; Wang, Zhiwen; Chen, Tao; Zhao, Xueming

    2016-06-03

    Genome streamlining has emerged as an effective strategy to boost the production efficiency of bio-based products. Many efforts have been made to construct desirable chassis cells by reducing the genome size of microbes. It has been reported that the genome-reduced Bacillus subtilis strain MBG874 showed clear advantages for the production of several heterologous enzymes including alkaline cellulase and protease. In addition to enzymes, B. subtilis is also used for the production of chemicals. To the best of our knowledge, it is still unknown whether genome reduction could be used to optimize the production of chemicals such as nucleoside products. In this study, we constructed a series of genome-reduced strains by deleting non-essential regions in the chromosome of B. subtilis 168. These strains with genome reductions ranging in size from 581.9 to 814.4 kb displayed markedly decreased growth rates, sporulation ratios, transformation efficiencies and maintenance coefficients, as well as increased cell yields. We re-engineered the genome-reduced strains to produce guanosine and thymidine, respectively. The strain BSK814G2, in which purA was knocked out, and prs, purF and guaB were co-overexpressed, produced 115.2 mg/L of guanosine, which was 4.4-fold higher compared to the control strain constructed by introducing the same gene modifications into the parental strain. We also constructed a thymidine producer by deleting the tdk gene and overexpressing the prs, ushA, thyA, dut, and ndk genes from Escherichia coli in strain BSK756, and the resulting strain BSK756T3 accumulated 151.2 mg/L thymidine, showing a 5.2-fold increase compared to the corresponding control strain. Genome-scale genetic manipulation has a variety of effects on the physiological characteristics and cell metabolism of B. subtilis. By introducing specific gene modifications related to guanosine and thymidine accumulation, respectively, we demonstrated that genome-reduced strains had greatly improved properties compared to the wild-type strain as chassis cells for the production of these two products. These strains also have great potential for the production of other nucleosides and similar derived chemicals.

  13. Comparative Genomics of the Balsaminaceae Sister Genera Hydrocera triflora and Impatiens pinfanensis

    PubMed Central

    Li, Zhi-Zhong; Saina, Josphat K.; Gichira, Andrew W.; Kyalo, Cornelius M.; Wang, Qing-Feng

    2018-01-01

    The family Balsaminaceae, which consists of the economically important genus Impatiens and the monotypic genus Hydrocera, lacks a reported or published complete chloroplast genome sequence. Chloroplast genome sequences of the two sister genera are therefore important for clarifying the phylogenetic position and understanding the evolution of the Balsaminaceae family among the Ericales. In this study, complete chloroplast (cp) genomes of Impatiens pinfanensis and Hydrocera triflora were characterized and assembled using a high-throughput sequencing method. The complete cp genomes were found to possess the typical quadripartite structure of land plant chloroplast genomes, with double-stranded molecules of 154,189 bp (Impatiens pinfanensis) and 152,238 bp (Hydrocera triflora) in length. A total of 115 unique genes were identified in both genomes, of which 80 are protein-coding genes, 31 are distinct transfer RNA (tRNA) genes and four are distinct ribosomal RNA (rRNA) genes. Thirty codons, 29 of which end in A/T, showed relative synonymous codon usage values of >1, whereas those ending in G/C displayed values of <1. The simple sequence repeats are mostly A/T mononucleotide repeats in all examined cp genomes. Phylogenetic analysis based on 51 common protein-coding genes indicated that the Balsaminaceae family formed a lineage with Ebenaceae together with all the other Ericales. PMID:29360746
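
    The relative synonymous codon usage (RSCU) values mentioned in the abstract above can be illustrated with a short calculation. This is a minimal sketch, not the authors' pipeline. It assumes the common definition of RSCU as a codon's observed count divided by the mean count of all synonymous codons for the same amino acid, and it assumes the standard codon assignments, which the plastid genetic code shares. The function name `rscu` is hypothetical.

```python
from collections import Counter

# Standard genetic code in the conventional TCAG ordering (assumed here;
# the plastid code uses the same codon-to-amino-acid assignments).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def rscu(cds: str) -> dict:
    """Return RSCU per codon for an in-frame coding sequence.

    Only amino acids that occur in the sequence are reported; stop codons
    and codons containing ambiguous bases are ignored.
    """
    cds = cds.upper().replace("U", "T")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))

    # Group sense codons into synonymous families.
    families = {}
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            families.setdefault(amino_acid, []).append(codon)

    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            # Observed count divided by the family's mean count.
            result[c] = counts[c] * len(codons) / total
    return result


if __name__ == "__main__":
    # Toy sequence: four alanine codons, three GCT and one GCA.
    for codon, value in sorted(rscu("GCTGCTGCTGCA").items()):
        print(codon, value)
```

    A value above 1 means the codon is used more often than it would be if all synonymous codons were used equally, and a value below 1 means it is used less often.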

  14. Comparative mitochondrial genomics of snakes: extraordinary substitution rate dynamics and functionality of the duplicate control region

    PubMed Central

    Jiang, Zhi J; Castoe, Todd A; Austin, Christopher C; Burbrink, Frank T; Herron, Matthew D; McGuire, Jimmy A; Parkinson, Christopher L; Pollock, David D

    2007-01-01

    Background: The mitochondrial genomes of snakes are characterized by an overall evolutionary rate that appears to be one of the most accelerated among vertebrates. They also possess other unusual features, including short tRNAs and other genes, and a duplicated control region that has been stably maintained since it originated more than 70 million years ago. Here, we provide a detailed analysis of evolutionary dynamics in snake mitochondrial genomes to better understand the basis of these extreme characteristics, and to explore the relationship between mitochondrial genome molecular evolution, genome architecture, and molecular function. We sequenced complete mitochondrial genomes from Slowinski's corn snake (Pantherophis slowinskii) and two cottonmouths (Agkistrodon piscivorus) to complement previously existing mitochondrial genomes, and to provide an improved comparative view of how genome architecture affects molecular evolution at contrasting levels of divergence. Results: We present a Bayesian genetic approach that suggests that the duplicated control region can function as an additional origin of heavy strand replication. The two control regions also appear to have different intra-specific versus inter-specific evolutionary dynamics that may be associated with complex modes of concerted evolution. We find that different genomic regions have experienced substantial accelerated evolution along early branches in snakes, with different genes having experienced dramatic accelerations along specific branches. Some of these accelerations appear to coincide with, or subsequent to, the shortening of various mitochondrial genes and the duplication of the control region and flanking tRNAs. Conclusion: Fluctuations in the strength and pattern of selection during snake evolution have had widely varying gene-specific effects on substitution rates, and these rate accelerations may have been functionally related to unusual changes in genomic architecture. The among-lineage and among-gene variation in rate dynamics observed in snakes is the most extreme thus far observed in animal genomes, and provides an important study system for further evaluating the biochemical and physiological basis of evolutionary pressures in vertebrate mitochondria. PMID:17655768

  15. Genome wide identification, phylogeny, and expression of bone morphogenetic protein genes in tetraploidized common carp (Cyprinus carpio).

    PubMed

    Chen, Lin; Dong, Chuanju; Kong, Shengnan; Zhang, Jiangfan; Li, Xuejun; Xu, Peng

    2017-09-05

    Bone morphogenetic proteins (Bmps) are a group of signaling molecules known to play important roles during formation and maintenance of various organs, not only bone, but also muscle, blood and so on. Common carp (Cyprinus carpio) is one of the most intensively studied fish due to its economic and environmental importance. In addition, common carp has undergone an additional round of whole genome duplication (WGD) compared with many closely related diploid teleosts, which makes it one of the most important models for genome evolutionary studies in teleosts. Comprehensive genome resources of common carp have been developed recently, which facilitate the thorough characterization of the bmp gene family in the tetraploidized common carp genome. We identified a total of 44 bmps in the common carp genome, twice as many as in zebrafish. Phylogenetic analysis revealed that most bmps are highly conserved. Comparative analysis was performed across six typical vertebrate genomes. It appeared that all the bmp genes in common carp were duplicated. The expansion of the bmp gene family in common carp was evidently due to the latest additional round of whole genome duplication, which made the family larger than in diploid teleosts. Expression signatures were assessed in major tissues, including gill, intestine, liver, spleen, skin, heart, gonad, muscle, kidney, head kidney, brain and blood, which demonstrated the comprehensive expression profiles of bmp genes in the tetraploidized genome. Significant gene expression divergences were observed, which revealed substantial functional divergences of those duplicated bmp genes after the latest WGD event. The conserved synteny blocks of bmp5s revealed the genome rearrangement of common carp after the 4R WGD. The complete bmp gene family in common carp provides insight into gene fate in the tetraploidized common carp genome after the recent WGD. Copyright © 2017. Published by Elsevier B.V.

  16. Molecular profiling reveals frequent gain of MYCN and anaplasia-specific loss of 4q and 14q in Wilms tumor.

    PubMed

    Williams, Richard D; Al-Saadi, Reem; Natrajan, Rachael; Mackay, Alan; Chagtai, Tasnim; Little, Suzanne; Hing, Sandra N; Fenwick, Kerry; Ashworth, Alan; Grundy, Paul; Anderson, James R; Dome, Jeffrey S; Perlman, Elizabeth J; Jones, Chris; Pritchard-Jones, Kathy

    2011-12-01

    Anaplasia in Wilms tumor, a distinctive histology characterized by abnormal mitoses, is associated with poor patient outcome. While anaplastic tumors frequently harbour TP53 mutations, little is otherwise known about their molecular biology. We have used array comparative genomic hybridization (aCGH) and cDNA microarray expression profiling to compare anaplastic and favorable histology Wilms tumors to determine their common and differentiating features. In addition to changes on 17p, consistent with TP53 deletion, recurrent anaplasia-specific genomic loss and under-expression were noted in several other regions, most strikingly 4q and 14q. Further aberrations, including gain of 1q and loss of 16q were common to both histologies. Focal gain of MYCN, initially detected by high resolution aCGH profiling in 6/61 anaplastic samples, was confirmed in a significant proportion of both tumor types by a genomic quantitative PCR survey of over 400 tumors. Overall, these results are consistent with a model where anaplasia, rather than forming an entirely distinct molecular entity, arises from the general continuum of Wilms tumor by the acquisition of additional genomic changes at multiple loci. Copyright © 2011 Wiley Periodicals, Inc.

  17. Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function

    PubMed Central

    Noble, Luke M.; Andrianopoulos, Alex

    2013-01-01

    Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226

  18. Molecular cytogenetic definition of the chicken genome: the first complete avian karyotype.

    PubMed Central

    Masabanda, Julio S; Burt, Dave W; O'Brien, Patricia C M; Vignal, Alain; Fillon, Valerie; Walsh, Philippa S; Cox, Helen; Tempest, Helen G; Smith, Jacqueline; Habermann, Felix; Schmid, Michael; Matsuda, Yoichi; Ferguson-Smith, Malcolm A; Crooijmans, Richard P M A; Groenen, Martien A M; Griffin, Darren K

    2004-01-01

    Chicken genome mapping is important for a range of scientific disciplines. The ability to distinguish chromosomes of the chicken and other birds is thus a priority. Here we describe the molecular cytogenetic characterization of each chicken chromosome using chromosome painting and mapping of individual clones by FISH. Where possible, we have assigned the chromosomes to known linkage groups. We propose, on the basis of size, that the NOR chromosome is approximately the size of chromosome 22; however, we suggest that its original assignment of 16 should be retained. We also suggest a definitive chromosome classification system and propose that the probes developed here will find wide utility in the fields of developmental biology, DT40 studies, agriculture, vertebrate genome organization, and comparative mapping of avian species. PMID:15082555

  19. Induction of infectious petunia vein clearing (pararetro) virus from endogenous provirus in petunia

    PubMed Central

    Richert-Pöggeler, Katja R.; Noreen, Faiza; Schwarzacher, Trude; Harper, Glyn; Hohn, Thomas

    2003-01-01

    Infection by an endogenous pararetrovirus using forms of both episomal and chromosomal origin has been demonstrated and characterized, together with evidence that petunia vein clearing virus (PVCV) is a constituent of the Petunia hybrida genome. Our findings allow comparative and direct analysis of horizontally and vertically transmitted virus forms and demonstrate their infectivity using biolistic transformation of a provirus-free petunia species. Some integrants within the genome of P. hybrida are arranged in tandem, allowing direct release of virus by transcription. In addition to known inducers of endogenous pararetroviruses, such as genome hybridization, tissue culture and abiotic stresses, we observed activation of PVCV after wounding. Our data also support the hypothesis that the host plant uses DNA methylation to control the endogenous pararetrovirus. PMID:12970195

  20. Cryptosporidium as a testbed for single cell genome characterization of unicellular eukaryotes.

    PubMed

    Troell, Karin; Hallström, Björn; Divne, Anna-Maria; Alsmark, Cecilia; Arrighi, Romanico; Huss, Mikael; Beser, Jessica; Bertilsson, Stefan

    2016-06-23

    Infectious disease involving multiple genetically distinct populations of pathogens is frequently concurrent, but difficult to detect or describe with current routine methodology. Cryptosporidium sp. is a widespread gastrointestinal protozoan of global significance in both animals and humans. It cannot be easily maintained in culture and infections of multiple strains have been reported. To explore the potential use of single cell genomics methodology for revealing genome-level variation in clinical samples from Cryptosporidium-infected hosts, we sorted individual oocysts for subsequent genome amplification and full-genome sequencing. Cells were identified with fluorescent antibodies with an 80 % success rate for the entire single cell genomics workflow, demonstrating that the methodology can be applied directly to purified fecal samples. Ten amplified genomes from sorted single cells were selected for genome sequencing and compared both to the original population and a reference genome in order to evaluate the accuracy and performance of the method. Single cell genome coverage was on average 81 % even with a moderate sequencing effort and by combining the 10 single cell genomes, the full genome was accounted for. By a comparison to the original sample, biological variation could be distinguished and separated from noise introduced in the amplification. As a proof of principle, we have demonstrated the power of applying single cell genomics to dissect infectious disease caused by closely related parasite species or subtypes. The workflow can easily be expanded and adapted to target other protozoans, and potential applications include mapping genome-encoded traits, virulence, pathogenicity, host specificity and resistance at the level of cells as truly meaningful biological units.
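
    The abstract above notes that individual single-cell genomes covered about 81 % of the genome on average, while the ten combined accounted for all of it. The sketch below shows, under stated assumptions, how breadth of coverage can be computed for one cell and for a pooled set. It is not the authors' workflow. Covered regions are assumed to be half-open (start, end) intervals on a single linear reference, and the function names and toy numbers are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(start, end) for start, end in merged]


def breadth_of_coverage(intervals, genome_length):
    """Fraction of the reference covered by at least one interval."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length


if __name__ == "__main__":
    # Toy example: two cells, each covering 60% of a 100 bp reference.
    cell_a = [(0, 60)]
    cell_b = [(40, 100)]
    print(breadth_of_coverage(cell_a, 100))           # one cell alone
    print(breadth_of_coverage(cell_a + cell_b, 100))  # both cells pooled
```

    Pooling works because amplification dropouts tend to fall in different places in different cells, so the union of covered regions grows as cells are added.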

  1. Perspective on Oncogenic Processes at the End of the Beginning of Cancer Genomics.

    PubMed

    Ding, Li; Bailey, Matthew H; Porta-Pardo, Eduard; Thorsson, Vesteinn; Colaprico, Antonio; Bertrand, Denis; Gibbs, David L; Weerasinghe, Amila; Huang, Kuan-Lin; Tokheim, Collin; Cortés-Ciriano, Isidro; Jayasinghe, Reyka; Chen, Feng; Yu, Lihua; Sun, Sam; Olsen, Catharina; Kim, Jaegil; Taylor, Alison M; Cherniack, Andrew D; Akbani, Rehan; Suphavilai, Chayaporn; Nagarajan, Niranjan; Stuart, Joshua M; Mills, Gordon B; Wyczalkowski, Matthew A; Vincent, Benjamin G; Hutter, Carolyn M; Zenklusen, Jean Claude; Hoadley, Katherine A; Wendl, Michael C; Shmulevich, Llya; Lazar, Alexander J; Wheeler, David A; Getz, Gad

    2018-04-05

    The Cancer Genome Atlas (TCGA) has catalyzed systematic characterization of diverse genomic alterations underlying human cancers. At this historic junction marking the completion of genomic characterization of over 11,000 tumors from 33 cancer types, we present our current understanding of the molecular processes governing oncogenesis. We illustrate our insights into cancer through synthesis of the findings of the TCGA PanCancer Atlas project on three facets of oncogenesis: (1) somatic driver mutations, germline pathogenic variants, and their interactions in the tumor; (2) the influence of the tumor genome and epigenome on transcriptome and proteome; and (3) the relationship between tumor and the microenvironment, including implications for drugs targeting driver events and immunotherapies. These results will anchor future characterization of rare and common tumor types, primary and relapsed tumors, and cancers across ancestry groups and will guide the deployment of clinical genomic sequencing. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  2. Characterization of the glutathione S-transferase gene family through ESTs and expression analyses within common and pigmented cultivars of Citrus sinensis (L.) Osbeck.

    PubMed

    Licciardello, Concetta; D'Agostino, Nunzio; Traini, Alessandra; Recupero, Giuseppe Reforgiato; Frusciante, Luigi; Chiusano, Maria Luisa

    2014-02-03

    Glutathione S-transferases (GSTs) represent a ubiquitous gene family encoding detoxification enzymes able to recognize reactive electrophilic xenobiotic molecules as well as compounds of endogenous origin. Anthocyanin pigments require GSTs for their transport into the vacuole since their cytoplasmic retention is toxic to the cell. Anthocyanin accumulation in Citrus sinensis (L.) Osbeck fruit flesh determines different phenotypes affecting the typical pigmentation of Sicilian blood oranges. In this paper we describe: i) the characterization of the GST gene family in C. sinensis through a systematic EST analysis; ii) the validation of the EST assembly by exploiting the genome sequences of C. sinensis and C. clementina and their genome annotations; iii) GST gene expression profiling in six tissues/organs and in two different sweet orange cultivars, Cadenera (common) and Moro (pigmented). We identified 61 GST transcripts, described the full- or partial-length nature of the sequences and assigned to each sequence the GST class membership exploiting a comparative approach and the classification scheme proposed for plant species. A total of 23 full-length sequences were defined. Fifty-four of the 61 transcripts were successfully aligned to the C. sinensis and C. clementina genomes. Tissue specific expression profiling demonstrated that the expression of some GST transcripts was 'tissue-affected' and cultivar specific. A comparative analysis of C. sinensis GSTs with those from other plant species was also considered. Data from the current analysis are accessible at http://biosrv.cab.unina.it/citrusGST/, with the aim to provide a reference resource for C. sinensis GSTs. This study aimed at the characterization of the GST gene family in C. sinensis. Based on expression patterns from two different cultivars and on sequence-comparative analyses, we also highlighted that two sequences, a Phi class GST and a Mapeg class GST, could be involved in the conjugation of anthocyanin pigments and in their transport into the vacuole, specifically in fruit flesh of the pigmented cultivar.

  3. Plastomes of the green algae Hydrodictyon reticulatum and Pediastrum duplex (Sphaeropleales, Chlorophyceae).

    PubMed

    McManus, Hilary A; Sanchez, Daniel J; Karol, Kenneth G

    2017-01-01

    Comparative studies of chloroplast genomes (plastomes) across the Chlorophyceae are revealing dynamic patterns of size variation, gene content, and genome rearrangements. Phylogenomic analyses are improving resolution of relationships, and uncovering novel lineages as new plastomes continue to be characterized. To gain further insight into the evolution of the chlorophyte plastome and increase the number of representative plastomes for the Sphaeropleales, this study presents two fully sequenced plastomes from the green algal family Hydrodictyaceae (Sphaeropleales, Chlorophyceae), one from Hydrodictyon reticulatum and the other from Pediastrum duplex. Genomic DNA from Hydrodictyon reticulatum and Pediastrum duplex was subjected to Illumina paired-end sequencing and the complete plastomes were assembled for each. Plastome size and gene content were characterized and compared with other plastomes from the Sphaeropleales. Homology searches using BLASTX were used to characterize introns and open reading frames (orfs) ≥ 300 bp. A phylogenetic analysis of gene order across the Sphaeropleales was performed. The plastome of Hydrodictyon reticulatum is 225,641 bp and Pediastrum duplex is 232,554 bp. The plastome structure and gene order of H. reticulatum and P. duplex are more similar to each other than to other members of the Sphaeropleales. Numerous unique open reading frames are found in both plastomes and the plastome of P. duplex contains putative viral protein genes, not found in other Sphaeropleales plastomes. Gene order analyses support the monophyly of the Hydrodictyaceae and their sister relationship to the Neochloridaceae. The complete plastomes of Hydrodictyon reticulatum and Pediastrum duplex, representing the largest of the Sphaeropleales sequenced thus far, once again highlight the variability in size, architecture, gene order and content across the Chlorophyceae. Novel intron insertion sites and unique orfs indicate recent, independent invasions into each plastome, a hypothesis testable with an expanded plastome investigation within the Hydrodictyaceae.

  4. Characterization of Five Novel Brevibacillus Bacteriophages and Genomic Comparison of Brevibacillus Phages

    PubMed Central

    Berg, Jordan A.; Merrill, Bryan D.; Crockett, Justin T.; Esplin, Kyle P.; Evans, Marlee R.; Heaton, Karli E.; Hilton, Jared A.; Hyde, Jonathan R.; McBride, Morgan S.; Schouten, Jordan T.; Simister, Austin R.; Thurgood, Trever L.; Ward, Andrew T.; Breakwell, Donald P.; Hope, Sandra; Grose, Julianne H.

    2016-01-01

    Brevibacillus laterosporus is a spore-forming bacterium that causes a secondary infection in beehives following European Foulbrood disease. To better understand the contributions of Brevibacillus bacteriophages to the evolution of their hosts, five novel phages (Jenst, Osiris, Powder, SecTim467, and Sundance) were isolated and characterized. When compared with the five Brevibacillus phages currently in NCBI, these phages were assigned to clusters based on whole genome and proteome synteny. Powder and Osiris, both myoviruses, were assigned to the previously described Jimmer-like cluster. SecTim467 and Jenst, both siphoviruses, formed a novel phage cluster. Sundance, a siphovirus, was assigned as a singleton phage along with the previously isolated singleton, Emery. In addition to characterizing the basic relationships between these phages, several genomic features were observed. A motif repeated throughout phages Jenst and SecTim467 was frequently upstream of genes predicted to function in DNA replication, nucleotide metabolism, and transcription, suggesting transcriptional co-regulation. In addition, paralogous gene pairs that encode a putative transcriptional regulator were identified in four Brevibacillus phages. These paralogs likely evolved to bind different DNA sequences due to variation at amino acid residues predicted to bind specific nucleotides. Finally, a putative transposable element was identified in SecTim467 and Sundance that carries genes homologous to those found in Brevibacillus chromosomes. Remnants of this transposable element were also identified in phage Jenst. These discoveries provide a greater understanding of the diversity of phages, their behavior, and their evolutionary relationships to one another and to their host. In addition, they provide a foundation with which further Brevibacillus phages can be compared. PMID:27304881

  5. Characterization of a novel Lactobacillus species closely related to Lactobacillus johnsonii using a combination of molecular and comparative genomics methods

    PubMed Central

    2010-01-01

    Background: Comparative genomic hybridization (CGH) constitutes a powerful tool for identification and characterization of bacterial strains. In this study we have applied this technique for the characterization of a number of Lactobacillus strains isolated from the intestinal content of rats fed with a diet supplemented with sorbitol. Results: Phylogenetic analysis based on 16S rRNA gene, recA, pheS, pyrG and tuf sequences identified five bacterial strains isolated from the intestinal content of rats as belonging to the recently described Lactobacillus taiwanensis species. DNA-DNA hybridization experiments confirmed that these five strains are distinct but closely related to Lactobacillus johnsonii and Lactobacillus gasseri. A whole genome DNA microarray designed for the probiotic L. johnsonii strain NCC533 was used for CGH analysis of L. johnsonii ATCC 33200T, L. johnsonii BL261, L. gasseri ATCC 33323T and L. taiwanensis BL263. In these experiments, the fluorescence ratio distributions obtained with L. taiwanensis and L. gasseri showed characteristic inter-species profiles. The percentage of conserved L. johnsonii NCC533 genes was about 83% in the L. johnsonii strains comparisons and decreased to 51% and 47% for L. taiwanensis and L. gasseri, respectively. These results confirmed the separate status of L. taiwanensis from L. johnsonii at the level of species, and also that L. taiwanensis is closer to L. johnsonii than L. gasseri is to L. johnsonii. Conclusion: Conventional taxonomic analyses and microarray-based CGH analysis have been used for the identification and characterization of the newly described species L. taiwanensis. The microarray-based CGH technology has been shown to be a remarkable tool for the identification of and fine discrimination between phylogenetically close species, and additionally provided insight into the adaptation of the strain L. taiwanensis BL263 to its ecological niche. PMID:20849602

  6. Characterization and Complete Genome Sequences of Three N4-Like Roseobacter Phages Isolated from the South China Sea.

    PubMed

    Li, Baolian; Zhang, Si; Long, Lijuan; Huang, Sijun

    2016-09-01

    Three bacteriophages (RD-1410W1-01, RD-1410Ws-07, and DS-1410Ws-06) were isolated from the surface water of Sanya Bay, northern South China Sea, on two marine bacteria type strains of the Roseobacter lineage. These phages have an isometric head and a short tail, morphologically belonging to the Podoviridae family. Two of these phages can infect four of seven marine roseobacter strains tested and the other one can infect three of them, showing relatively broader host ranges compared to known N4-like roseophages. One-step growth curves showed that these phages have similar short latent periods (1-2 h) but highly variable burst sizes (27-341 pfu cell(-1)). Their complete genomes show high level of similarities to known N4-like roseophages in terms of genome size, G + C content, gene content, and arrangement. The morphological and genomic features of these phages indicate that they belong to the N4likevirus genus. Moreover, comparative genomic analysis based on 43 N4-like phages (10 roseobacter phages and 33 phages infecting other lineages of bacteria) revealed a core genome of 18 genes shared by all the 43 phages and 38 genes shared by all the ten roseophages. The 38 core genes of N4-like roseophages nearly make up 70 % of each genome in length. Phylogenetic analysis based on the concatenated core gene products showed that our phage isolates represent two new phyletic branches, suggesting the broad genetic diversity of marine N4-like roseophages remains.
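
    The core-genome idea in the abstract above (genes shared by every genome in a set, so a core computed over more genomes can only shrink or stay the same) can be sketched as a set intersection. This is a minimal illustration, not the authors' pipeline: the function name, genome labels, and gene names are hypothetical, and it assumes genes have already been grouped into comparable gene families.

```python
def core_genome(gene_sets):
    """Return the gene families present in every genome.

    gene_sets: dict mapping a genome name to an iterable of gene-family IDs.
    (Hypothetical input format; real studies first cluster orthologs.)
    """
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Toy, made-up data: two roseophages and one phage from another lineage.
genomes = {
    "roseophage_A": {"rnap1", "rnap2", "vrnap", "portal", "hostX"},
    "roseophage_B": {"rnap1", "rnap2", "vrnap", "portal", "hostX", "orf9"},
    "other_phage_C": {"rnap1", "rnap2", "vrnap", "tailfiber"},
}

core_all = core_genome(genomes)
core_roseo = core_genome(
    {name: genes for name, genes in genomes.items() if name.startswith("roseophage")}
)
```

    In this toy case the core of all three genomes is a subset of the roseophage-only core, mirroring the 18-gene versus 38-gene relationship reported in the abstract.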

  7. Comparative genomics of the Bifidobacterium breve taxon.

    PubMed

    Bottacini, Francesca; O'Connell Motherway, Mary; Kuczynski, Justin; O'Connell, Kerry Joan; Serafini, Fausta; Duranti, Sabrina; Milani, Christian; Turroni, Francesca; Lugli, Gabriele Andrea; Zomer, Aldert; Zhurina, Daria; Riedel, Christian; Ventura, Marco; van Sinderen, Douwe

    2014-03-01

    Bifidobacteria are commonly found as part of the microbiota of the gastrointestinal tract (GIT) of a broad range of hosts, where their presence is positively correlated with the host's health status. In this study, we assessed the genomes of thirteen representatives of Bifidobacterium breve, which is not only a frequently encountered component of the (adult and infant) human gut microbiota, but can also be isolated from human milk and vagina. In silico analysis of genome sequences from thirteen B. breve strains isolated from different environments (infant and adult faeces, human milk, human vagina) shows that the genetic variability of this species principally consists of hypothetical genes and mobile elements, but, interestingly, also genes correlated with the adaptation to host environment and gut colonization. These latter genes specify the biosynthetic machinery for sortase-dependent pili and exopolysaccharide production, as well as genes that provide protection against invasion of foreign DNA (i.e. CRISPR loci and restriction/modification systems), and genes that encode enzymes responsible for carbohydrate fermentation. Gene-trait matching analysis showed clear correlations between known metabolic capabilities and characterized genes, and it also allowed the identification of a gene cluster involved in the utilization of the alcohol-sugar sorbitol. Genome analysis of thirteen representatives of the B. breve species revealed that the deduced pan-genome exhibits an essentially close trend. For this reason our analyses suggest that this number of B. breve representatives is sufficient to fully describe the pan-genome of this species. Comparative genomics also facilitated the genetic explanation for differential carbon source utilization phenotypes previously observed in different strains of B. breve.

  8. SpinachDB: A Well-Characterized Genomic Database for Gene Family Classification and SNP Information of Spinach.

    PubMed

    Yang, Xue-Dong; Tan, Hua-Wei; Zhu, Wei-Min

    2016-01-01

    Spinach (Spinacia oleracea L.), which originated in central and western Asia, belongs to the family Amaranthaceae. Spinach is one of the most important leafy vegetables, with a high nutritional value, as well as being a perfect research material for plant sex chromosome models. Following the completion of genome assembly and gene prediction of spinach, we developed SpinachDB (http://222.73.98.124/spinachdb) to store, annotate, mine and analyze genomics and genetics datasets efficiently. In this study, all 21702 spinach genes were annotated. A total of 15741 spinach genes were catalogued into 4351 families, including identification of a substantial number of transcription factors. To construct a high-density genetic map, a total of 131592 SSRs and 1125743 potential SNPs located in 548801 loci of the spinach genome were identified in 11 cultivated and wild spinach cultivars. Expression profiles were also generated from RNA-seq data using the FPKM method, which can be used to compare the genes. Paralogs in spinach and the orthologous genes in Arabidopsis, grape, sugar beet and rice were identified for comparative genome analysis. Finally, the SpinachDB website contains seven main sections, including the homepage; the GBrowse map that integrates genome, genes, SSR and SNP marker information; the Blast alignment service; the gene family classification search tool; the orthologous and paralogous gene pairs search tool; and the download and useful contact information. SpinachDB will be continually expanded to include newly generated robust genomics and genetics data sets along with the associated data mining and analysis tools.
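
    The FPKM method mentioned in the abstract above normalizes a gene's fragment count by gene length (in kilobases) and by sequencing depth (in millions of mapped fragments). The sketch below shows the standard formula only; the function name and the example numbers are illustrative assumptions and are not taken from SpinachDB.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and total mapped fragments must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 2,000 bp gene in a library of
# 10 million mapped fragments.
example_fpkm = fpkm(500, 2000, 10_000_000)
```

    Because both length and depth are divided out, values from genes of different sizes become comparable within a sample, which is the sense in which the abstract says the profiles can be used to compare genes.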

  9. Sequence analysis of the PIP5K locus in Eimeria maxima provides further evidence for eimerian genome plasticity and segmental organization.

    PubMed

    Song, B K; Pan, M Z; Lau, Y L; Wan, K L

    2014-07-29

    Commercial flocks infected by Eimeria species parasites, including Eimeria maxima, have an increased risk of developing clinical or subclinical coccidiosis; an intestinal enteritis associated with increased mortality rates in poultry. Currently, infection control is largely based on chemotherapy or live vaccines; however, drug resistance is common and vaccines are relatively expensive. The development of new cost-effective intervention measures will benefit from unraveling the complex genetic mechanisms that underlie host-parasite interactions, including the identification and characterization of genes encoding proteins such as phosphatidylinositol 4-phosphate 5-kinase (PIP5K). We previously identified a PIP5K coding sequence within the E. maxima genome. In this study, we analyzed two bacterial artificial chromosome clones presenting a ~145-kb E. maxima (Weybridge strain) genomic region spanning the PIP5K gene locus. Sequence analysis revealed that ~95% of the simple sequence repeats detected were located within regions comparable to the previously described feature-rich segments of the Eimeria tenella genome. Comparative sequence analysis with the orthologous E. maxima (Houghton strain) region revealed a moderate level of conserved synteny. Unique segmental organizations and telomere-like repeats were also observed in both genomes. A number of incomplete transposable elements were detected and further scrutiny of these elements in both orthologous segments revealed interesting nesting events, which may play a role in facilitating genome plasticity in E. maxima. The current analysis provides more detailed information about the genome organization of E. maxima and may help to reveal genotypic differences that are important for expression of traits related to pathogenicity and virulence.

  10. Comparative Genomics of Klebsiella pneumoniae Strains with Different Antibiotic Resistance Profiles

    PubMed Central

    Kumar, Vinod; Sun, Peng; Vamathevan, Jessica; Li, Yong; Ingraham, Karen; Palmer, Leslie; Huang, Jianzhong; Brown, James R.

    2011-01-01

    There is a global emergence of multidrug-resistant (MDR) strains of Klebsiella pneumoniae, a Gram-negative enteric bacterium that causes nosocomial and urinary tract infections. While the epidemiology of K. pneumoniae strains and occurrences of specific antibiotic resistance genes, such as plasmid-borne extended-spectrum β-lactamases (ESBLs), have been extensively studied, only four complete genomes of K. pneumoniae are available. To better understand the multidrug resistance factors in K. pneumoniae, we determined by pyrosequencing the nearly complete genome DNA sequences of two strains with disparate antibiotic resistance profiles, broadly drug-susceptible strain JH1 and strain 1162281, which is resistant to multiple clinically used antibiotics, including extended-spectrum β-lactams, fluoroquinolones, aminoglycosides, trimethoprim, and sulfamethoxazoles. Comparative genomic analysis of JH1, 1162281, and other published K. pneumoniae genomes revealed a core set of 3,631 conserved orthologous proteins, which were used for reconstruction of whole-genome phylogenetic trees. The close evolutionary relationship between JH1 and 1162281 relative to other K. pneumoniae strains suggests that a large component of the genetic and phenotypic diversity of clinical isolates is due to horizontal gene transfer. Using curated lists of over 400 antibiotic resistance genes, we identified all of the elements that differentiated the antibiotic profile of MDR strain 1162281 from that of susceptible strain JH1, such as the presence of additional efflux pumps, ESBLs, and multiple mechanisms of fluoroquinolone resistance. Our study adds new and significant DNA sequence data on K. pneumoniae strains and demonstrates the value of whole-genome sequencing in characterizing multidrug resistance in clinical isolates. PMID:21746949

  11. Mitochondrial-nuclear genome interactions in nonalcoholic fatty liver disease in mice

    PubMed Central

    Betancourt, Angela M.; King, Adrienne L.; Fetterman, Jessica L.; Millender-Swain, Telisha; Finley, Rachel D.; Oliva, Claudia R.; Crowe, David Ralph; Ballinger, Scott W.; Bailey, Shannon M.

    2014-01-01

    Nonalcoholic fatty liver disease (NAFLD) involves significant changes in liver metabolism characterized by oxidative stress, lipid accumulation, and fibrogenesis. Mitochondrial dysfunction and bioenergetic defects also contribute to NAFLD. Herein, we examined whether differences in mtDNA influence NAFLD. To determine the role of mitochondrial and nuclear genomes in NAFLD, Mitochondrial-Nuclear eXchange (MNX) mice were fed an atherogenic diet. MNX mice have mtDNA from C57BL/6J mice on a C3H/HeN nuclear background and vice versa. Results from MNX mice were compared to wild-type C57BL/6J and C3H/HeN mice fed a control or atherogenic diet. Mice with the C57BL/6J nuclear genome developed more macrosteatosis, inflammation, and fibrosis compared with mice containing the C3H/HeN nuclear genome when fed the atherogenic diet. These changes were associated with parallel alterations in inflammation and fibrosis gene expression in wild-type mice, with intermediate responses in MNX mice. Mice with the C57BL/6J nuclear genome had increased State 4 respiration, whereas MNX mice had decreased State 3 respiration and RCR when fed the atherogenic diet. Complex IV activity and most mitochondrial biogenesis genes were increased in mice with the C57BL/6J nuclear or mitochondrial genome, or both fed the atherogenic diet. These results reveal new interactions between mitochondrial and nuclear genomes and support the concept that mtDNA influences mitochondrial function and metabolic pathways implicated in NAFLD. PMID:24758559

  12. Mitochondrial-nuclear genome interactions in non-alcoholic fatty liver disease in mice.

    PubMed

    Betancourt, Angela M; King, Adrienne L; Fetterman, Jessica L; Millender-Swain, Telisha; Finley, Rachel D; Oliva, Claudia R; Crowe, David R; Ballinger, Scott W; Bailey, Shannon M

    2014-07-15

    NAFLD (non-alcoholic fatty liver disease) involves significant changes in liver metabolism characterized by oxidative stress, lipid accumulation and fibrogenesis. Mitochondrial dysfunction and bioenergetic defects also contribute to NAFLD. In the present study, we examined whether differences in mtDNA influence NAFLD. To determine the role of mitochondrial and nuclear genomes in NAFLD, MNX (mitochondrial-nuclear exchange) mice were fed an atherogenic diet. MNX mice have mtDNA from C57BL/6J mice on a C3H/HeN nuclear background and vice versa. Results from MNX mice were compared with wild-type C57BL/6J and C3H/HeN mice fed a control or atherogenic diet. Mice with the C57BL/6J nuclear genome developed more macrosteatosis, inflammation and fibrosis compared with mice containing the C3H/HeN nuclear genome when fed the atherogenic diet. These changes were associated with parallel alterations in inflammation and fibrosis gene expression in wild-type mice, with intermediate responses in MNX mice. Mice with the C57BL/6J nuclear genome had increased State 4 respiration, whereas MNX mice had decreased State 3 respiration and RCR (respiratory control ratio) when fed the atherogenic diet. Complex IV activity and most mitochondrial biogenesis genes were increased in mice with the C57BL/6J nuclear or mitochondrial genome, or both fed the atherogenic diet. These results reveal new interactions between mitochondrial and nuclear genomes and support the concept that mtDNA influences mitochondrial function and metabolic pathways implicated in NAFLD.

  13. Characterization of Aeromonas hydrophila wound pathotypes by comparative genomic and functional analyses of virulence genes.

    PubMed

    Grim, Christopher J; Kozlova, Elena V; Sha, Jian; Fitts, Eric C; van Lier, Christina J; Kirtley, Michelle L; Joseph, Sandeep J; Read, Timothy D; Burd, Eileen M; Tall, Ben D; Joseph, Sam W; Horneman, Amy J; Chopra, Ashok K; Shak, Joshua R

    2013-04-23

    Aeromonas hydrophila has increasingly been implicated as a virulent and antibiotic-resistant etiologic agent in various human diseases. In a previously published case report, we described a subject with a polymicrobial wound infection that included a persistent and aggressive strain of A. hydrophila (E1), as well as a more antibiotic-resistant strain of A. hydrophila (E2). To better understand the differences between pathogenic and environmental strains of A. hydrophila, we conducted comparative genomic and functional analyses of virulence-associated genes of these two wound isolates (E1 and E2), the environmental type strain A. hydrophila ATCC 7966(T), and four other isolates belonging to A. aquariorum, A. veronii, A. salmonicida, and A. caviae. Full-genome sequencing of strains E1 and E2 revealed extensive differences between the two and strain ATCC 7966(T). The more persistent wound infection strain, E1, harbored coding sequences for a cytotoxic enterotoxin (Act), a type 3 secretion system (T3SS), flagella, hemolysins, and a homolog of exotoxin A found in Pseudomonas aeruginosa. Corresponding phenotypic analyses with A. hydrophila ATCC 7966(T) and SSU as reference strains demonstrated the functionality of these virulence genes, with strain E1 displaying enhanced swimming and swarming motility, lateral flagella on electron microscopy, the presence of T3SS effector AexU, and enhanced lethality in a mouse model of Aeromonas infection. By combining sequence-based analysis and functional assays, we characterized an A. hydrophila pathotype, exemplified by strain E1, that exhibited increased virulence in a mouse model of infection, likely because of encapsulation, enhanced motility, toxin secretion, and cellular toxicity. Aeromonas hydrophila is a common aquatic bacterium that has increasingly been implicated in serious human infections. 
While many determinants of virulence have been identified in Aeromonas, rapid identification of pathogenic versus nonpathogenic strains remains a challenge for this genus, as it is for other opportunistic pathogens. This paper demonstrates, by using whole-genome sequencing of clinical Aeromonas strains, followed by corresponding virulence assays, that comparative genomics can be used to identify a virulent subtype of A. hydrophila that is aggressive during human infection and more lethal in a mouse model of infection. This aggressive pathotype contained genes for toxin production, toxin secretion, and bacterial motility that likely enabled its pathogenicity. Our results highlight the potential of whole-genome sequencing to transform microbial diagnostics; with further advances in rapid sequencing and annotation, genomic analysis will be able to provide timely information on the identities and virulence potential of clinically isolated microorganisms.

  14. Characterization of Transposable Elements in Laccaria bicolor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Labbe, Jessy L; Murat, Claude; Morin, Emmanuelle

    2012-01-01

    Background: The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. Methodology/Principal Findings: TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial copies elements distributed within 172 families. The most abundant elements were the Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs are ancient except some terminal inverted repeats (TIRS), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TEs expansion in L. bicolor; the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 500,000 years ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. Conclusions: This analysis represents an initial characterization of TEs in the L. bicolor genome, contributes to genome assembly and to a greater understanding of the role TEs played in genome organization and evolution, and provides a valuable resource for the ongoing Laccaria Pan-Genome project supported by the U.S.-DOE Joint Genome Institute.

  15. Genome variations associated with viral susceptibility and calcification in Emiliania huxleyi.

    PubMed

    Kegel, Jessica U; John, Uwe; Valentin, Klaus; Frickenhaus, Stephan

    2013-01-01

    Emiliania huxleyi, a key player in the global carbon cycle is one of the best studied coccolithophores with respect to biogeochemical cycles, climatology, and host-virus interactions. Strains of E. huxleyi show phenotypic plasticity regarding growth behaviour, light-response, calcification, acidification, and virus susceptibility. This phenomenon is likely a consequence of genomic differences, or transcriptomic responses, to environmental conditions or threats such as viral infections. We used an E. huxleyi genome microarray based on the sequenced strain CCMP1516 (reference strain) to perform comparative genomic hybridizations (CGH) of 16 E. huxleyi strains of different geographic origin. We investigated the genomic diversity and plasticity and focused on the identification of genes related to virus susceptibility and coccolith production (calcification). Among the tested 31940 gene models a core genome of 14628 genes was identified by hybridization among 16 E. huxleyi strains. 224 probes were characterized as specific for the reference strain CCMP1516. Compared to the sequenced E. huxleyi strain CCMP1516 variation in gene content of up to 30 percent among strains was observed. Comparison of core and non-core transcripts sets in terms of annotated functions reveals a broad, almost equal functional coverage over all KOG-categories of both transcript sets within the whole annotated genome. Within the variable (non-core) genome we identified genes associated with virus susceptibility and calcification. Genes associated with virus susceptibility include a Bax inhibitor-1 protein, three LRR receptor-like protein kinases, and mitogen-activated protein kinase. Our list of transcripts associated with coccolith production will stimulate further research, e.g. by genetic manipulation. In particular, the V-type proton ATPase 16 kDa proteolipid subunit is proposed to be a plausible target gene for further calcification studies.

  16. Genome Variations Associated with Viral Susceptibility and Calcification in Emiliania huxleyi

    PubMed Central

    Kegel, Jessica U.; John, Uwe; Valentin, Klaus; Frickenhaus, Stephan

    2013-01-01

    Emiliania huxleyi, a key player in the global carbon cycle is one of the best studied coccolithophores with respect to biogeochemical cycles, climatology, and host-virus interactions. Strains of E. huxleyi show phenotypic plasticity regarding growth behaviour, light-response, calcification, acidification, and virus susceptibility. This phenomenon is likely a consequence of genomic differences, or transcriptomic responses, to environmental conditions or threats such as viral infections. We used an E. huxleyi genome microarray based on the sequenced strain CCMP1516 (reference strain) to perform comparative genomic hybridizations (CGH) of 16 E. huxleyi strains of different geographic origin. We investigated the genomic diversity and plasticity and focused on the identification of genes related to virus susceptibility and coccolith production (calcification). Among the tested 31940 gene models a core genome of 14628 genes was identified by hybridization among 16 E. huxleyi strains. 224 probes were characterized as specific for the reference strain CCMP1516. Compared to the sequenced E. huxleyi strain CCMP1516 variation in gene content of up to 30 percent among strains was observed. Comparison of core and non-core transcripts sets in terms of annotated functions reveals a broad, almost equal functional coverage over all KOG-categories of both transcript sets within the whole annotated genome. Within the variable (non-core) genome we identified genes associated with virus susceptibility and calcification. Genes associated with virus susceptibility include a Bax inhibitor-1 protein, three LRR receptor-like protein kinases, and mitogen-activated protein kinase. Our list of transcripts associated with coccolith production will stimulate further research, e.g. by genetic manipulation. In particular, the V-type proton ATPase 16 kDa proteolipid subunit is proposed to be a plausible target gene for further calcification studies. PMID:24260453

  17. Comparative genome analysis of 24 bovine-associated Staphylococcus isolates with special focus on the putative virulence genes

    PubMed Central

    Åvall-Jääskeläinen, Silja; Paulin, Lars; Blom, Jochen

    2018-01-01

    Non-aureus staphylococci (NAS) are most commonly isolated from subclinical mastitis. Different NAS species may, however, have diverse effects on the inflammatory response in the udder. We determined the genome sequences of 20 staphylococcal isolates from clinical or subclinical bovine mastitis, belonging to the NAS species Staphylococcus agnetis, S. chromogenes, and S. simulans, and focused on the putative virulence factor genes present in the genomes. For comparison we used our previously published genome sequences of four S. aureus isolates from bovine mastitis. The pan-genome and core genomes of the non-aureus isolates were characterized. After that, putative virulence factor orthologues were searched in silico. We compared the presence of putative virulence factors in the NAS species and S. aureus and evaluated the potential association between bacterial genotype and type of mastitis (clinical vs. subclinical). The NAS isolates had much less virulence gene orthologues than the S. aureus isolates. One third of the virulence genes were detected only in S. aureus. About 100 virulence genes were present in all S. aureus isolates, compared to about 40 to 50 in each NAS isolate. S. simulans differed the most. Several of the virulence genes detected among NAS were harbored only by S. simulans, but it also lacked a number of genes present both in S. agnetis and S. chromogenes. The type of mastitis was not associated with any specific virulence gene profile. It seems that the virulence gene profiles or cumulative number of different virulence genes are not directly associated with the type of mastitis (clinical or subclinical), indicating that host derived factors such as the immune status play a pivotal role in the manifestation of mastitis. PMID:29610707

  18. Final Report for LDRD Project 02-ERD-069: Discovering the Unknown Mechanism(s) of Virulence in a BW, Class A Select Agent

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chain, P; Garcia, E

    2003-02-06

    The goal of this proposed effort was to assess the difficulty in identifying and characterizing virulence candidate genes in an organism for which very limited data exists. This was accomplished by first addressing the finishing phase of draft-sequenced F. tularensis genomes and conducting comparative analyses to determine the coding potential of each genome; to discover the differences in genome structure and content, and to identify potential genes whose products may be involved in the F. tularensis virulence process. The project was divided into three parts: (1) Genome finishing: This part involves determining the order and orientation of the consensus sequences of contigs obtained from Phrap assemblies of random draft genomic sequences. This tedious process consists of linking contig ends using information embedded in each sequence file that relates the sequence to the original cloned insert. Since inserts are sequenced from both ends, we can establish a link between these paired-ends in different contigs and thus order and orient contigs. Since these genomes carry numerous copies of insertion sequences, these repeated elements "confuse" the Phrap assembly program. It is thus necessary to break these contigs apart at the repeated sequences and individually join the proper flanking regions using paired-end information, or using results of comparisons against a similar genome. Larger repeated elements such as the small subunit ribosomal RNA operon require verification with PCR. Tandem repeats require manual intervention and typically rely on single nucleotide polymorphisms to be resolved. Remaining gaps require PCR reactions and sequencing. Once the genomes have been "closed", low quality regions are addressed by resequencing reactions. (2) Genome analysis: The final consensus sequences are processed by combining the results of three gene modelers: Glimmer, Critica and Generation. The final gene models are submitted to a battery of homology searches and domain prediction programs in order to annotate them (e.g. BLAST, Pfam, TIGRfam, COG, KEGG, InterPro, TMhmm, SignalP). The genome structure is also assessed in terms of G+C content, GC bias (GC skew), and locations of repeated regions (e.g. IS elements) and phage-like genes. (3) Comparative genomics: The results of the various genome analyses are compared between the finished (or almost finished) genomes. Here, we have compared the F. tularensis genomes from the extremely lethal strain Schu4 (subsp. tularensis), the vaccine strain LVS (subsp. holarctica), and strain UT01-4992 of the less virulent, opportunistic subsp. novicida. Regions present in the highly virulent strain that are absent from the other less virulent strains may provide insight into what factors are required for the high level of virulence.
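
    The G+C content and GC skew measures named in part (2) of the report above can be sketched as follows. This is a generic illustration using the common definitions, G+C content = (G+C)/(A+C+G+T) and GC skew = (G-C)/(G+C); the function names, window sizes, and the toy sequence are assumptions, and the report does not state which tools or window parameters were actually used.

```python
def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq):
    """GC skew, (G - C) / (G + C); 0.0 when the sequence has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq, window, step):
    """Return (start, skew) pairs for full-length windows along the sequence."""
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (made up): G-rich first half, C-rich second half.
toy = "GGGGCCCC"
toy_skew = windowed_gc_skew(toy, window=4, step=4)
```

    On a real bacterial chromosome, a sign change in windowed GC skew is often used as a hint for the replication origin or terminus, which is one reason the measure appears in genome structure assessments like the one described.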

  19. Short interspersed transposable elements (SINEs) are excluded from imprinted regions in the human genome.

    PubMed

    Greally, John M

    2002-01-08

    To test whether regions undergoing genomic imprinting have unique genomic characteristics, imprinted and nonimprinted human loci were compared for nucleotide and retroelement composition. Maternally and paternally expressed subgroups of imprinted genes were found to differ in terms of guanine and cytosine, CpG, and retroelement content, indicating a segregation into distinct genomic compartments. Imprinted regions have been normally permissive to L1 long interspersed transposable element retroposition during mammalian evolution but universally and significantly lack short interspersed transposable elements (SINEs). The primate-specific Alu SINEs, as well as the more ancient mammalian-wide interspersed repeat SINEs, are found at significantly low densities in imprinted regions. The latter paleogenomic signature indicates that the sequence characteristics of currently imprinted regions existed before the mammalian radiation. Transitions from imprinted to nonimprinted genomic regions in cis are characterized by a sharp inflection in SINE content, demonstrating that this genomic characteristic can help predict the presence and extent of regions undergoing imprinting. During primate evolution, SINE accumulation in imprinted regions occurred at a decreased rate compared with control loci. The constraint on SINE accumulation in imprinted regions may be mediated by an active selection process. This selection could be because of SINEs attracting and spreading methylation, as has been found at other loci. Methylation-induced silencing could lead to deleterious consequences at imprinted loci, where inactivation of one allele is already established, and expression is often essential for embryonic growth and survival.

  20. Short interspersed transposable elements (SINEs) are excluded from imprinted regions in the human genome

    PubMed Central

    Greally, John M.

    2002-01-01

    To test whether regions undergoing genomic imprinting have unique genomic characteristics, imprinted and nonimprinted human loci were compared for nucleotide and retroelement composition. Maternally and paternally expressed subgroups of imprinted genes were found to differ in terms of guanine and cytosine, CpG, and retroelement content, indicating a segregation into distinct genomic compartments. Imprinted regions have been normally permissive to L1 long interspersed transposable element retroposition during mammalian evolution but universally and significantly lack short interspersed transposable elements (SINEs). The primate-specific Alu SINEs, as well as the more ancient mammalian-wide interspersed repeat SINEs, are found at significantly low densities in imprinted regions. The latter paleogenomic signature indicates that the sequence characteristics of currently imprinted regions existed before the mammalian radiation. Transitions from imprinted to nonimprinted genomic regions in cis are characterized by a sharp inflection in SINE content, demonstrating that this genomic characteristic can help predict the presence and extent of regions undergoing imprinting. During primate evolution, SINE accumulation in imprinted regions occurred at a decreased rate compared with control loci. The constraint on SINE accumulation in imprinted regions may be mediated by an active selection process. This selection could be because of SINEs attracting and spreading methylation, as has been found at other loci. Methylation-induced silencing could lead to deleterious consequences at imprinted loci, where inactivation of one allele is already established, and expression is often essential for embryonic growth and survival. PMID:11756672

  1. Characterization of a Genomic Signature of Pregnancy in the Breast

    PubMed Central

    Belitskaya-Lévy, Ilana; Zeleniuch-Jacquotte, Anne; Russo, Jose; Russo, Irma H.; Bordás, Pal; Åhman, Janet; Afanasyeva, Yelena; Johansson, Robert; Lenner, Per; Li, Xiaochun; de Cicco, Ricardo López; Peri, Suraj; Ross, Eric; Russo, Patricia A.; Santucci-Pereira, Julia; Sheriff, Fathima S.; Slifker, Michael; Hallmans, Göran; Toniolo, Paolo; Arslan, Alan A.

    2012-01-01

    The objective of the current study was to comprehensively compare the genomic profiles in the breast of parous and nulliparous postmenopausal women to identify genes that permanently change their expression following pregnancy. The study was designed as a two-phase approach. In the discovery phase, we compared breast genomic profiles of 37 parous with 18 nulliparous postmenopausal women. In the validation phase, confirmation of the genomic patterns observed in the discovery phase was sought in an independent set of 30 parous and 22 nulliparous postmenopausal women. RNA was hybridized to Affymetrix HG_U133 Plus 2.0 oligonucleotide arrays containing probes to 54,675 transcripts; scanned and the images analyzed using Affymetrix GCOS software. Surrogate variable analysis, logistic regression and significance analysis for microarrays were used to identify statistically significant differences in expression of genes. The False Discovery Rate (FDR) approach was used to control for multiple comparisons. We found that 208 genes (305 probe sets) were differentially expressed between parous and nulliparous women in both discovery and validation phases of the study at a FDR of 10% and with at least a 1.25-fold change. These genes are involved in regulation of transcription, centrosome organization, RNA splicing, cell cycle control, adhesion and differentiation. The results provide persuasive evidence that full-term pregnancy induces long-term genomic changes in the breast. The genomic signature of pregnancy could be used as an intermediate marker to assess potential chemopreventive interventions with hormones mimicking the effects of pregnancy for prevention of breast cancer. PMID:21622728
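
    The selection rule described in the abstract above (an FDR threshold of 10% combined with a fold change of at least 1.25) can be sketched as follows. This is only an illustration: it assumes the Benjamini-Hochberg procedure for the FDR adjustment, which the abstract does not name, and the function names and inputs are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(pvalues, fold_changes, fdr=0.10, min_fold=1.25):
    """Indices passing both the FDR cut and the fold-change cut.

    fold_changes are positive expression ratios; a ratio below 1 is
    treated as a decrease of magnitude 1 / ratio.
    """
    q = benjamini_hochberg(pvalues)
    return [i for i in range(len(pvalues))
            if q[i] <= fdr
            and max(fold_changes[i], 1.0 / fold_changes[i]) >= min_fold]
```

    For example, select_genes([0.01, 0.04, 0.03, 0.5], [1.5, 1.1, 0.7, 2.0]) keeps indices 0 and 2: index 1 fails the fold-change cut and index 3 fails the FDR cut.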

  2. Validating regulatory predictions from diverse bacteria with mutant fitness data

    DOE PAGES

    Sagawa, Shiori; Price, Morgan N.; Deutschbauer, Adam M.; ...

    2017-05-24

    Although transcriptional regulation is fundamental to understanding bacterial physiology, the targets of most bacterial transcription factors are not known. Comparative genomics has been used to identify likely targets of some of these transcription factors, but these predictions typically lack experimental support. Here, we used mutant fitness data, which measures the importance of each gene for a bacterium's growth across many conditions, to test regulatory predictions from RegPrecise, a curated collection of comparative genomics predictions. Because characterized transcription factors often have correlated fitness with one of their targets (either positively or negatively), correlated fitness patterns provide support for the comparative genomics predictions. At a false discovery rate of 3%, we identified significant cofitness for at least one target of 158 TFs in 107 ortholog groups and from 24 bacteria. Thus, high-throughput genetics can be used to identify a high-confidence subset of the sequence-based regulatory predictions.
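
    The idea of cofitness between a transcription factor and its predicted targets can be sketched as a correlation between fitness profiles measured across the same conditions. The sketch below assumes Pearson correlation as the cofitness measure and uses hypothetical function names and data; it does not reproduce the significance testing or FDR estimation used in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_cofit_target(tf_fitness, target_fitness):
    """Return (target name, r) for the predicted target with the largest |r|.

    tf_fitness: fitness values of the TF mutant across conditions.
    target_fitness: dict mapping target gene name to its fitness profile.
    The absolute value is used because cofitness may be positive or negative.
    """
    scored = ((name, pearson(tf_fitness, profile))
              for name, profile in target_fitness.items())
    return max(scored, key=lambda item: abs(item[1]))
```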

  3. Validating regulatory predictions from diverse bacteria with mutant fitness data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sagawa, Shiori; Price, Morgan N.; Deutschbauer, Adam M.

    Although transcriptional regulation is fundamental to understanding bacterial physiology, the targets of most bacterial transcription factors are not known. Comparative genomics has been used to identify likely targets of some of these transcription factors, but these predictions typically lack experimental support. Here, we used mutant fitness data, which measures the importance of each gene for a bacterium's growth across many conditions, to test regulatory predictions from RegPrecise, a curated collection of comparative genomics predictions. Because characterized transcription factors often have correlated fitness with one of their targets (either positively or negatively), correlated fitness patterns provide support for the comparative genomics predictions. At a false discovery rate of 3%, we identified significant cofitness for at least one target of 158 TFs in 107 ortholog groups and from 24 bacteria. Thus, high-throughput genetics can be used to identify a high-confidence subset of the sequence-based regulatory predictions.

  4. Exploring the virome of diseased horses

    PubMed Central

    Li, Linlin; Giannitti, Federico; Low, Jason; Keyes, Casey; Ullmann, Leila S.; Deng, Xutao; Aleman, Monica; Pesavento, Patricia A.; Pusterla, Nicola

    2015-01-01

    Metagenomics was used to characterize viral genomes in clinical specimens of horses with various organ-specific diseases of unknown aetiology. A novel parvovirus as well as a previously described hepacivirus closely related to human hepatitis C virus and equid herpesvirus 2 were identified in the cerebrospinal fluid of horses with neurological signs. Four co-infecting picobirnaviruses, including an unusual genome with fused RNA segments, and a divergent anellovirus were found in the plasma of two febrile horses. A novel cyclovirus genome was characterized from the nasal secretion of another febrile animal. Lastly, a small circular DNA genome with a Rep gene, from a virus we called kirkovirus, was identified in the liver and spleen of a horse with fatal idiopathic hepatopathy. This study expands the number of viruses found in horses, and characterizes their genomes to assist future epidemiological studies of their transmission and potential association with various equine diseases. PMID:26044792

  5. Context based computational analysis and characterization of ARS consensus sequences (ACS) of Saccharomyces cerevisiae genome.

    PubMed

    Singh, Vinod Kumar; Krishnamachari, Annangarachari

    2016-09-01

    Genome-wide experimental studies in Saccharomyces cerevisiae reveal that autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS like patterns in the genome. However, only a few hundreds of these sites act as replicating sites and the rest are considered as dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that binds to origin-recognition complex (ORC) denoted as ORC-ACS and non-replicating ACS sequences (nrACS), that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS depict marked differences in nucleotide composition and context features in its vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within its nucleosome-free region. Profound changes in the conformational features, such as DNA helical twist, inclination angle and stacking energy between ORC-ACS and nrACS were observed. Distribution of ACS motifs in the non-coding segments points to the locations of ORC-ACS which are found far away from the adjacent gene start position compared to nrACS thereby enabling an accessible environment for ORC-proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.

  6. Comparative genomic and morphological analyses of Listeria phages isolated from farm environments.

    PubMed

    Denes, Thomas; Vongkamjan, Kitiya; Ackermann, Hans-Wolfgang; Moreno Switt, Andrea I; Wiedmann, Martin; den Bakker, Henk C

    2014-08-01

    The genus Listeria is ubiquitous in the environment and includes the globally important food-borne pathogen Listeria monocytogenes. While the genomic diversity of Listeria has been well studied, considerably less is known about the genomic and morphological diversity of Listeria bacteriophages. In this study, we sequenced and analyzed the genomes of 14 Listeria phages isolated mostly from New York dairy farm environments as well as one related Enterococcus faecalis phage to obtain information on genome characteristics and diversity. We also examined 12 of the phages by electron microscopy to characterize their morphology. These Listeria phages, based on gene orthology and morphology, together with previously sequenced Listeria phages could be classified into five orthoclusters, including one novel orthocluster. One orthocluster (orthocluster I) consists of large genome (~135-kb) myoviruses belonging to the genus “Twort-like viruses,” three orthoclusters (orthoclusters II to IV) contain small-genome (36- to 43-kb) siphoviruses with icosahedral heads, and the novel orthocluster V contains medium-sized-genome (~66-kb) siphoviruses with elongated heads. A novel orthocluster (orthocluster VI) of E. faecalis phages, with medium-sized genomes (~56 kb), was identified, which grouped together and shares morphological features with the novel Listeria phage orthocluster V. This new group of phages (i.e., orthoclusters V and VI) is composed of putative lytic phages that may prove to be useful in phage-based applications for biocontrol, detection, and therapeutic purposes.

  7. Comparative genomic analysis of bacteriophages specific to the channel catfish pathogen Edwardsiella ictaluri

    PubMed Central

    2011-01-01

    Background The bacterial pathogen Edwardsiella ictaluri is a primary cause of mortality in channel catfish raised commercially in aquaculture farms. Additional treatment and diagnostic regimes are needed for this enteric pathogen, motivating the discovery and characterization of bacteriophages specific to E. ictaluri. Results The genomes of three Edwardsiella ictaluri-specific bacteriophages isolated from geographically distant aquaculture ponds, at different times, were sequenced and analyzed. The genomes for phages eiAU, eiDWF, and eiMSLS are 42.80 kbp, 42.12 kbp, and 42.69 kbp, respectively, and are greater than 95% identical to each other at the nucleotide level. Nucleotide differences were mostly observed in non-coding regions and in structural proteins, with significant variability in the sequences of putative tail fiber proteins. The genome organization of these phages exhibit a pattern shared by other Siphoviridae. Conclusions These E. ictaluri-specific phage genomes reveal considerable conservation of genomic architecture and sequence identity, even with considerable temporal and spatial divergence in their isolation. Their genomic homogeneity is similarly observed among E. ictaluri bacterial isolates. The genomic analysis of these phages supports the conclusion that these are virulent phages, lacking the capacity for lysogeny or expression of virulence genes. This study contributes to our knowledge of phage genomic diversity and facilitates studies on the diagnostic and therapeutic applications of these phages. PMID:21214923

  8. Multiple genome alignment for identifying the core structure among moderately related microbial genomes.

    PubMed

    Uchiyama, Ikuo

    2008-10-31

    Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure as a set of sufficiently long segments in which gene orders are conserved so that they are likely to have been inherited mainly through vertical transfer, and developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders. The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified their core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes primarily exhibited the expected characteristic, i.e., being indigenous and sharing the same history, more than the non-core genes. The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes.
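
    The notion of a core structure built from segments of conserved gene order can be illustrated with a much simplified sketch. It compares only two genomes, each given as an ordered list of orthologous group (OG) identifiers, assumes each OG occurs at most once per genome, and considers only the same orientation; the published method aligns multiple genomes and is considerably more elaborate. The function name and the min_len parameter are hypothetical.

```python
def conserved_runs(order_a, order_b, min_len=3):
    """Find runs of OGs that are consecutive, in the same order, in both genomes.

    order_a, order_b: lists of OG identifiers in chromosomal order.
    Returns runs (lists of OG ids) of at least min_len members.
    """
    pos_b = {og: i for i, og in enumerate(order_b)}
    runs, current = [], []
    for og in order_a:
        # The run continues only if og directly follows the previous OG in genome B.
        extends = (current and og in pos_b
                   and pos_b[og] == pos_b[current[-1]] + 1)
        if extends:
            current.append(og)
        else:
            if len(current) >= min_len:
                runs.append(current)
            current = [og] if og in pos_b else []
    if len(current) >= min_len:
        runs.append(current)
    return runs
```

    Requiring a minimum run length mirrors the abstract's emphasis on "sufficiently long segments": short coincidental adjacencies are discarded, while longer collinear blocks are kept as candidates for vertically inherited core regions.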

  9. Characterizing genomic alterations in cancer by complementary functional associations.

    PubMed

    Kim, Jong Wook; Botvinnik, Olga B; Abudayyeh, Omar; Birger, Chet; Rosenbluh, Joseph; Shrestha, Yashaswi; Abazeed, Mohamed E; Hammerman, Peter S; DiCara, Daniel; Konieczkowski, David J; Johannessen, Cory M; Liberzon, Arthur; Alizad-Rahvar, Amir Reza; Alexe, Gabriela; Aguirre, Andrew; Ghandi, Mahmoud; Greulich, Heidi; Vazquez, Francisca; Weir, Barbara A; Van Allen, Eliezer M; Tsherniak, Aviad; Shao, Diane D; Zack, Travis I; Noble, Michael; Getz, Gad; Beroukhim, Rameen; Garraway, Levi A; Ardakani, Masoud; Romualdi, Chiara; Sales, Gabriele; Barbie, David A; Boehm, Jesse S; Hahn, William C; Mesirov, Jill P; Tamayo, Pablo

    2016-05-01

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment. We used REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations, demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes.

  10. Genomic and proteomic characterization of SuMu, a Mu-like bacteriophage infecting Haemophilus parasuis.

    PubMed

    Zehr, Emilie S; Tabatabai, Louisa B; Bayles, Darrell O

    2012-07-23

    Haemophilus parasuis, the causative agent of Glässer's disease, is prevalent in swine herds and clinical signs associated with this disease are meningitis, polyserositis, polyarthritis, and bacterial pneumonia. Six- to eight-week-old pigs in segregated early weaning herds are particularly susceptible to the disease. Insufficient colostral antibody at weaning or the mixing of pigs with heterologous virulent H. parasuis strains from other farm sources in the nursery or grower-finisher stage are considered to be factors for the outbreak of Glässer's disease. Previously, a Mu-like bacteriophage portal gene was detected in a virulent swine isolate of H. parasuis by nested polymerase chain reaction. Mu-like bacteriophages are related phylogenetically to enterobacteriophage Mu and are thought to carry virulence genes or to induce host expression of virulence genes. This study characterizes the Mu-like bacteriophage, named SuMu, isolated from a virulent H. parasuis isolate. Characterization was done by genomic comparison to enterobacteriophage Mu and proteomic identification of various homologs by mass spectrometry. This is the first report of isolation and characterization of this bacteriophage from the Myoviridae family, a double-stranded DNA bacteriophage with a contractile tail, from a virulent field isolate of H. parasuis. The genome size of bacteriophage SuMu was 37,151 bp. DNA sequencing revealed fifty five open reading frames, including twenty five homologs to Mu-like bacteriophage proteins: Nlp, phage transposase-C-terminal, COG2842, Gam-like protein, gp16, Mor, peptidoglycan recognition protein, gp29, gp30, gpG, gp32, gp34, gp36, gp37, gpL, phage tail tube protein, DNA circulation protein, gpP, gp45, gp46, gp47, COG3778, tail fiber protein gp37-C terminal, tail fiber assembly protein, and Com. The last open reading frame was homologous to IS1414. The G + C content of bacteriophage SuMu was 41.87% while its H. parasuis host genome's G + C content was 39.93%. Twenty protein homologs to bacteriophage proteins, including 15 structural proteins, one lysogeny-related and one lysis-related protein, and three DNA replication proteins were identified by mass spectrometry. One of the tail proteins, gp36, may be a virulence-related protein. Bacteriophage SuMu was characterized by genomic and proteomic methods and compared to enterobacteriophage Mu.

  11. CNS germinomas are characterized by global demethylation, chromosomal instability and mutational activation of the Kit-, Ras/Raf/Erk- and Akt-pathways

    PubMed Central

    Schulte, Simone Laura; Waha, Andreas; Steiger, Barbara; Denkhaus, Dorota; Dörner, Evelyn; Calaminus, Gabriele; Leuschner, Ivo; Pietsch, Torsten

    2016-01-01

    CNS germinomas represent a unique germ cell tumor entity characterized by undifferentiated tumor cells and a high response rate to current treatment protocols. Limited information is available on their underlying genomic, epigenetic and biological alterations. We performed a genome-wide analysis of genomic copy number alterations in 49 CNS germinomas by molecular inversion profiling. In addition, CpG dinucleotide methylation was studied by immunohistochemistry for methylated cytosine residues. Mutational analysis was performed by resequencing of candidate genes including KIT and RAS family members. Ras/Erk and Akt pathway activation was analyzed by immunostaining with antibodies against phospho-Erk, phospho-Akt, phospho-mTOR and phospho-S6. All germinomas coexpressed Oct4 and Kit but showed an extensive global DNA demethylation compared to other tumors and normal tissues. Molecular inversion profiling showed predominant genomic instability in all tumors with a high frequency of regional gains and losses including high level gene amplifications. Activating mutations of KIT exons 11, 13, and 17 as well as a case with genomic KIT amplification and activating mutations or amplifications of RAS gene family members including KRAS, NRAS and RRAS2 indicated mutational activation of crucial signaling pathways. Co-activation of Ras/Erk and Akt pathways was present in 83% of germinomas. These data suggest that CNS germinoma cells display a demethylated nuclear DNA similar to primordial germ cells in early development. This finding has a striking coincidence with extensive genomic instability. In addition, mutational activation of Kit-, Ras/Raf/Erk- and Akt- pathways indicates the biological importance of these pathways and their components as potential targets for therapy. PMID:27391150

  12. Complete mitochondrial genomes of the ‘intermediate form’ of Fasciola and Fasciola gigantica, and their comparison with F. hepatica

    PubMed Central

    2014-01-01

    Background Fascioliasis is an important and neglected disease of humans and other mammals, caused by trematodes of the genus Fasciola. Fasciola hepatica and F. gigantica are valid species that infect humans and animals, but the specific status of Fasciola sp. (‘intermediate form’) is unclear. Methods Single specimens inferred to represent Fasciola sp. (‘intermediate form’; Heilongjiang) and F. gigantica (Guangxi) from China were genetically identified and characterized using PCR-based sequencing of the first and second internal transcribed spacer regions of nuclear ribosomal DNA. The complete mitochondrial (mt) genomes of these representative specimens were then sequenced. The relationships of these specimens with selected members of the Trematoda were assessed by phylogenetic analysis of concatenated amino acid sequence datasets by Bayesian inference (BI). Results The complete mt genomes of representatives of Fasciola sp. and F. gigantica were 14,453 bp and 14,478 bp in size, respectively. Both mt genomes contain 12 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes, but lack an atp8 gene. All protein-coding genes are transcribed in the same direction, and the gene order in both mt genomes is the same as that published for F. hepatica. Phylogenetic analysis of the concatenated amino acid sequence data for all 12 protein-coding genes showed that the specimen of Fasciola sp. was more closely related to F. gigantica than to F. hepatica. Conclusions The mt genomes characterized here provide a rich source of markers, which can be used in combination with nuclear markers and imaging techniques, for future comparative studies of the biology of Fasciola sp. from China and other countries. PMID:24685294

  13. Complete mitochondrial genomes of the 'intermediate form' of Fasciola and Fasciola gigantica, and their comparison with F. hepatica.

    PubMed

    Liu, Guo-Hua; Gasser, Robin B; Young, Neil D; Song, Hui-Qun; Ai, Lin; Zhu, Xing-Quan

    2014-03-31

    Fascioliasis is an important and neglected disease of humans and other mammals, caused by trematodes of the genus Fasciola. Fasciola hepatica and F. gigantica are valid species that infect humans and animals, but the specific status of Fasciola sp. ('intermediate form') is unclear. Single specimens inferred to represent Fasciola sp. ('intermediate form'; Heilongjiang) and F. gigantica (Guangxi) from China were genetically identified and characterized using PCR-based sequencing of the first and second internal transcribed spacer regions of nuclear ribosomal DNA. The complete mitochondrial (mt) genomes of these representative specimens were then sequenced. The relationships of these specimens with selected members of the Trematoda were assessed by phylogenetic analysis of concatenated amino acid sequence datasets by Bayesian inference (BI). The complete mt genomes of representatives of Fasciola sp. and F. gigantica were 14,453 bp and 14,478 bp in size, respectively. Both mt genomes contain 12 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes, but lack an atp8 gene. All protein-coding genes are transcribed in the same direction, and the gene order in both mt genomes is the same as that published for F. hepatica. Phylogenetic analysis of the concatenated amino acid sequence data for all 12 protein-coding genes showed that the specimen of Fasciola sp. was more closely related to F. gigantica than to F. hepatica. The mt genomes characterized here provide a rich source of markers, which can be used in combination with nuclear markers and imaging techniques, for future comparative studies of the biology of Fasciola sp. from China and other countries.

  14. Genome-wide identification of significant aberrations in cancer genome.

    PubMed

    Yuan, Xiguo; Yu, Guoqiang; Hou, Xuchu; Shih, Ie-Ming; Clarke, Robert; Zhang, Junying; Hoffman, Eric P; Wang, Roger R; Zhang, Zhen; Wang, Yue

    2012-07-27

    Somatic Copy Number Alterations (CNAs) in human genomes are present in almost all human cancers. Systematic efforts to characterize such structural variants must effectively distinguish significant consensus events from random background aberrations. Here we introduce Significant Aberration in Cancer (SAIC), a new method for characterizing and assessing the statistical significance of recurrent CNA units. Three main features of SAIC include: (1) exploiting the intrinsic correlation among consecutive probes to assign a score to each CNA unit instead of single probes; (2) performing permutations on CNA units that preserve correlations inherent in the copy number data; and (3) iteratively detecting Significant Copy Number Aberrations (SCAs) and estimating an unbiased null distribution by applying an SCA-exclusive permutation scheme. We test and compare the performance of SAIC against four peer methods (GISTIC, STAC, KC-SMART, CMDS) on a large number of simulation datasets. Experimental results show that SAIC outperforms peer methods in terms of larger area under the Receiver Operating Characteristics curve and increased detection power. We then apply SAIC to analyze structural genomic aberrations acquired in four real cancer genome-wide copy number data sets (ovarian cancer, metastatic prostate cancer, lung adenocarcinoma, glioblastoma). When compared with previously reported results, SAIC successfully identifies most SCAs known to be of biological significance and associated with oncogenes (e.g., KRAS, CCNE1, and MYC) or tumor suppressor genes (e.g., CDKN2A/B). Furthermore, SAIC identifies a number of novel SCAs in these copy number data that encompass tumor related genes and may warrant further studies. Supported by a well-grounded theoretical framework, SAIC has been developed and used to identify SCAs in various cancer copy number data sets, providing useful information to study the landscape of cancer genomes. 
Open-source and platform-independent SAIC software is implemented using C++, together with R scripts for data formatting and Perl scripts for user interfacing, and it is easy to install and efficient to use. The source code and documentation are freely available at http://www.cbil.ece.vt.edu/software.htm.

  15. Comparative genomic characterization of citrus-associated Xylella fastidiosa strains.

    PubMed

    da Silva, Vivian S; Shida, Cláudio S; Rodrigues, Fabiana B; Ribeiro, Diógenes C D; de Souza, Alessandra A; Coletta-Filho, Helvécio D; Machado, Marcos A; Nunes, Luiz R; de Oliveira, Regina Costa

    2007-12-21

    The xylem-inhabiting bacterium Xylella fastidiosa (Xf) is the causal agent of Pierce's disease (PD) in vineyards and citrus variegated chlorosis (CVC) in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, has been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding about the genomic composition of South American strains - which is particularly important for CVC-associated strains. This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH), identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria. Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly identified ORFs, obtained by SSH, represent an approximately 10% increase in our current knowledge of the South American Xf gene pool and include new putative virulence factors, as well as novel potential markers for strain identification. Surprisingly, this list of novel elements includes sequences previously believed to be unique to North American strains, pointing to the necessity of revising the list of specific markers that may be used for identification of distinct Xf strains.

  16. Identification and Differential Abundance of Mitochondrial Genome Encoding Small RNAs (mitosRNA) in Breast Muscles of Modern Broilers and Unselected Chicken Breed

    PubMed Central

    Bottje, Walter G.; Khatri, Bhuwan; Shouse, Stephanie A.; Seo, Dongwon; Mallmann, Barbara; Orlowski, Sara K.; Pan, Jeonghoon; Kong, Seongbae; Owens, Casey M.; Anthony, Nicholas B.; Kim, Jae K.; Kong, Byungwhi C.

    2017-01-01

    Background: Although small non-coding RNAs are mostly encoded by the nuclear genome, thousands of small non-coding RNAs encoded by the mitochondrial genome, termed mitosRNAs, were recently reported in human, mouse and trout. In this study, we first identified chicken mitosRNAs in breast muscle using small RNA sequencing, and the differential abundance was analyzed between modern pedigree male (PeM) broilers (characterized by rapid growth and large muscle mass) and the foundational Barred Plymouth Rock (BPR) chickens (characterized by slow growth and small muscle mass). Methods: Small RNA sequencing was performed with total RNAs extracted from breast muscles of PeM and BPR (n = 6 per group) using the 1 × 50 bp single end read method of Illumina sequencing. Raw reads were processed by quality assessment, adapter trimming, and alignment to the chicken mitochondrial genome (GenBank Accession: X52392.1) using the NGen program. Further statistical analyses were performed using JMP Genomics 8. Differentially expressed (DE) mitosRNAs between PeM and BPR were confirmed by quantitative PCR. Results: A total of 183,416 unique small RNA sequences were identified as potential chicken mitosRNAs. After stringent filtering processes, 117 mitosRNAs showing >100 raw read counts were abundantly produced from all 37 mitochondrial genes (except D-loop region) and the length of mitosRNAs ranged from 22 to 46 nucleotides. Of those, the abundance of 44 mitosRNAs was significantly altered in breast muscles of PeM compared to those of BPR: all mitosRNAs were higher in PeM breast except those produced from the 16S-rRNA gene. Possibly, the higher mitosRNA abundance in PeM breast may be due to a higher mitochondrial content compared to BPR. Our data demonstrate that in addition to 37 known mitochondrial genes, the mitochondrial genome also encodes abundant mitosRNAs that may play an important regulatory role in muscle growth via mitochondrial gene expression control. PMID:29104541

  17. Oenococcus oeni in Chilean Red Wines: Technological and Genomic Characterization

    PubMed Central

    Romero, Jaime; Ilabaca, Carolina; Ruiz, Mauricio; Jara, Carla

    2018-01-01

    The presence and load of species of LAB at the end of the malolactic fermentation (MLF) were investigated in 16 wineries from the different Chilean valleys (Limarí, Casablanca, Maipo, Rapel, and Maule Valleys) during 2012 and 2013, using PCR-RFLP and qPCR. Oenococcus oeni was observed in 80% of the samples collected. Dominance of O. oeni was reflected in the bacterial load (O. oeni/total bacteria) measured by qPCR, corresponding to >85% in most of the samples. A total of 178 LAB isolates were identified after sequencing molecular markers, 95 of them corresponded to O. oeni. Further genetic analyses were performed using MLST (7 genes) including 10 commercial strains; the results indicated that commercial strains were grouped together, while autochthonous strains distributed among different genetic clusters. To pre-select some autochthonous O. oeni, these isolates were also characterized based on technological tests such as ethanol tolerance (12 and 15%), SO2 resistance (0 and 80 mg l−1), and pH (3.1 and 3.6) and malic acid transformation (1.5 and 4 g l−1). For comparison purposes, commercial strain VP41 was also tested. Based on their technological performance, only 3 isolates were selected for further examination (genome analysis) and they were able to reduce malic acid concentration, to grow at low pH 3.1, 15% ethanol and 80 mg l−1 SO2. The genome analyses of three selected isolates were examined and compared to PSU-1 and VP41 strains to study their potential contribution to the organoleptic properties of the final product. The presence and homology of genes potentially related to aromatic profile were compared among those strains. The results indicated high conservation of malolactic enzyme (>99%) and the absence of some genes related to odor such as phenolic acid decarboxylase, in autochthonous strains. 
Genomic analysis also revealed that these strains shared 470 genes with VP41 and PSU-1 and that autochthonous strains harbor a notable number of unique genes (>21). Altogether these results reveal the presence of local strains that are distinguishable from commercial strains at the genetic/genomic level and that have genomic traits reinforcing their potential use as starter cultures. PMID:29491847

  18. Analysis of Aspergillus nidulans metabolism at the genome-scale

    PubMed Central

    David, Helga; Özçelik, İlknur Ş; Hofmann, Gerald; Nielsen, Jens

    2008-01-01

    Background: Aspergillus nidulans is a member of a diverse group of filamentous fungi, sharing many of the properties of its close relatives with significance in the fields of medicine, agriculture and industry. Furthermore, A. nidulans has been a classical model organism for studies of developmental biology and gene regulation, and thus it has become one of the best-characterized filamentous fungi. It was the first Aspergillus species to have its genome sequenced, and automated gene prediction tools predicted 9,451 open reading frames (ORFs) in the genome, of which less than 10% were assigned a function. Results: In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways of relevant secondary metabolites, was reconstructed based on detailed metabolic reconstructions available for A. niger and Saccharomyces cerevisiae, and information on the genetics, biochemistry and physiology of A. nidulans. Thereby, it was possible to identify metabolic functions without a gene associated, and to look for candidate ORFs in the genome of A. nidulans by comparing its sequence to sequences of well-characterized genes in other species encoding the function of interest. A classification system, based on defined criteria, was developed for evaluating and selecting the ORFs among the candidates, in an objective and systematic manner. The functional assignments served as a basis to develop a mathematical model, linking 666 genes (both previously and newly annotated) to metabolic roles. The model was used to simulate metabolic behavior and additionally to integrate, analyze and interpret large-scale gene expression data concerning a study on glucose repression, thereby providing a means of upgrading the information content of experimental data and getting further insight into this phenomenon in A. nidulans. Conclusion: We demonstrate how pathway modeling of A. nidulans can be used as an approach to improve the functional annotation of the genome of this organism. Furthermore, we show how the metabolic model establishes functional links between genes, enabling the upgrade of the information content of transcriptome data. PMID:18405346

  19. Assessing signatures of selection through variation in linkage disequilibrium between taurine and indicine cattle

    PubMed Central

    2014-01-01

    Background: Signatures of selection are regions in the genome that have been preferentially increased in frequency and fixed in a population because of their functional importance in specific processes. These regions can be detected because of their lower genetic variability and specific regional linkage disequilibrium (LD) patterns. Methods: By comparing the differences in regional LD variation between dairy and beef cattle types, and between indicine and taurine subspecies, we aimed to find signatures of selection for production and adaptation in cattle breeds. The VarLD method was applied to compare the LD variation in the autosomal genome between breeds, including Angus and Brown Swiss, representing taurine breeds, and Nelore and Gir, representing indicine breeds. Genomic regions containing the top 0.01 and 0.1 percentile of signals were characterized using the UMD3.1 Bos taurus genome assembly to identify genes in those regions and compared with previously reported selection signatures and regions with copy number variation. Results: For all comparisons, the top 0.01 and 0.1 percentile included 26 and 165 signals and 17 and 125 genes, respectively, including TECRL, BT.23182 or FPPS, CAST, MYOM1, UVRAG and DNAJA1. Conclusions: The VarLD method is a powerful tool to identify differences in linkage disequilibrium between cattle populations and putative signatures of selection with potential adaptive and productive importance. PMID:24592996

  20. The genome and transcriptome of perennial ryegrass mitochondria

    PubMed Central

    2013-01-01

    Background: Perennial ryegrass (Lolium perenne L.) is one of the most important forage and turf grass species of temperate regions worldwide. Its mitochondrial genome is inherited maternally and contains genes that can influence traits of agricultural importance. Moreover, the DNA sequence of mitochondrial genomes has been established and compared for a large number of species in order to characterize evolutionary relationships. Therefore, it is crucial to understand the organization of the mitochondrial genome and how it varies between and within species. Here, we report the first de novo assembly and annotation of the complete mitochondrial genome from perennial ryegrass. Results: Intact mitochondria from perennial ryegrass leaves were isolated and used for mtDNA extraction. The mitochondrial genome was sequenced to a 167-fold coverage using the Roche 454 GS-FLX Titanium platform, and assembled into a circular master molecule of 678,580 bp. A total of 34 proteins, 14 tRNAs and 3 rRNAs are encoded by the mitochondrial genome, giving a total gene space of 48,723 bp (7.2%). Moreover, we identified 149 open reading frames larger than 300 bp and covering 67,410 bp (9.93%), 250 SSRs, 29 tandem repeats, 5 pairs of large repeats, and 96 pairs of short inverted repeats. The genes encoding subunits of the respiratory complexes – nad1 to nad9, cob, cox1 to cox3 and atp1 to atp9 – all showed high expression levels both in absolute numbers and after normalization. Conclusions: The circular master molecule of the mitochondrial genome from perennial ryegrass presented here constitutes an important tool for future attempts to compare mitochondrial genomes within and between grass species. Our results also demonstrate that mitochondria of perennial ryegrass contain genes crucial for energy production that are well conserved in the mitochondrial genome of monocotyledonous species. The expression analysis gave first insights into the transcriptome of these mitochondrial genes in perennial ryegrass. PMID:23521852
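
    The percentages quoted in the abstract above follow directly from the reported base-pair counts. The short Python sketch below recomputes them; the constant and function names are illustrative assumptions and do not come from the paper.

```python
# Base-pair figures quoted in the abstract above.
GENOME_BP = 678_580      # circular master molecule
GENE_SPACE_BP = 48_723   # 34 proteins, 14 tRNAs and 3 rRNAs
ORF_BP = 67_410          # 149 open reading frames larger than 300 bp


def percent_of_genome(span_bp: int, genome_bp: int = GENOME_BP) -> float:
    """Return span_bp as a percentage of genome_bp (hypothetical helper)."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return 100.0 * span_bp / genome_bp


if __name__ == "__main__":
    # Compare the printed values with the 7.2% and 9.93% given in the abstract.
    print(f"gene space: {percent_of_genome(GENE_SPACE_BP):.2f}% of the genome")
    print(f"ORFs > 300 bp: {percent_of_genome(ORF_BP):.2f}% of the genome")
```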

  1. Different phylogenomic approaches to resolve the evolutionary relationships among model fish species.

    PubMed

    Negrisolo, Enrico; Kuhl, Heiner; Forcato, Claudio; Vitulo, Nicola; Reinhardt, Richard; Patarnello, Tomaso; Bargelloni, Luca

    2010-12-01

    Comparative genomics holds the promise to magnify the information obtained from individual genome sequencing projects, revealing common features conserved across genomes and identifying lineage-specific characteristics. To implement such a comparative approach, a robust phylogenetic framework is required to accurately reconstruct evolution at the genome level. Among vertebrate taxa, teleosts represent the second best characterized group, with high-quality draft genome sequences for five model species (Danio rerio, Gasterosteus aculeatus, Oryzias latipes, Takifugu rubripes, and Tetraodon nigroviridis), and several others are in the finishing lane. However, the relationships among the acanthomorph teleost model fishes remain an unresolved taxonomic issue. Here, a genomic region spanning over 1.2 million base pairs was sequenced in the teleost fish Dicentrarchus labrax. Together with genomic data available for the above fish models, the new sequence was used to identify unique orthologous genomic regions shared across all target taxa. Different strategies were applied to produce robust multiple gene and genomic alignments spanning from 11,802 to 186,474 amino acid/nucleotide positions. Ten data sets were analyzed according to Bayesian inference, maximum likelihood, maximum parsimony, and neighbor joining methods. Extensive analyses were performed to explore the influence of several factors (e.g., alignment methodology, substitution model, data set partitions, and long-branch attraction) on the tree topology. Although a general consensus was observed for a closer relationship between G. aculeatus (Gasterosteidae) and Di. labrax (Moronidae) with the atherinomorph O. latipes (Beloniformes) sister taxon of this clade, with the tetraodontiform group Ta. rubripes and Te. nigroviridis (Tetraodontiformes) representing a more distantly related taxon among acanthomorph model fish species, conflicting results were obtained between data sets and methods, especially with respect to the choice of alignment methodology applied to noncoding parts of the genomic region under study. This may limit the use of intergenic/noncoding sequences in phylogenomics until more robust alignment algorithms are developed.

  2. Deep whole-genome sequencing of 90 Han Chinese genomes.

    PubMed

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. 
Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000 Genomes Project, as well as to other human genome projects. © The Authors 2017. Published by Oxford University Press.

  3. Molecular, phylogenetic and comparative genomic analysis of the cytokinin oxidase/dehydrogenase gene family in the Poaceae.

    PubMed

    Mameaux, Sabine; Cockram, James; Thiel, Thomas; Steuernagel, Burkhard; Stein, Nils; Taudien, Stefan; Jack, Peter; Werner, Peter; Gray, John C; Greenland, Andy J; Powell, Wayne

    2012-01-01

    The genomes of cereals such as wheat (Triticum aestivum) and barley (Hordeum vulgare) are large and therefore problematic for the map-based cloning of agronomically important traits. However, comparative approaches within the Poaceae permit transfer of molecular knowledge between species, despite their divergence from a common ancestor sixty million years ago. The finding that null variants of the rice gene cytokinin oxidase/dehydrogenase 2 (OsCKX2) result in large yield increases provides an opportunity to explore whether similar gains could be achieved in other Poaceae members. Here, phylogenetic, molecular and comparative analyses of CKX families in the sequenced grass species rice, brachypodium, sorghum, maize and foxtail millet, as well as members identified from the transcriptomes/genomes of wheat and barley, are presented. Phylogenetic analyses define four Poaceae CKX clades. Comparative analyses showed that CKX phylogenetic groupings can largely be explained by a combination of local gene duplication, and the whole-genome duplication event that predates their speciation. Full-length OsCKX2 homologues in barley (HvCKX2.1, HvCKX2.2) and wheat (TaCKX2.3, TaCKX2.4, TaCKX2.5) are characterized, with comparative analysis at the DNA, protein and genetic/physical map levels suggesting that true CKX2 orthologs have been identified. Furthermore, our analysis shows CKX2 genes in barley and wheat have undergone a Triticeae-specific gene-duplication event. Finally, by identifying ten of the eleven CKX genes predicted to be present in barley by comparative analyses, we show that next-generation sequencing approaches can efficiently determine the gene space of large-genome crops. Together, this work provides the foundation for future functional investigation of CKX family members within the Poaceae. © 2011 National Institute of Agricultural Botany (NIAB). 
Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.

  4. Progressive but Previously Untreated CLL Patients with Greater Array CGH Complexity Exhibit a Less Durable Response to Chemoimmunotherapy

    PubMed Central

    Kay, Neil E.; Eckel-Passow, Jeanette E.; Braggio, Esteban; VanWier, Scott; Shanafelt, Tait D.; Van Dyke, Daniel L.; Jelinek, Diane F.; Tschumper, Renee C.; Kipps, Thomas; Byrd, John C.; Fonseca, Rafael

    2010-01-01

    To better understand the implications of genomic instability and outcome in B-cell CLL, we sought to address genomic complexity as a predictor of chemosensitivity and ultimately clinical outcome in this disease. We employed array-based comparative genomic hybridization (aCGH), using a one-million probe array and identified gains and losses of genetic material in 48 patients treated on a chemoimmunotherapy (CIT) clinical trial. We identified chromosomal gain or loss in ≥6% of the patients on chromosomes 3, 8, 9, 10, 11, 12, 13, 14 and 17. Higher genomic complexity, as a mechanism favoring clonal selection, was associated with shorter progression-free survival and predicted a poor response to treatment. Of interest, CLL cases with loss of p53 surveillance showed more complex genomic features and were found both in patients with a 17p13.1 deletion and in the more favorable genetic subtype characterized by the presence of 13q14.1 deletion. This aCGH study adds information on the association between inferior trial response and increasing genetic complexity as CLL progresses. PMID:21156228

  5. Complete genome sequence of maize yellow striate virus, a new cytorhabdovirus infecting maize and wheat crops in Argentina.

    PubMed

    Maurino, Fernanda; Dumón, Analía D; Llauger, Gabriela; Alemandri, Vanina; de Haro, Luis A; Mattio, M Fernanda; Del Vas, Mariana; Laguna, Irma Graciela; Giménez Pecci, María de la Paz

    2018-01-01

    A rhabdovirus infecting maize and wheat crops in Argentina was molecularly characterized. Through next-generation sequencing (NGS) of symptomatic leaf samples, the complete genomes of two isolates of maize yellow striate virus (MYSV), a putative new rhabdovirus, were obtained; the isolates differ by only 0.4% at the nucleotide level. The MYSV genome consists of 12,654 nucleotides for maize and wheat virus isolates, and shares 71% nucleotide sequence identity with the complete genome of barley yellow striate mosaic virus (BYSMV, NC028244). Ten open reading frames (ORFs) were predicted in the MYSV genome from the antigenomic strand and were compared with their BYSMV counterparts. The highest amino acid sequence identity of the MYSV and BYSMV proteins was 80% between the L proteins, and the lowest was 37% between the proteins 4. Phylogenetic analysis suggested that the MYSV isolates are new members of the genus Cytorhabdovirus, family Rhabdoviridae. Yellow striate, affecting maize and wheat crops in Argentina, is an emergent disease that presents a potential economic risk for these widely distributed crops.

  6. Characterization of the complete mitochondrial genome of Ortleppascaris sinensis (Nematoda: Heterocheilidae) and comparative mitogenomic analysis of eighteen Ascaridida nematodes.

    PubMed

    Zhao, J H; Tu, G J; Wu, X B; Li, C P

    2018-05-01

    Ortleppascaris sinensis (Nematoda: Ascaridida) is a dominant intestinal nematode of the captive Chinese alligator. However, the epidemiology, molecular ecology and population genetics of this parasite remain largely unexplored. In this study, the complete mitochondrial (mt) genome sequence of O. sinensis was first determined using a polymerase chain reaction (PCR)-based primer-walking strategy, and this is also the first sequencing of the complete mitochondrial genome of a member of the genus Ortleppascaris. The circular mitochondrial genome (13,828 bp) of O. sinensis contained 12 protein-coding, 22 transfer RNA and 2 ribosomal RNA genes, but lacked the ATP synthetase subunit 8 gene. Finally, phylogenetic analysis of mtDNAs indicated that the genus Ortleppascaris should be attributed to the family Heterocheilidae. It is necessary to sequence more mtDNAs of Ortleppascaris nematodes in the future to test and confirm our conclusion. The complete mitochondrial genome sequence of O. sinensis reported here should contribute to molecular diagnosis, epidemiological investigations and ecological studies of O. sinensis and other related Ascaridida nematodes.

  7. Ecological Genomics of Marine Picocyanobacteria

    PubMed Central

    Scanlan, D. J.; Ostrowski, M.; Mazard, S.; Dufresne, A.; Garczarek, L.; Hess, W. R.; Post, A. F.; Hagemann, M.; Paulsen, I.; Partensky, F.

    2009-01-01

    Summary: Marine picocyanobacteria of the genera Prochlorococcus and Synechococcus numerically dominate the picophytoplankton of the world ocean, making a key contribution to global primary production. Prochlorococcus was isolated around 20 years ago and is probably the most abundant photosynthetic organism on Earth. The genus comprises specific ecotypes which are phylogenetically distinct and differ markedly in their photophysiology, allowing growth over a broad range of light and nutrient conditions within the 45°N to 40°S latitudinal belt that they occupy. Synechococcus and Prochlorococcus are closely related, together forming a discrete picophytoplankton clade, but are distinguishable by their possession of dissimilar light-harvesting apparatuses and differences in cell size and elemental composition. Synechococcus strains have a ubiquitous oceanic distribution compared to that of Prochlorococcus strains and are characterized by phylogenetically discrete lineages with a wide range of pigmentation. In this review, we put our current knowledge of marine picocyanobacterial genomics into an environmental context and present previously unpublished genomic information arising from extensive genomic comparisons in order to provide insights into the adaptations of these marine microbes to their environment and how they are reflected at the genomic level. PMID:19487728

  8. The complete chloroplast genome sequence of Actinidia arguta using the PacBio RS II platform

    PubMed Central

    Lin, Miaomiao; Qi, Xiujuan; Chen, Jinyong; Sun, Leiming; Zhong, Yunpeng; Fang, Jinbao; Hu, Chungen

    2018-01-01

    Actinidia arguta is the most basal species in a phylogenetically and economically important genus in the family Actinidiaceae. To better understand the molecular basis of the Actinidia arguta chloroplast (cp), we sequenced the complete cp genome from A. arguta using Illumina and PacBio RS II sequencing technologies. The cp genome from A. arguta was 157,611 bp in length and composed of a pair of 24,232 bp inverted repeats (IRs) separated by a 20,463 bp small single copy region (SSC) and an 88,684 bp large single copy region (LSC). Overall, the cp genome contained 113 unique genes. The cp genomes from A. arguta and three other Actinidia species from GenBank were subjected to a comparative analysis. Indel mutation events and high frequencies of base substitution were identified, and the accD and ycf2 genes showed a high degree of variation within Actinidia. Forty-seven simple sequence repeats (SSRs) and 155 repetitive structures were identified, further demonstrating the rapid evolution in Actinidia. The cp genome analysis and the identification of variable loci provide vital information for understanding the evolution and function of the chloroplast and for characterizing Actinidia population genetics. PMID:29795601
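
    The region lengths reported above describe a quadripartite layout: one large single copy region, one small single copy region, and two copies of the inverted repeat. The Python sketch below checks that these parts add up to the reported genome length; the names are illustrative assumptions, not taken from the paper.

```python
# Region lengths quoted in the abstract above (base pairs).
LSC_BP = 88_684              # large single copy region
SSC_BP = 20_463              # small single copy region
IR_BP = 24_232               # one inverted repeat; the genome carries a pair
REPORTED_TOTAL_BP = 157_611  # reported length of the A. arguta cp genome


def quadripartite_length(lsc_bp: int, ssc_bp: int, ir_bp: int) -> int:
    """Total length of a genome made of LSC + SSC + two IR copies (hypothetical helper)."""
    return lsc_bp + ssc_bp + 2 * ir_bp


if __name__ == "__main__":
    total = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
    status = "matches" if total == REPORTED_TOTAL_BP else "differs from"
    print(f"sum of regions: {total} bp ({status} the reported total)")
```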

  9. Novel recA-Independent Horizontal Gene Transfer in Escherichia coli K-12.

    PubMed

    Kingston, Anthony W; Roussel-Rossin, Chloé; Dupont, Claire; Raleigh, Elisabeth A

    2015-01-01

    In bacteria, mechanisms that incorporate DNA into a genome without strand-transfer proteins such as RecA play a major role in generating novelty by horizontal gene transfer. We describe a new illegitimate recombination event in Escherichia coli K-12: RecA-independent homologous replacements, with very large (megabase-length) donor patches replacing recipient DNA. A previously uncharacterized gene (yjiP) increases the frequency of RecA-independent replacement recombination. To show this, we used conjugal DNA transfer, combining a classical conjugation donor, HfrH, with modern genome engineering methods and whole genome sequencing analysis to enable interrogation of genetic dependence of integration mechanisms and characterization of recombination products. As in classical experiments, genomic DNA transfer begins at a unique position in the donor, entering the recipient via conjugation; antibiotic resistance markers are then used to select recombinant progeny. Different configurations of this system were used to compare known mechanisms for stable DNA incorporation, including homologous recombination, F'-plasmid formation, and genome duplication. A genome island of interest known as the immigration control region was specifically replaced in a minority of recombinants, at a frequency of 3 × 10^-12 CFU/recipient per hour.

  10. Frequent genomic imbalances suggest commonly altered tumour genes in human hepatocarcinogenesis

    PubMed Central

    Niketeghad, F; Decker, H J; Caselmann, W H; Lund, P; Geissler, F; Dienes, H P; Schirmacher, P

    2001-01-01

    Hepatocellular carcinoma (HCC) is one of the most frequently occurring malignant tumours worldwide, but molecular changes of tumour DNA, with the exception of viral integrations and p53 mutations, are poorly understood. In order to search for common macro-imbalances of genomic tumour DNA, 21 HCCs and 3 HCC-cell lines were characterized by comparative genomic hybridization (CGH), subsequent database analyses and in selected cases by fluorescence in situ hybridization (FISH). Chromosomal subregions of 1q, 8q, 17q and 20q showed frequent gains of genomic material, while losses were most prevalent in subregions of 4q, 6q, 13q and 16q. Deleted regions encompass tumour suppressor genes, like RB-1 and the cadherin gene cluster, some of them previously identified as potential target genes in HCC development. Several potential growth- or transformation-promoting genes are located in chromosomal subregions that showed frequent gains of genomic material. The present study provides a basis for further genomic and expression analyses in HCCs and in addition suggests that chromosome 4q carries a so far unidentified tumour suppressor gene relevant for HCC development. © 2001 Cancer Research Campaign http://www.bjcancer.com PMID:11531255

  11. Arpeggio: harmonic compression of ChIP-seq data reveals protein-chromatin interaction signatures

    PubMed Central

    Stanton, Kelly Patrick; Parisi, Fabio; Strino, Francesco; Rabin, Neta; Asp, Patrik; Kluger, Yuval

    2013-01-01

    Researchers generating new genome-wide data in an exploratory sequencing study can gain biological insights by comparing their data with well-annotated data sets possessing similar genomic patterns. Data compression techniques are needed for efficient comparisons of a new genomic experiment with large repositories of publicly available profiles. Furthermore, data representations that allow comparisons of genomic signals from different platforms and across species enhance our ability to leverage these large repositories. Here, we present a signal processing approach that characterizes protein–chromatin interaction patterns at length scales of several kilobases. This allows us to efficiently compare numerous chromatin-immunoprecipitation sequencing (ChIP-seq) data sets consisting of many types of DNA-binding proteins collected from a variety of cells, conditions and organisms. Importantly, these interaction patterns broadly reflect the biological properties of the binding events. To generate these profiles, termed Arpeggio profiles, we applied harmonic deconvolution techniques to the autocorrelation profiles of the ChIP-seq signals. We used 806 publicly available ChIP-seq experiments and showed that Arpeggio profiles with similar spectral densities shared biological properties. Arpeggio profiles of ChIP-seq data sets revealed characteristics that are not easily detected by standard peak finders. They also allowed us to relate sequencing data sets from different genomes, experimental platforms and protocols. Arpeggio is freely available at http://sourceforge.net/p/arpeggio/wiki/Home/. PMID:23873955
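
    The abstract above describes applying harmonic analysis to the autocorrelation of a ChIP-seq signal. The Python sketch below illustrates only that general idea and is not the Arpeggio implementation: it computes a normalized autocorrelation of a toy coverage track and then a plain periodogram (a naive discrete Fourier transform) as a stand-in for harmonic deconvolution. All function names and the toy signal are assumptions made for illustration.

```python
import cmath
import math


def autocorrelation(signal, max_lag):
    """Normalized autocorrelation of a 1-D signal for lags 0..max_lag."""
    n = len(signal)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(signal)")
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    variance_sum = sum(c * c for c in centered)
    if variance_sum == 0:
        raise ValueError("signal is constant; autocorrelation is undefined")
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / variance_sum
        for lag in range(max_lag + 1)
    ]


def power_spectrum(values):
    """Squared magnitude of a naive DFT for frequency bins 0..len(values)//2."""
    n = len(values)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def dominant_period(signal, max_lag):
    """Period (in positions) of the strongest non-zero frequency in the autocorrelation."""
    acf = autocorrelation(signal, max_lag)
    spectrum = power_spectrum(acf)
    best_bin = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
    return len(acf) / best_bin


if __name__ == "__main__":
    # Toy coverage track with a repeating pattern every 10 positions (made-up data).
    toy_signal = [1.0 + math.cos(2 * math.pi * i / 10) for i in range(200)]
    print("dominant period:", dominant_period(toy_signal, max_lag=99))
```

    Working on the autocorrelation rather than the raw signal makes the result independent of where along the genome the pattern starts, which is one reason such profiles can be compared across experiments.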

  12. Arpeggio: harmonic compression of ChIP-seq data reveals protein-chromatin interaction signatures.

    PubMed

    Stanton, Kelly Patrick; Parisi, Fabio; Strino, Francesco; Rabin, Neta; Asp, Patrik; Kluger, Yuval

    2013-09-01

    Researchers generating new genome-wide data in an exploratory sequencing study can gain biological insights by comparing their data with well-annotated data sets possessing similar genomic patterns. Data compression techniques are needed for efficient comparisons of a new genomic experiment with large repositories of publicly available profiles. Furthermore, data representations that allow comparisons of genomic signals from different platforms and across species enhance our ability to leverage these large repositories. Here, we present a signal processing approach that characterizes protein-chromatin interaction patterns at length scales of several kilobases. This allows us to efficiently compare numerous chromatin-immunoprecipitation sequencing (ChIP-seq) data sets consisting of many types of DNA-binding proteins collected from a variety of cells, conditions and organisms. Importantly, these interaction patterns broadly reflect the biological properties of the binding events. To generate these profiles, termed Arpeggio profiles, we applied harmonic deconvolution techniques to the autocorrelation profiles of the ChIP-seq signals. We used 806 publicly available ChIP-seq experiments and showed that Arpeggio profiles with similar spectral densities shared biological properties. Arpeggio profiles of ChIP-seq data sets revealed characteristics that are not easily detected by standard peak finders. They also allowed us to relate sequencing data sets from different genomes, experimental platforms and protocols. Arpeggio is freely available at http://sourceforge.net/p/arpeggio/wiki/Home/.

  13. Genome-wide identification and characterization of microRNAs differentially expressed in fibers in a cotton phytochrome A1 RNAi line

    USDA-ARS's Scientific Manuscript database

    Silencing phytochrome A1 gene (PHYA1) by RNA interference in Upland cotton (Gossypium hirsutum L. cv. Coker 312) had generated PHYA1 RNAi lines with simultaneously improved fiber quality (longer, stronger and finer fiber) and other key agronomic traits. Comparative analyses of altered molecular proc...

  14. Characterization of wood decay enzymes by MALDI-MS for post-translational modification and gene identification.

    Treesearch

    Theodorus H. de Koker; Philip J. Kersten

    2002-01-01

    The recent sequencing of the Phanerochaete chrysosporium genome presents many opportunities, including the possibility of rapidly correlating specific wood decay proteins of the fungus with the corresponding gene sequences. Here we compare mass fragments of trypsin digests, determined by MALDI-MS (Matrix Assisted Laser Desorption Ionization-Mass Spectrometry), with...

  15. Array-CGH Analysis in a Cohort of Phenotypically Well-Characterized Individuals with "Essential" Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Napoli, Eleonora; Russo, Serena; Casula, Laura; Alesi, Viola; Amendola, Filomena Alessandra; Angioni, Adriano; Novelli, Antonio; Valeri, Giovanni; Menghini, Deny; Vicari, Stefano

    2018-01-01

    Copy-number variants (CNVs) are associated with susceptibility to autism spectrum disorder (ASD). To detect the presence of CNVs, we conducted an array-comparative genomic hybridization (array-CGH) analysis in 133 children with "essential" ASD phenotype. Genetic analyses documented that 12 children had causative CNVs (C-CNVs), 29…

  16. Transposable element junctions in marker development and genomic characterization of barley

    USDA-ARS's Scientific Manuscript database

    Barley is a model plant in genomic studies of Triticeae species. A complete barley genome sequence will facilitate not only barley breeding programs, but also those for related species. However, the large genome size and high repetitive sequence content complicate the barley genome assembly. The ma...

  17. Structure and variation of the mitochondrial genome of fishes.

    PubMed

    Satoh, Takashi P; Miya, Masaki; Mabuchi, Kohji; Nishida, Mutsumi

    2016-09-07

    The mitochondrial (mt) genome has been used as an effective tool for phylogenetic and population genetic analyses in vertebrates. However, the structure and variability of the vertebrate mt genome are not well understood. A potential strategy for improving our understanding is to conduct a comprehensive comparative study of large mt genome data. The aim of this study was to characterize the structure and variability of the fish mt genome through comparative analysis of large datasets. An analysis of the secondary structure of proteins for 250 fish species (248 ray-finned and 2 cartilaginous fishes) illustrated that cytochrome c oxidase subunits (COI, COII, and COIII) and a cytochrome bc1 complex subunit (Cyt b) had substantial amino acid conservation. Among the four proteins, COI was the most conserved, as more than half of all amino acid sites were invariable among the 250 species. Our models identified 43 and 58 stems within 12S rRNA and 16S rRNA, respectively, with larger numbers than proposed previously for vertebrates. The models also identified 149 and 319 invariable sites in 12S rRNA and 16S rRNA, respectively, in all fishes. In particular, the present result verified that a region corresponding to the peptidyl transferase center in prokaryotic 23S rRNA, which is homologous to mt 16S rRNA, is also conserved in fish mt 16S rRNA. Concerning the gene order, we found 35 variations (in 32 families) that deviated from the common gene order in vertebrates. These gene rearrangements were mostly observed in the area spanning the ND5 gene to the control region as well as two tRNA gene cluster regions (IQM and WANCY regions). Although many of such gene rearrangements were unique to a specific taxon, some were shared polyphyletically between distantly related species. Through a large-scale comparative analysis of 250 fish species mt genomes, we elucidated various structural aspects of the fish mt genome and the encoded genes. The present results will be important for understanding functions of the mt genome and developing programs for nucleotide sequence analysis. This study demonstrated the significance of extensive comparisons for understanding the structure of the mt genome.

  18. Ninety-nine de novo assembled genomes from the moose (Alces alces) rumen microbiome provide new insights into microbial plant biomass degradation

    PubMed Central

    Svartström, Olov; Alneberg, Johannes; Terrapon, Nicolas; Lombard, Vincent; de Bruijn, Ino; Malmsten, Jonas; Dalin, Ann-Marie; Muller, Emilie E.L.; Shah, Pranjul; Wilmes, Paul; Henrissat, Bernard; Aspeborg, Henrik; Andersson, Anders F.

    2017-01-01

    The moose (Alces alces) is a ruminant that harvests energy from fiber-rich lignocellulose material through carbohydrate-active enzymes (CAZymes) produced by its rumen microbes. We applied shotgun metagenomics to rumen contents from six moose to obtain insights into this microbiome. Following binning, 99 metagenome-assembled genomes (MAGs) belonging to eleven prokaryotic phyla were reconstructed and characterized based on phylogeny and CAZyme profile. The taxonomy of these MAGs reflected the overall composition of the metagenome, with dominance of the phyla Bacteroidetes and Firmicutes. Unlike in other ruminants, Spirochaetes constituted a significant proportion of the community and our analyses indicate that the corresponding strains are primarily pectin digesters. Pectin-degrading genes were also common in MAGs of Ruminococcus, Fibrobacteres and Bacteroidetes, and were overall overrepresented in the moose microbiome compared to other ruminants. Phylogenomic analyses revealed several clades within the Bacteroidetes without previously characterized genomes. Several of these MAGs encoded a large number of dockerins, a module usually associated with cellulosomes. The Bacteroidetes dockerins were often linked to CAZymes and sometimes encoded inside polysaccharide utilization loci (PULs), which has never been reported before. The almost one hundred CAZyme-annotated genomes reconstructed in this study provide an in-depth view of an efficient lignocellulose-degrading microbiome and prospects for developing enzyme technology for biorefineries. PMID:28731473

  19. Mining virulence genes using metagenomics.

    PubMed

    Belda-Ferre, Pedro; Cabrera-Rubio, Raúl; Moya, Andrés; Mira, Alex

    2011-01-01

    When a bacterial genome is compared to the metagenome of an environment it inhabits, most genes recruit at high sequence identity. In free-living bacteria (for instance marine bacteria compared against the ocean metagenome) certain genomic regions are totally absent in recruitment plots, representing therefore genes unique to individual bacterial isolates. We show that these Metagenomic Islands (MIs) are also visible in bacteria living in human hosts when their genomes are compared to sequences from the human microbiome, despite the compartmentalized structure of human-related environments such as the gut. From an applied point of view, MIs of human pathogens (e.g. those identified in enterohaemorrhagic Escherichia coli against the gut metagenome or in pathogenic Neisseria meningitidis against the oral metagenome) include virulence genes that appear to be absent in related strains or species present in the microbiome of healthy individuals. We propose that this strategy (i.e. recruitment analysis of pathogenic bacteria against the metagenome of healthy subjects) can be used to detect pathogenicity regions in species where the genes involved in virulence are poorly characterized. Using this approach, we detect well-known pathogenicity islands and identify new potential virulence genes in several human pathogens.

  20. The Human Genome Initiative of the Department of Energy

    DOE R&D Accomplishments Database

    1988-01-01

    The structural characterization of genes and elucidation of their encoded functions have become a cornerstone of modern health research, biology and biotechnology. A genome program is an organized effort to locate and identify the functions of all the genes of an organism. Beginning with the DOE-sponsored, 1986 human genome workshop at Santa Fe, the value of broadly organized efforts supporting total genome characterization became a subject of intensive study. There is now national recognition that benefits will rapidly accrue from an effective scientific infrastructure for total genome research. In the US genome research is now receiving dedicated funds. Several other nations are implementing genome programs. Supportive infrastructure is being improved through both national and international cooperation. The Human Genome Initiative of the Department of Energy (DOE) is a focused program of Resource and Technology Development, with objectives of speeding and bringing economies to the national human genome effort. This report relates the origins and progress of the Initiative.

  1. Genome-wide comparison of paired fresh frozen and formalin-fixed paraffin-embedded gliomas by custom BAC and oligonucleotide array comparative genomic hybridization: facilitating analysis of archival gliomas.

    PubMed

    Mohapatra, Gayatry; Engler, David A; Starbuck, Kristen D; Kim, James C; Bernay, Derek C; Scangas, George A; Rousseau, Audrey; Batchelor, Tracy T; Betensky, Rebecca A; Louis, David N

    2011-04-01

    Array comparative genomic hybridization (aCGH) is a powerful tool for detecting DNA copy number alterations (CNA). Because diffuse malignant gliomas are often sampled by small biopsies, formalin-fixed paraffin-embedded (FFPE) blocks are often the only tissue available for genetic analysis; FFPE tissues are also needed to study the intratumoral heterogeneity that characterizes these neoplasms. In this paper, we present a combination of evaluations and technical advances that provide strong support for the ready use of oligonucleotide aCGH on FFPE diffuse gliomas. We first compared aCGH using bacterial artificial chromosome (BAC) arrays in 45 paired frozen and FFPE gliomas, and demonstrate a high concordance rate between FFPE and frozen DNA in an individual clone-level analysis of sensitivity and specificity, assuring that under certain array conditions, frozen and FFPE DNA can perform nearly identically. However, because oligonucleotide arrays offer advantages to BAC arrays in genomic coverage and practical availability, we next developed a method of labeling DNA from FFPE tissue that allows efficient hybridization to oligonucleotide arrays. To demonstrate utility in FFPE tissues, we applied this approach to biphasic anaplastic oligoastrocytomas and demonstrate CNA differences between DNA obtained from the two components. Therefore, BAC and oligonucleotide aCGH can be sensitive and specific tools for detecting CNAs in FFPE DNA, and novel labeling techniques enable the routine use of oligonucleotide arrays for FFPE DNA. In combination, these advances should facilitate genome-wide analysis of rare, small and/or histologically heterogeneous gliomas from FFPE tissues.

  2. Estimating true evolutionary distances under the DCJ model.

    PubMed

    Lin, Yu; Moret, Bernard M E

    2008-07-01

    Modern techniques can yield the ordering and strandedness of genes on each chromosome of a genome; such data already exists for hundreds of organisms. The evolutionary mechanisms through which the set of the genes of an organism is altered and reordered are of great interest to systematists, evolutionary biologists, comparative genomicists and biomedical researchers. Perhaps the most basic concept in this area is that of evolutionary distance between two genomes: under a given model of genomic evolution, how many events most likely took place to account for the difference between the two genomes? We present a method to estimate the true evolutionary distance between two genomes under the 'double-cut-and-join' (DCJ) model of genome rearrangement, a model under which a single multichromosomal operation accounts for all genomic rearrangement events: inversion, transposition, translocation, block interchange and chromosomal fusion and fission. Our method relies on a simple structural characterization of a genome pair and is both analytically and computationally tractable. We provide analytical results to describe the asymptotic behavior of genomes under the DCJ model, as well as experimental results on a wide variety of genome structures to exemplify the very high accuracy (and low variance) of our estimator. Our results provide a tool for accurate phylogenetic reconstruction from multichromosomal gene rearrangement data as well as a theoretical basis for refinements of the DCJ model to account for biological constraints. All of our software is available in source form under GPL at http://lcbb.epfl.ch.
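
    For orientation, the DCJ model mentioned above has a well-known closed form for the minimum (edit) distance between two genomes with the same genes, all on circular chromosomes: the number of genes minus the number of cycles in the adjacency graph. The sketch below computes that minimum distance only. It is not the true-distance estimator the abstract describes, and the genome encoding (lists of signed integers, one list per circular chromosome, no duplicated genes) and function names are assumptions chosen for the example.

```python
def adjacency_partners(genome):
    """Map each gene extremity to the extremity it is adjacent to.

    genome: list of circular chromosomes, each a list of signed ints.
    Assumes every gene occurs exactly once in the genome.
    """
    partner = {}
    for chrom in genome:
        for i, gene in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]
            # Right end of `gene` in reading direction: head if +, tail if -.
            right = (abs(gene), "h" if gene > 0 else "t")
            # Left end of the following gene: tail if +, head if -.
            left = (abs(nxt), "t" if nxt > 0 else "h")
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance(genome_a, genome_b):
    """Minimum DCJ distance N - C for circular genomes with equal gene content."""
    pa = adjacency_partners(genome_a)
    pb = adjacency_partners(genome_b)
    if pa.keys() != pb.keys():
        raise ValueError("genomes must contain the same genes")
    n_genes = len(pa) // 2
    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        node = start
        # Walk the cycle, alternating adjacencies of genome A and genome B.
        while node not in seen:
            seen.add(node)
            mate = pa[node]
            seen.add(mate)
            node = pb[mate]
    return n_genes - cycles


print(dcj_distance([[1, 2, 3]], [[1, -2, 3]]))  # a single inversion
```

    Linear chromosomes add paths to the adjacency graph and change the formula, so they are deliberately left out of this sketch.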

  3. Characterization of transposable elements in the ectomycorrhizal fungus Laccaria bicolor.

    PubMed

    Labbé, Jessy; Murat, Claude; Morin, Emmanuelle; Tuskan, Gerald A; Le Tacon, François; Martin, Francis

    2012-01-01

    The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial copy elements distributed within 171 families. The most abundant elements were the Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs exhibits signs of ancient transposition except some intact copies of terminal inverted repeats (TIRS), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TE expansion in L. bicolor: the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 0.5 Mya ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. This analysis 1) represents an initial characterization of TEs in the L. bicolor genome, 2) contributes to improve genome annotation and a greater understanding of the role TEs played in genome organization and evolution and 3) provides a valuable resource for future research on the genome evolution within the Laccaria genus.

  4. Characterization of Transposable Elements in the Ectomycorrhizal Fungus Laccaria bicolor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Labbe, Jessy L; Murat, Claude; Morin, Emmanuelle

    2012-01-01

    Background: The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. Methodology/Principal Findings: TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial copy elements distributed within 171 families. The most abundant elements were the Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs exhibits signs of ancient transposition except some intact copies of terminal inverted repeats (TIRS), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TE expansion in L. bicolor: the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 0.5 Mya ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. Conclusions: This analysis 1) represents an initial characterization of TEs in the L. bicolor genome, 2) contributes to improve genome annotation and a greater understanding of the role TEs played in genome organization and evolution and 3) provides a valuable resource for future research on the genome evolution within the Laccaria genus.

  5. Draft genome sequence of bitter gourd (Momordica charantia), a vegetable and medicinal plant in tropical and subtropical regions.

    PubMed

    Urasaki, Naoya; Takagi, Hiroki; Natsume, Satoshi; Uemura, Aiko; Taniai, Naoki; Miyagi, Norimichi; Fukushima, Mai; Suzuki, Shouta; Tarora, Kazuhiko; Tamaki, Moritoshi; Sakamoto, Moriaki; Terauchi, Ryohei; Matsumura, Hideo

    2017-02-01

    Bitter gourd (Momordica charantia) is an important vegetable and medicinal plant in tropical and subtropical regions globally. In this study, the draft genome sequence of a monoecious bitter gourd inbred line, OHB3-1, was analyzed. Through Illumina sequencing and de novo assembly, scaffolds of 285.5 Mb in length were generated, corresponding to ∼84% of the estimated genome size of bitter gourd (339 Mb). In this draft genome sequence, 45,859 protein-coding gene loci were identified, and transposable elements accounted for 15.3% of the whole genome. According to synteny mapping and phylogenetic analysis of conserved genes, bitter gourd was more related to watermelon (Citrullus lanatus) than to cucumber (Cucumis sativus) or melon (C. melo). Using RAD-seq analysis, 1507 marker loci were genotyped in an F2 progeny of two bitter gourd lines, resulting in an improved linkage map, comprising 11 linkage groups. By anchoring RAD tag markers, 255 scaffolds were assigned to the linkage map. Comparative analysis of genome sequences and predicted genes determined that putative trypsin-inhibitor and ribosome-inactivating genes were distinctive in the bitter gourd genome. These genes could characterize the bitter gourd as a medicinal plant. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  6. Bacteriophages of Gordonia spp. Display a Spectrum of Diversity and Genetic Relationships

    PubMed Central

    Pope, Welkin H.; Mavrich, Travis N.; Garlena, Rebecca A.; Guerrero-Bustamante, Carlos A.; Jacobs-Sera, Deborah; Montgomery, Matthew T.; Russell, Daniel A.; Warner, Marcie H.

    2017-01-01

    ABSTRACT The global bacteriophage population is large, dynamic, old, and highly diverse genetically. Many phages are tailed and contain double-stranded DNA, but these remain poorly characterized genomically. A collection of over 1,000 phages infecting Mycobacterium smegmatis reveals the diversity of phages of a common bacterial host, but their relationships to phages of phylogenetically proximal hosts are not known. Comparative sequence analysis of 79 phages isolated on Gordonia shows these also to be diverse and that the phages can be grouped into 14 clusters of related genomes, with an additional 14 phages that are “singletons” with no closely related genomes. One group of six phages is closely related to Cluster A mycobacteriophages, but the other Gordonia phages are distant relatives and share only 10% of their genes with the mycobacteriophages. The Gordonia phage genomes vary in genome length (17.1 to 103.4 kb), percentage of GC content (47 to 68.8%), and genome architecture and contain a variety of features not seen in other phage genomes. Like the mycobacteriophages, the highly mosaic Gordonia phages demonstrate a spectrum of genetic relationships. We show this is a general property of bacteriophages and suggest that any barriers to genetic exchange are soft and readily violable. PMID:28811342
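
    The GC content range quoted above (47 to 68.8%) is a simple per-genome statistic. A minimal sketch of how such a percentage is typically computed follows; the function name is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption, since the abstract does not say how they were handled.

```python
def gc_percent(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


print(gc_percent("GGCCAATT"))
```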

  7. The genome sequence of Dyella jiangningensis FCAV SCS01 from a lignocellulose-decomposing microbial consortium metagenome reveals potential for biotechnological applications.

    PubMed

    Desiderato, Joana G; Alvarenga, Danillo O; Constancio, Milena T L; Alves, Lucia M C; Varani, Alessandro M

    2018-05-14

    Cellulose and its associated polymers are structural components of the plant cell wall, constituting one of the major sources of carbon and energy in nature. The carbon cycle is dependent on cellulose- and lignin-decomposing microbial communities and their enzymatic systems acting as consortia. These microbial consortia are under constant exploration for their potential biotechnological use. Herein, we describe the characterization of the genome of Dyella jiangningensis FCAV SCS01, recovered from the metagenome of a lignocellulose-degrading microbial consortium, which was isolated from a sugarcane crop soil under mechanical harvesting and covered by decomposing straw. The 4.7 Mbp genome encodes 4,194 proteins, including 36 glycoside hydrolases (GH), supporting the hypothesis that this bacterium may contribute to lignocellulose decomposition. Comparative analysis among fully sequenced Dyella species indicate that the genome synteny is not conserved, and that D. jiangningensis FCAV SCS01 carries 372 unique genes, including an alpha-glucosidase and maltodextrin glucosidase coding genes, and other potential biomass degradation related genes. Additional genomic features, such as prophage-like, genomic islands and putative new biosynthetic clusters were also uncovered. Overall, D. jiangningensis FCAV SCS01 represents the first South American Dyella genome sequenced and shows an exclusive feature among its genus, related to biomass degradation.

  8. Insights into archaeal evolution and symbiosis from the genomes of a Nanoarchaeon and its crenarchaeal host from Yellowstone National Park

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Podar, Mircea; Graham, David E; Reysenbach, Anna-Louise

    A hyperthermophilic member of the Nanoarchaeota from Obsidian Pool, a thermal feature in Yellowstone National Park was characterized using single cell isolation and sequencing, together with its putative host, a Sulfolobales archaeon. This first representative of a non-marine Nanoarchaeota (Nst1) resembles Nanoarchaeum equitans by lacking most biosynthetic capabilities, the two forming a deep-branching archaeal lineage. However, the Nst1 genome is over 20% larger, encodes a complete gluconeogenesis pathway and a full complement of archaeal flagellum proteins. Comparison of the two genomes suggests that the marine and terrestrial Nanoarchaeota lineages share a common ancestor that was already a symbiont of another archaeon. With a larger genome, a smaller repertoire of split protein encoding genes and no split non-contiguous tRNAs, Nst1 appears to have experienced less severe genome reduction than N. equitans. The inferred host of Nst1 is potentially autotrophic, with a streamlined genome and simplified central and energetic metabolism as compared to other Sulfolobales. The two distinct Nanoarchaeota-host genomic data sets offer insights into the evolution of archaeal symbiosis and parasitism and will further enable studies of the cellular and molecular mechanisms of these relationships.

  9. Lactobacillus iners: Friend or Foe?

    PubMed

    Petrova, Mariya I; Reid, Gregor; Vaneechoutte, Mario; Lebeer, Sarah

    2017-03-01

    The vaginal microbial community is typically characterized by abundant lactobacilli. Lactobacillus iners, a fairly recently detected species, is frequently present in the vaginal niche. However, the role of this species in vaginal health is unclear, since it can be detected in normal conditions as well as during vaginal dysbiosis, such as bacterial vaginosis, a condition characterized by an abnormal increase in bacterial diversity and lack of typical lactobacilli. Compared to other Lactobacillus species, L. iners has more complex nutritional requirements and a Gram-variable morphology. L. iners has an unusually small genome (ca. 1 Mbp), indicative of a symbiotic or parasitic lifestyle, in contrast to other lactobacilli that show niche flexibility and genomes of up to 3-4 Mbp. The presence of specific L. iners genes, such as those encoding iron-sulfur proteins and unique σ-factors, reflects a high degree of niche specification. The genome of L. iners strains also encodes inerolysin, a pore-forming toxin related to vaginolysin of Gardnerella vaginalis. Possibly, this organism may have clonal variants that in some cases promote a healthy vagina, and in other cases are associated with dysbiosis and disease. Future research should examine this friend or foe relationship with the host. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Characterization of the Complete Mitochondrial Genome Sequence of Spirometra erinaceieuropaei (Cestoda: Diphyllobothriidae) from China

    PubMed Central

    Liu, Guo-Hua; Li, Chun; Li, Jia-Yuan; Zhou, Dong-Hui; Xiong, Rong-Chuan; Lin, Rui-Qing; Zou, Feng-Cai; Zhu, Xing-Quan

    2012-01-01

    Sparganosis, caused by the plerocercoid larvae of members of the genus Spirometra, can cause significant public health problems and considerable economic losses. In the present study, the complete mitochondrial DNA (mtDNA) sequence of Spirometra erinaceieuropaei from China was determined, characterized and compared with that of S. erinaceieuropaei from Japan. The gene arrangement in the mt genome sequences of S. erinaceieuropaei from China and Japan is identical. The identity of the mt genomes was 99.1% between S. erinaceieuropaei from China and Japan, and the complete mtDNA sequence of S. erinaceieuropaei from China is slightly shorter (2 bp) than that from Japan. Phylogenetic analysis of S. erinaceieuropaei with other representative cestodes using two different computational algorithms [Bayesian inference (BI) and maximum likelihood (ML)] based on concatenated amino acid sequences of 12 protein-coding genes, revealed that S. erinaceieuropaei is closely related to Diphyllobothrium spp., supporting classification based on morphological features. The present study determined the complete mtDNA sequence of S. erinaceieuropaei from China, which provides novel genetic markers for studying the population genetics and molecular epidemiology of S. erinaceieuropaei in humans and animals. PMID:22553464

  11. Genomic and Phenotypic Characterization of Yeast Biosensor for Deep-space Radiation

    NASA Technical Reports Server (NTRS)

    Marina, Diana B.; Santa Maria, Sergio; Bhattacharya, Sharmila

    2016-01-01

    The BioSentinel mission was selected to launch as a secondary payload onboard NASA Exploration Mission 1 (EM-1) in 2018. In BioSentinel, the budding yeast Saccharomyces cerevisiae will be used as a biosensor to measure the long-term impact of deep-space radiation to living organisms. In the 4U-payload, desiccated yeast cells from different strains will be stored inside microfluidic cards equipped with 3-color LED optical detection system to monitor cell growth and metabolic activity. At different times throughout the 12-month mission, these cards will be filled with liquid yeast growth media to rehydrate and grow the desiccated cells. The growth and metabolic rates of wild-type and radiation-sensitive strains in deep-space radiation environment will be compared to the rates measured in the ground- and microgravity-control units. These rates will also be correlated with measurements obtained from onboard physical dosimeters. In our preliminary long-term desiccation study, we found that air-drying yeast cells in 10% trehalose is the best method of cell preservation in order to survive the entire 18-month mission duration (6-month pre-launch plus 12-month full-mission periods). However, our study also revealed that desiccated yeast cells have decreasing viability over time when stored in payload-like environment. This suggests that the yeast biosensor will have different population of cells at different time points during the long-term mission. In this study, we are characterizing genomic and phenotypic changes in our yeast biosensor due to long-term storage and desiccation. For each yeast strain that will be part of the biosensor, several clones were reisolated after long-term storage by desiccation. These clones were compared to their respective original isolate in terms of genomic composition, desiccation tolerance and radiation sensitivity. Interestingly, clones from a radiation-sensitive mutant have better desiccation tolerance compared to their original isolate without losing radiation sensitivity. We employed Next-Generation Sequencing technology to better understand this phenotypic variation. Current effort is focusing on the analysis of high-throughput sequencing data to look for genomic changes in these reisolated clones compared to their original isolate.

  12. Characterization of Capsicum annuum Genetic Diversity and Population Structure Based on Parallel Polymorphism Discovery with a 30K Unigene Pepper GeneChip

    PubMed Central

    Hill, Theresa A.; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W.; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal component analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome-wide transcript-based markers to assess genetic and genomic features among Capsicum annuum. PMID:23409153

  13. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    PubMed

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal component analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome-wide transcript-based markers to assess genetic and genomic features among Capsicum annuum.

  14. Modular assembly of transposable element arrays by microsatellite targeting in the guayule and rice genomes.

    PubMed

    Valdes Franco, José A; Wang, Yi; Huo, Naxin; Ponciano, Grisel; Colvin, Howard A; McMahan, Colleen M; Gu, Yong Q; Belknap, William R

    2018-04-19

    Guayule (Parthenium argentatum A. Gray) is a rubber-producing desert shrub native to Mexico and the United States. Guayule represents an alternative to Hevea brasiliensis as a source for commercial natural rubber. The efficient application of modern molecular/genetic tools to guayule improvement requires characterization of its genome. The 1.6 Gb guayule genome was sequenced, assembled and annotated. The final 1.5 Gb assembly, while fragmented (N50 = 22 kb), maps >95% of the shotgun reads and is essentially complete. Approximately 40,000 transcribed, protein-encoding genes were annotated on the assembly. Further characterization of this genome revealed 15 families of small, microsatellite-associated, transposable elements (TEs) with unexpected chromosomal distribution profiles. These SaTar (Satellite Targeted) elements, which are non-autonomous Mu-like elements (MULEs), were frequently observed in multimeric linear arrays of unrelated individual elements within which no individual element is interrupted by another. This uniformly non-nested TE multimer architecture has not been previously described in either eukaryotic or prokaryotic genomes. Five families of similarly distributed non-autonomous MULEs (microsatellite associated, modularly assembled) were characterized in the rice genome. Families of TEs with similar structures and distribution profiles were identified in sorghum and citrus. The sequencing and assembly of the guayule genome provides a foundation for application of current crop improvement technologies to this plant. In addition, characterization of this genome revealed SaTar elements with distribution profiles unique among TEs. SaTar targeting appears based on an alternative MULE recombination mechanism with the potential to impact gene evolution.
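
    The N50 statistic quoted in the record above is a standard measure of assembly contiguity: the length of the contig at which the cumulative length, taken from the longest contig downward, first reaches half of the total assembly length. The sketch below is only an illustration of that definition. The function name and the toy contig lengths are hypothetical and are not taken from the guayule study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

    A low N50 relative to genome size, as reported for this assembly, indicates many short contigs even when most reads map back to the assembly.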

  15. Systems approach to characterize the metabolism of liver cancer stem cells expressing CD133

    NASA Astrophysics Data System (ADS)

    Hur, Wonhee; Ryu, Jae Yong; Kim, Hyun Uk; Hong, Sung Woo; Lee, Eun Byul; Lee, Sang Yup; Yoon, Seung Kew

    2017-04-01

    Liver cancer stem cells (LCSCs) have attracted attention because they cause therapeutic resistance in hepatocellular carcinoma (HCC). Understanding the metabolism of LCSCs can be a key to developing therapeutic strategies, but their metabolic characteristics have not yet been studied. Here, we systematically analyzed and compared the global metabolic phenotype between LCSCs and non-LCSCs using transcriptome and metabolome data. We also reconstructed genome-scale metabolic models (GEMs) for LCSCs and non-LCSCs to comparatively examine differences in their metabolism at genome scale. We demonstrated that LCSCs exhibited an increased proliferation rate through enhancing glycolysis compared with non-LCSCs. We also confirmed that MYC, a central point of regulation in cancer metabolism, was significantly up-regulated in LCSCs compared with non-LCSCs. Moreover, LCSCs tend to have less active fatty acid oxidation. In this study, the metabolic characteristics of LCSCs were identified using integrative systems analysis, and these characteristics could be potential cures for the resistance of liver cancer cells to anticancer treatments.

  16. Comparative genomic analysis reveals 2-oxoacid dehydrogenase complex lipoylation correlation with aerobiosis in archaea.

    PubMed

    Borziak, Kirill; Posner, Mareike G; Upadhyay, Abhishek; Danson, Michael J; Bagby, Stefan; Dorus, Steve

    2014-01-01

    Metagenomic analyses have advanced our understanding of ecological microbial diversity, but to what extent can metagenomic data be used to predict the metabolic capacity of difficult-to-study organisms and their abiotic environmental interactions? We tackle this question, using a comparative genomic approach, by considering the molecular basis of aerobiosis within archaea. Lipoylation, the covalent attachment of lipoic acid to 2-oxoacid dehydrogenase multienzyme complexes (OADHCs), is essential for metabolism in aerobic bacteria and eukarya. Lipoylation is catalysed either by lipoate protein ligase (LplA), which in archaea is typically encoded by two genes (LplA-N and LplA-C), or by a lipoyl(octanoyl) transferase (LipB or LipM) plus a lipoic acid synthetase (LipA). Does the genomic presence of lipoylation and OADHC genes across archaea from diverse habitats correlate with aerobiosis? First, analyses of 11,826 biotin protein ligase (BPL)-LplA-LipB transferase family members and 147 archaeal genomes identified 85 species with lipoylation capabilities and provided support for multiple ancestral acquisitions of lipoylation pathways during archaeal evolution. Second, with the exception of the Sulfolobales order, the majority of species possessing lipoylation systems exclusively retain LplA, or either LipB or LipM, consistent with archaeal genome streamlining. Third, obligate anaerobic archaea display widespread loss of lipoylation and OADHC genes. Conversely, a high level of correspondence is observed between aerobiosis and the presence of LplA/LipB/LipM, LipA and OADHC E2, consistent with the role of lipoylation in aerobic metabolism. This correspondence between OADHC lipoylation capacity and aerobiosis indicates that genomic pathway profiling in archaea is informative and that well characterized pathways may be predictive in relation to abiotic conditions in difficult-to-study extremophiles. Given the highly variable retention of gene repertoires across the archaea, the extension of comparative genomic pathway profiling to broader metabolic and homeostasis networks should be useful in revealing characteristics from metagenomic datasets related to adaptations to diverse environments.

  17. Evolutionary Dynamics of Microsatellite Distribution in Plants: Insight from the Comparison of Sequenced Brassica, Arabidopsis and Other Angiosperm Species

    PubMed Central

    Shi, Jiaqin; Huang, Shunmou; Fu, Donghui; Yu, Jinyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2013-01-01

    Despite their ubiquity and functional importance, microsatellites have been largely ignored in comparative genomics, mostly due to the lack of genomic information. In the current study, microsatellite distribution was characterized and compared in the whole genomes and both the coding and non-coding DNA sequences of the sequenced Brassica, Arabidopsis and other angiosperm species to investigate their evolutionary dynamics in plants. The variation in the microsatellite frequencies of these angiosperm species was much smaller than that in their microsatellite numbers and genome sizes, suggesting that microsatellite frequency may be relatively stable in plants. The microsatellite frequencies of these angiosperm species were significantly negatively correlated with both their genome sizes and transposable element contents. The pattern of microsatellite distribution may differ according to the different genomic regions (such as coding and non-coding sequences). The observed differences in many important microsatellite characteristics (especially the distribution with respect to motif length, type and repeat number) of these angiosperm species were generally accordant with their phylogenetic distance, which suggested that the evolutionary dynamics of microsatellite distribution may be generally consistent with plant divergence/evolution. Importantly, by comparing these microsatellite characteristics (especially the distribution with respect to motif type) the angiosperm species (aside from a few species) all clustered into two obviously different groups that were largely represented by monocots and dicots, suggesting a complex and generally dichotomous evolutionary pattern of microsatellite distribution in angiosperms. Polyploidy may lead to a slight increase in microsatellite frequency in the coding sequences and a significant decrease in microsatellite frequency in the whole genome/non-coding sequences, but have little effect on the microsatellite distribution with respect to motif length, type and repeat number. Interestingly, several microsatellite characteristics seemed to be constant in plant evolution, which can be well explained by the general biological rules. PMID:23555856

  18. Molecular characterization of the Great Lakes viral hemorrhagic septicemia virus (VHSV) isolate from USA

    PubMed Central

    Ammayappan, Arun; Vakharia, Vikram N

    2009-01-01

    Background: Viral hemorrhagic septicemia virus (VHSV) is a highly contagious viral disease of fresh and saltwater fish worldwide. VHSV caused several large scale fish kills in the Great Lakes area and has been found in 28 different host species. The emergence of VHS in the Great Lakes began with the isolation of VHSV from a diseased muskellunge (Esox masquinongy) caught from Lake St. Clair in 2003. VHSV is a member of the genus Novirhabdovirus, within the family Rhabdoviridae. It has a linear single-stranded, negative-sense RNA genome of approximately 11 kbp, with six genes. VHSV replicates in the cytoplasm and produces six monocistronic mRNAs. The gene order of VHSV is 3'-N-P-M-G-NV-L-5'. This study describes molecular characterization of the Great Lakes VHSV strain (MI03GL), and its phylogenetic relationships with selected European and North American isolates. Results: The complete genomic sequence of the VHSV-MI03GL strain was determined from cloned cDNA of six overlapping fragments, obtained by RT-PCR amplification of genomic RNA. The complete genome sequence of MI03GL comprises 11,184 nucleotides (GenBank GQ385941) with the gene order of 3'-N-P-M-G-NV-L-5'. These genes are separated by conserved gene junctions, with di-nucleotide gene spacers. The first 4 nucleotides at the termini of the VHSV genome are complementary and identical to the genomic termini of other novirhabdoviruses. Sequence homology and phylogenetic analysis show that the Great Lakes virus is closely related to the Japanese strains JF00Ehi1 (96%) and KRRV9822 (95%). Among other novirhabdoviruses, VHSV shares highest sequence homology (62%) with snakehead rhabdovirus. Conclusion: The phylogenetic tree obtained by comparing 48 glycoprotein gene sequences of different VHSV strains demonstrates that the Great Lakes VHSV is closely related to the North American and Japanese genotype IVa, but forms a distinct genotype IVb, which is clearly different from the three European genotypes. Molecular characterization of the Great Lakes isolate will be helpful in studying the pathogenesis of VHSV using a reverse genetics approach and developing efficient control strategies. PMID:19852863

  19. Pulmonary Sarcomatoid Carcinomas Commonly Harbor Either Potentially Targetable Genomic Alterations or High Tumor Mutational Burden as Observed by Comprehensive Genomic Profiling.

    PubMed

    Schrock, Alexa B; Li, Shuyu D; Frampton, Garrett M; Suh, James; Braun, Eduardo; Mehra, Ranee; Buck, Steven C; Bufill, Jose A; Peled, Nir; Karim, Nagla Abdel; Hsieh, K Cynthia; Doria, Manuel; Knost, James; Chen, Rong; Ou, Sai-Hong Ignatius; Ross, Jeffrey S; Stephens, Philip J; Fishkin, Paul; Miller, Vincent A; Ali, Siraj M; Halmos, Balazs; Liu, Jane J

    2017-06-01

    Pulmonary sarcomatoid carcinoma (PSC) is a high-grade NSCLC characterized by poor prognosis and resistance to chemotherapy. Development of targeted therapeutic strategies for PSC has been hampered because of limited and inconsistent molecular characterization. Hybrid capture-based comprehensive genomic profiling was performed on DNA from formalin-fixed paraffin-embedded sections of 15,867 NSCLCs, including 125 PSCs (0.8%). Tumor mutational burden (TMB) was calculated from 1.11 megabases (Mb) of sequenced DNA. The median age of the patients with PSC was 67 years (range 32-87), 58% were male, and 78% had stage IV disease. Tumor protein p53 gene (TP53) genomic alterations (GAs) were identified in 74% of cases, which had genomics distinct from TP53 wild-type cases, and 62% featured a GA in KRAS (34%) or one of seven genes currently recommended for testing in the National Comprehensive Cancer Network NSCLC guidelines, including the following: hepatocyte growth factor receptor gene (MET) (13.6%), EGFR (8.8%), BRAF (7.2%), erb-b2 receptor tyrosine kinase 2 gene (HER2) (1.6%), and ret proto-oncogene (RET) (0.8%). MET exon 14 alterations were enriched in PSC (12%) compared with non-PSC NSCLCs (∼3%) (p < 0.0001) and were more prevalent in PSC cases with an adenocarcinoma component. The fraction of PSC with a high TMB (>20 mutations per Mb) was notably higher than in non-PSC NSCLC (20% versus 14%, p = 0.056). Of nine patients with PSC treated with targeted or immunotherapies, three had partial responses and three had stable disease. Potentially targetable GAs in National Comprehensive Cancer Network NSCLC genes (30%) or intermediate or high TMB (43%, >10 mutations per Mb) were identified in most of the PSC cases. Thus, the use of comprehensive genomic profiling in clinical care may provide important treatment options for a historically poorly characterized and difficult to treat disease. Copyright © 2017 International Association for the Study of Lung Cancer. Published by Elsevier Inc. All rights reserved.
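
    The tumor mutational burden figures in the record above are rates: a mutation count divided by the number of megabases sequenced (1.11 Mb here), compared against the quoted cut-offs of >10 and >20 mutations per Mb. The sketch below only illustrates that arithmetic. The function names, the "low" label, and the reading of "intermediate" as >10 up to 20 mutations per Mb are assumptions, and the sketch does not reproduce whatever mutation filtering the study's pipeline applied before counting.

```python
PANEL_MB = 1.11  # megabases of sequenced DNA, as stated in the abstract


def tmb_per_mb(mutation_count, panel_mb=PANEL_MB):
    """Convert a raw mutation count into mutations per megabase."""
    if panel_mb <= 0:
        raise ValueError("panel size must be positive")
    return mutation_count / panel_mb


def tmb_category(tmb):
    """Bin a TMB value using the cut-offs quoted in the abstract.

    Assumption: 'intermediate' means >10 and <=20 mutations per Mb,
    and everything at or below 10 is labelled 'low' (a label that
    does not appear in the abstract).
    """
    if tmb > 20:
        return "high"
    if tmb > 10:
        return "intermediate"
    return "low"


# Hypothetical mutation counts, for illustration only.
for count in (11, 22, 23):
    rate = tmb_per_mb(count)
    print(count, round(rate, 2), tmb_category(rate))
```

    With a 1.11 Mb panel, the >20 mutations per Mb cut-off corresponds to at least 23 counted mutations under these assumptions.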

  20. Subspecies diversity in bacteriocin production by intestinal Lactobacillus salivarius strains

    PubMed Central

    O’Shea, Eileen F.; O’Connor, Paula M.; Raftis, Emma J.; O’Toole, Paul W.; Stanton, Catherine; Cotter, Paul D.; Ross, R. Paul; Hill, Colin

    2012-01-01

    A recent comparative genomic hybridization study in our laboratory revealed considerable plasticity within the bacteriocin locus of gastrointestinal strains of Lactobacillus salivarius. Most notably, these analyses led to the identification of two novel unmodified bacteriocins, salivaricin L and salivaricin T, produced by the neonatal isolate L. salivarius DPC6488 with immunity, regulatory and export systems analogous to those of abp118, a two-component bacteriocin produced by the well characterized reference strain L. salivarius UCC118. In this addendum we discuss the intraspecific diversity of our seven bacteriocin-producing L. salivarius isolates on a genome-wide level, and more specifically, with respect to their salivaricin loci. PMID:22892690

  1. Genome-scale reconstruction of the metabolic network in Yersinia pestis CO92

    NASA Astrophysics Data System (ADS)

    Navid, Ali; Almaas, Eivind

    2007-03-01

    The gram-negative bacterium Yersinia pestis is the causative agent of bubonic plague. Using publicly available genomic, biochemical and physiological data, we have developed a constraint-based flux balance model of metabolism in the CO92 strain (biovar Orientalis) of this organism. The metabolic reactions were appropriately compartmentalized, and the model accounts for the exchange of metabolites, as well as the import of nutrients and export of waste products. We have characterized the metabolic capabilities and phenotypes of this organism, after comparing the model predictions with available experimental observations to evaluate accuracy and completeness. We have also begun preliminary studies into how cellular metabolism affects virulence.

  2. Towards a complete map of the human long non-coding RNA transcriptome.

    PubMed

    Uszczynska-Ratajczak, Barbara; Lagarde, Julien; Frankish, Adam; Guigó, Roderic; Johnson, Rory

    2018-05-23

    Gene maps, or annotations, enable us to navigate the functional landscape of our genome. They are a resource upon which virtually all studies depend, from single-gene to genome-wide scales and from basic molecular biology to medical genetics. Yet present-day annotations suffer from trade-offs between quality and size, with serious but often unappreciated consequences for downstream studies. This is particularly true for long non-coding RNAs (lncRNAs), which are poorly characterized compared to protein-coding genes. Long-read sequencing technologies promise to improve current annotations, paving the way towards a complete annotation of lncRNAs expressed throughout a human lifetime.

  3. Subspecies diversity in bacteriocin production by intestinal Lactobacillus salivarius strains.

    PubMed

    O'Shea, Eileen F; O'Connor, Paula M; Raftis, Emma J; O'Toole, Paul W; Stanton, Catherine; Cotter, Paul D; Ross, R Paul; Hill, Colin

    2012-01-01

    A recent comparative genomic hybridization study in our laboratory revealed considerable plasticity within the bacteriocin locus of gastrointestinal strains of Lactobacillus salivarius. Most notably, these analyses led to the identification of two novel unmodified bacteriocins, salivaricin L and salivaricin T, produced by the neonatal isolate L. salivarius DPC6488 with immunity, regulatory and export systems analogous to those of abp118, a two-component bacteriocin produced by the well characterized reference strain L. salivarius UCC118. In this addendum we discuss the intraspecific diversity of our seven bacteriocin-producing L. salivarius isolates on a genome-wide level, and more specifically, with respect to their salivaricin loci.

  4. The Carcinogenic Liver Fluke, Clonorchis sinensis: New Assembly, Reannotation and Analysis of the Genome and Characterization of Tissue Transcriptomes

    PubMed Central

    Wang, Xiaoyun; Liu, Hailiang; Chen, Yangyi; Guo, Lei; Luo, Fang; Sun, Jiufeng; Mao, Qiang; Liang, Pei; Xie, Zhizhi; Zhou, Chenhui; Tian, Yanli; Lv, Xiaoli; Huang, Lisi; Zhou, Juanjuan; Hu, Yue; Li, Ran; Zhang, Fan; Lei, Huali; Li, Wenfang; Hu, Xuchu; Liang, Chi; Xu, Jin; Li, Xuerong; Yu, Xinbing

    2013-01-01

    Clonorchis sinensis (C. sinensis), an important food-borne parasite that inhabits the intrahepatic bile duct and causes clonorchiasis, is of interest to both the public health field and the scientific research community. To learn more about the migration, parasitism and pathogenesis of C. sinensis at the molecular level, the present study developed an upgraded genomic assembly and annotation by sequencing paired-end and mate-paired libraries. We also performed transcriptome sequence analyses on multiple C. sinensis tissues (sucker, muscle, ovary and testis). Genes encoding molecules involved in responses to stimuli and muscle-related development were abundantly expressed in the oral sucker. Compared with other species, genes encoding molecules that facilitate the recognition and transport of cholesterol were observed in high copy numbers in the genome and were highly expressed in the oral sucker. Genes encoding transporters for fatty acids, glucose, amino acids and oxygen were also highly expressed, along with other molecules involved in metabolizing these substrates. All genes involved in energy metabolism pathways, including the β-oxidation of fatty acids, the citrate cycle, oxidative phosphorylation, and fumarate reduction, were expressed in the adults. Finally, we also provide valuable insights into the mechanism underlying the process of pathogenesis by characterizing the secretome of C. sinensis. The characterization and elaborate analysis of the upgraded genome and the tissue transcriptomes not only form a detailed and fundamental C. sinensis resource but also provide novel insights into the physiology and pathogenesis of C. sinensis. We anticipate that this work will aid the development of innovative strategies for the prevention and control of clonorchiasis. PMID:23382950

  5. HnRNP A3 genes and pseudogenes in the vertebrate genomes.

    PubMed

    Makeyev, Aleksandr V; Kim, Chang Bae; Ruddle, Frank H; Enkhmandakh, Badam; Erdenechimeg, Lkhamsuren; Bayarsaihan, Dashzeveg

    2005-04-01

    The hnRNP A/B type proteins are abundant nuclear factors that bind to Pol II transcripts and are involved in numerous RNA-related activities. To date most data on the hnRNP A/B family have been obtained with recombinant proteins and cell cultures. Further characterization can result from an examination of the impact of various modifications in intact functional loci; however, such characterization is hampered by the presence of numerous and widely dispersed hnRNP A/B-related sequences in the mammalian genome. We have found hnRNP A3, a poorly recognized member of the hnRNP A/B family, among candidate transcription factors that interact with the regulatory region of the Hoxc8 gene and screened the human and mouse genomes for genes that encode hnRNP A3. We demonstrate that the sequence reported previously as the human hnRNP A3 gene (Accession number S63912) and located on 10p11.1 belongs to a processed pseudogene of the functional intron-containing locus HNRPA3, which we have identified on 2q31.2. We have also identified its murine orthologs on mouse chromosome 2D and rat chromosome 3q23. Alternative splices were revealed at the N-terminus and in the middle of hnRNP A3. 14 and 28 additional loci in the human and mouse genome, respectively, were mapped and identified as hnRNP A3 processed pseudogenes. In addition, we have found and compared hnRNP A3 orthologous genes in Gallus gallus, Xenopus tropicalis, and Danio rerio. The present in silico analysis serves as a necessary step toward a further functional characterization of hnRNP A3. (c) 2005 Wiley-Liss, Inc.

  6. Genetic characterization of K13965, a strain of Oak Vale virus from Western Australia

    PubMed Central

    Quan, Phenix-Lan; Williams, David T.; Johansen, Cheryl A.; Jain, Komal; Petrosov, Alexandra; Diviney, Sinead M.; Tashmukhamedova, Alla; Hutchison, Stephen K.; Tesh, Robert B.; Mackenzie, John S.; Briese, Thomas; Lipkin, W. Ian

    2011-01-01

    K13965, an uncharacterized virus, was isolated in 1993 from Anopheles annulipes mosquitoes collected in the Kimberley region of northern Western Australia. Here, we report its genomic sequence, identify it as a rhabdovirus, and characterize its phylogenetic relationships. The genome comprises a P′ (C) and SH protein similar to the recently characterized Tupaia and Durham viruses, and shows overlap between G and L genes. Comparison of K13965 genome sequence to other rhabdoviruses identified K13965 as a strain of the unclassified Australian Oak Vale rhabdovirus, whose complete genome sequence we also determined. Phylogenetic analysis of N and L sequences indicated genetic relationship to a recently proposed Sandjima virus clade, although the Oak Vale virus sequences form a branch separate from the African members of that group. PMID:21740935

  7. Molecular Networking and Pattern-Based Genome Mining Improves Discovery of Biosynthetic Gene Clusters and their Products from Salinispora Species

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna

    Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. In this paper, we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. Finally, these efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.

  8. Novel Phage Group Infecting Lactobacillus delbrueckii subsp. lactis, as Revealed by Genomic and Proteomic Analysis of Bacteriophage Ldl1

    PubMed Central

    Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio

    2014-01-01

    Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 ± 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species. PMID:25501478

  9. Molecular Networking and Pattern-Based Genome Mining Improves Discovery of Biosynthetic Gene Clusters and their Products from Salinispora Species

    DOE PAGES

    Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; ...

    2015-04-09

    Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. In this paper, we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. Finally, these efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.

  10. A Hybrid Approach for the Automated Finishing of Bacterial Genomes

    PubMed Central

    Robins, William P.; Chin, Chen-Shan; Webster, Dale; Paxinos, Ellen; Hsu, David; Ashby, Meredith; Wang, Susana; Peluso, Paul; Sebra, Robert; Sorenson, Jon; Bullard, James; Yen, Jackie; Valdovino, Marie; Mollova, Emilia; Luong, Khai; Lin, Steven; LaMay, Brianna; Joshi, Amruta; Rowe, Lori; Frace, Michael; Tarr, Cheryl L.; Turnsek, Maryann; Davis, Brigid M; Kasarskis, Andrew; Mekalanos, John J.; Waldor, Matthew K.; Schadt, Eric E.

    2013-01-01

    Dramatic improvements in DNA sequencing technology have revolutionized our ability to characterize most genomic diversity. However, accurate resolution of large structural events has remained challenging due to the comparatively shorter read lengths of second-generation technologies. Emerging third-generation sequencing technologies, which yield markedly increased read length on rapid time scales and for low cost, have the potential to address assembly limitations. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically significant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 reference we obtain 14 and 8 scaffolds greater than 1 kb, respectively, correcting several errors in the underlying source data. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly. PMID:22750883

  11. Marker chromosome genomic structure and temporal origin implicate a chromoanasynthesis event in a family with pleiotropic psychiatric phenotypes.

    PubMed

    Grochowski, Christopher M; Gu, Shen; Yuan, Bo; Tcw, Julia; Brennand, Kristen J; Sebat, Jonathan; Malhotra, Dheeraj; McCarthy, Shane; Rudolph, Uwe; Lindstrand, Anna; Chong, Zechen; Levy, Deborah L; Lupski, James R; Carvalho, Claudia M B

    2018-04-25

    Small supernumerary marker chromosomes (sSMC) are chromosomal fragments that are difficult to characterize genomically. Here, we detail a proband with schizoaffective disorder and a mother with bipolar disorder with psychotic features who present with a marker chromosome that segregates with disease. We explored the architecture of this marker and investigated its temporal origin. Array comparative genomic hybridization (aCGH) analysis revealed three duplications and three triplications that spanned the short arm of chromosome 9, suggestive of a chromoanasynthesis-like event. Segregation of marker genotypes, phased using sSMC mosaicism in the mother, provided evidence that it was generated during a germline-level event in the proband's maternal grandmother. Whole-genome sequencing (WGS) was performed to resolve the structure and junctions of the chromosomal fragments, revealing further complexities. While structural variations have been previously associated with neuropsychiatric disorders and marker chromosomes, here we detail the precise architecture and human life-cycle genesis of the marker and propose a DNA replicative/repair mechanism underlying its formation. © 2018 Wiley Periodicals, Inc.

  12. Insights into Land Plant Evolution Garnered from the Marchantia polymorpha Genome.

    PubMed

    Bowman, John L; Kohchi, Takayuki; Yamato, Katsuyuki T; Jenkins, Jerry; Shu, Shengqiang; Ishizaki, Kimitsune; Yamaoka, Shohei; Nishihama, Ryuichi; Nakamura, Yasukazu; Berger, Frédéric; Adam, Catherine; Aki, Shiori Sugamata; Althoff, Felix; Araki, Takashi; Arteaga-Vazquez, Mario A; Balasubrmanian, Sureshkumar; Barry, Kerrie; Bauer, Diane; Boehm, Christian R; Briginshaw, Liam; Caballero-Perez, Juan; Catarino, Bruno; Chen, Feng; Chiyoda, Shota; Chovatia, Mansi; Davies, Kevin M; Delmans, Mihails; Demura, Taku; Dierschke, Tom; Dolan, Liam; Dorantes-Acosta, Ana E; Eklund, D Magnus; Florent, Stevie N; Flores-Sandoval, Eduardo; Fujiyama, Asao; Fukuzawa, Hideya; Galik, Bence; Grimanelli, Daniel; Grimwood, Jane; Grossniklaus, Ueli; Hamada, Takahiro; Haseloff, Jim; Hetherington, Alexander J; Higo, Asuka; Hirakawa, Yuki; Hundley, Hope N; Ikeda, Yoko; Inoue, Keisuke; Inoue, Shin-Ichiro; Ishida, Sakiko; Jia, Qidong; Kakita, Mitsuru; Kanazawa, Takehiko; Kawai, Yosuke; Kawashima, Tomokazu; Kennedy, Megan; Kinose, Keita; Kinoshita, Toshinori; Kohara, Yuji; Koide, Eri; Komatsu, Kenji; Kopischke, Sarah; Kubo, Minoru; Kyozuka, Junko; Lagercrantz, Ulf; Lin, Shih-Shun; Lindquist, Erika; Lipzen, Anna M; Lu, Chia-Wei; De Luna, Efraín; Martienssen, Robert A; Minamino, Naoki; Mizutani, Masaharu; Mizutani, Miya; Mochizuki, Nobuyoshi; Monte, Isabel; Mosher, Rebecca; Nagasaki, Hideki; Nakagami, Hirofumi; Naramoto, Satoshi; Nishitani, Kazuhiko; Ohtani, Misato; Okamoto, Takashi; Okumura, Masaki; Phillips, Jeremy; Pollak, Bernardo; Reinders, Anke; Rövekamp, Moritz; Sano, Ryosuke; Sawa, Shinichiro; Schmid, Marc W; Shirakawa, Makoto; Solano, Roberto; Spunde, Alexander; Suetsugu, Noriyuki; Sugano, Sumio; Sugiyama, Akifumi; Sun, Rui; Suzuki, Yutaka; Takenaka, Mizuki; Takezawa, Daisuke; Tomogane, Hirokazu; Tsuzuki, Masayuki; Ueda, Takashi; Umeda, Masaaki; Ward, John M; Watanabe, Yuichiro; Yazaki, Kazufumi; Yokoyama, Ryusuke; Yoshitake, Yoshihiro; Yotsui, Izumi; Zachgo, Sabine; Schmutz, Jeremy

    2017-10-05

    The evolution of land flora transformed the terrestrial environment. Land plants evolved from an ancestral charophycean alga from which they inherited developmental, biochemical, and cell biological attributes. Additional biochemical and physiological adaptations to land, and a life cycle with an alternation between multicellular haploid and diploid generations that facilitated efficient dispersal of desiccation tolerant spores, evolved in the ancestral land plant. We analyzed the genome of the liverwort Marchantia polymorpha, a member of a basal land plant lineage. Relative to charophycean algae, land plant genomes are characterized by genes encoding novel biochemical pathways, new phytohormone signaling pathways (notably auxin), expanded repertoires of signaling pathways, and increased diversity in some transcription factor families. Compared with other sequenced land plants, M. polymorpha exhibits low genetic redundancy in most regulatory pathways, with this portion of its genome resembling that predicted for the ancestral land plant. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  13. Molecular Networking and Pattern-Based Genome Mining Improves Discovery of Biosynthetic Gene Clusters and Their Products from Salinispora Species

    PubMed Central

    Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; Sarkar, Anindita; Li, Jie; Ziemert, Nadine; Wang, Mingxun; Bandeira, Nuno; Moore, Bradley S.; Dorrestein, Pieter C.; Jensen, Paul R.

    2015-01-01

    Summary: Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. Here we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. These efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches. PMID:25865308

  14. Isolation and characterization of a bacteriophage phiEap-2 infecting multidrug resistant Enterobacter aerogenes

    PubMed Central

    Li, Erna; Wei, Xiao; Ma, Yanyan; Yin, Zhe; Li, Huan; Lin, Weishi; Wang, Xuesong; Li, Chao; Shen, Zhiqiang; Zhao, Ruixiang; Yang, Huiying; Jiang, Aimin; Yang, Wenhui; Yuan, Jing; Zhao, Xiangna

    2016-01-01

    Enterobacter aerogenes (Enterobacteriaceae) is an important opportunistic pathogen that causes hospital-acquired pneumonia, bacteremia, and urinary tract infections. Recently, multidrug-resistant E. aerogenes have been a public health problem. To develop an effective antimicrobial agent, bacteriophage phiEap-2 was isolated from sewage and its genome was sequenced because of its ability to lyse the multidrug-resistant clinical E. aerogenes strain 3-SP. Morphological observations suggested that the phage belongs to the Siphoviridae family. Comparative genome analysis revealed that phage phiEap-2 is related to the Salmonella phage FSL SP-031 (KC139518). All of the structural gene products (except capsid protein) encoded by phiEap-2 had orthologous gene products in FSL SP-031 and Serratia phage Eta (KC460990). Here, we report the complete genome sequence of phiEap-2 and major findings from the genomic analysis. Knowledge of this phage might be helpful for developing therapeutic strategies against E. aerogenes. PMID:27320081

  15. Isolation and characterization of a bacteriophage phiEap-2 infecting multidrug resistant Enterobacter aerogenes.

    PubMed

    Li, Erna; Wei, Xiao; Ma, Yanyan; Yin, Zhe; Li, Huan; Lin, Weishi; Wang, Xuesong; Li, Chao; Shen, Zhiqiang; Zhao, Ruixiang; Yang, Huiying; Jiang, Aimin; Yang, Wenhui; Yuan, Jing; Zhao, Xiangna

    2016-06-20

    Enterobacter aerogenes (Enterobacteriaceae) is an important opportunistic pathogen that causes hospital-acquired pneumonia, bacteremia, and urinary tract infections. Recently, multidrug-resistant E. aerogenes have been a public health problem. To develop an effective antimicrobial agent, bacteriophage phiEap-2 was isolated from sewage and its genome was sequenced because of its ability to lyse the multidrug-resistant clinical E. aerogenes strain 3-SP. Morphological observations suggested that the phage belongs to the Siphoviridae family. Comparative genome analysis revealed that phage phiEap-2 is related to the Salmonella phage FSL SP-031 (KC139518). All of the structural gene products (except capsid protein) encoded by phiEap-2 had orthologous gene products in FSL SP-031 and Serratia phage Eta (KC460990). Here, we report the complete genome sequence of phiEap-2 and major findings from the genomic analysis. Knowledge of this phage might be helpful for developing therapeutic strategies against E. aerogenes.

  16. Germline Mutations and Polymorphisms in the Origins of Cancers in Women

    PubMed Central

    Hirshfield, Kim M.; Rebbeck, Timothy R.; Levine, Arnold J.

    2010-01-01

    Several female malignancies including breast, ovarian, and endometrial cancers can be characterized based on known somatic and germline mutations. Initiation and propagation of tumors reflect underlying genomic alterations such as mutations, polymorphisms, and copy number variations found in genes of multiple cellular pathways. The contributions of any single genetic variation or mutation in a population depend on its frequency and penetrance as well as tissue-specific functionality. Genome wide association studies, fluorescence in situ hybridization, comparative genomic hybridization, and candidate gene studies have enumerated genetic contributors to cancers in women. These include p53, BRCA1, BRCA2, STK11, PTEN, CHEK2, ATM, BRIP1, PALB2, FGFR2, TGFB1, MDM2, MDM4 as well as several other chromosomal loci. Based on the heterogeneity within a specific tumor type, a combination of genomic alterations defines the cancer subtype, biologic behavior, and in some cases, response to therapeutics. Consideration of tumor heterogeneity is therefore important in the critical analysis of gene associations in cancer. PMID:20111735

  17. Reconstructing genome evolution in historic samples of the Irish potato famine pathogen

    PubMed Central

    Martin, Michael D.; Cappellini, Enrico; Samaniego, Jose A.; Zepeda, M. Lisandra; Campos, Paula F.; Seguin-Orlando, Andaine; Wales, Nathan; Orlando, Ludovic; Ho, Simon Y. W.; Dietrich, Fred S.; Mieczkowski, Piotr A.; Heitman, Joseph; Willerslev, Eske; Krogh, Anders; Ristaino, Jean B.; Gilbert, M. Thomas P.

    2013-01-01

    Responsible for the Irish potato famine of 1845–49, the oomycete pathogen Phytophthora infestans caused persistent, devastating outbreaks of potato late blight across Europe in the 19th century. Despite continued interest in the history and spread of the pathogen, the genome of the famine-era strain remains entirely unknown. Here we characterize temporal genomic changes in introduced P. infestans. We shotgun sequence five 19th-century European strains from archival herbarium samples—including the oldest known European specimen, collected in 1845 from the first reported source of introduction. We then compare their genomes to those of extant isolates. We report multiple distinct genotypes in historical Europe and a suite of infection-related genes different from modern strains. At virulence-related loci, several now-ubiquitous genotypes were absent from the historical gene pool. At least one of these genotypes encodes a virulent phenotype in modern strains, which helps explain the 20th century’s episodic replacements of European P. infestans lineages. PMID:23863894

  18. Identification and characterization of dinucleotide repeat (CA)n markers for genetic mapping in dog

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ostrander, E.A.; Sprague, G.F. Jr.; Rine, J.

    1993-04-01

    A large block of simple sequence repeat (SSR) polymorphisms for the dog genome has been isolated and characterized. Screening of primary libraries by conventional hybridization methods as well as by screening of enriched marker-selected libraries led to the isolation of a large number of genomic clones that contained (CA)n repeats. The sequences of 101 clones showed that the size and complexity of (CA)n repeats in the dog genome were similar to those reported for these markers in the human genome. Detailed analysis of a representative subset of these markers revealed that most markers were moderately to highly polymorphic, with PIC values exceeding 0.70 for 33% of the markers tested. An association between higher PIC values and markers containing longer (CA)n repeats was observed in these studies, as previously noted for similar markers in the human genome. A list of primer sequences that tag each characterized marker is provided, and a comprehensive system of nomenclature for the dog genome is suggested. 28 refs., 4 figs., 2 tabs.
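
    The PIC (polymorphism information content) values quoted in this record are commonly computed from allele frequencies with the formula of Botstein et al. (1980): PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The record does not say which estimator was used, so the following is only a minimal sketch under that assumption; the function names are illustrative and do not come from the cited study.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return the relative frequency of each distinct allele in a sample."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def pic(freqs):
    """Polymorphism information content (Botstein et al. 1980 form).

    Assumes `freqs` are allele frequencies at one marker that sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term for matings that are uninformative despite heterozygosity.
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical sample of allele calls (repeat lengths) at a single marker.
    sample = [12, 12, 14, 14, 16, 16, 18, 18]
    print(round(pic(allele_frequencies(sample)), 4))
```

    Under this formula a marker needs at least four roughly equally frequent alleles to pass the 0.70 threshold mentioned above; a two-allele marker cannot exceed 0.375.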

  19. Molecular characterization of colorectal cancer patients and concomitant patient-derived tumor cell establishment

    PubMed Central

    Kim, Seung Tae; Kim, Sun Young; Kim, Nayoung K.D.; Jang, Jiryeon; Kang, Mihyun; Jang, Hyojin; Ahn, Soomin; Kim, Seok Hyeong; Park, Yoona; Cho, Yong Beom; Heo, Jeong Wook; Lee, Woo Yong; Park, Joon Oh; Lim, Ho Yeong; Kang, Won Ki; Park, Young Suk; Park, Woong-Yang; Lee, Jeeyun; Kim, Hee Cheol

    2016-01-01

    Background: We aimed to establish a prospectively enrolled colorectal cancer (CRC) cohort for targeted sequencing of primary tumors from CRC patients. In parallel, we established collateral patient-derived tumor cell (PDC) models from the matched primary tumor tissues, which may later be used as preclinical models for genome-directed targeted therapy experiments. Results: In all, we identified 27 SNVs in 6 genes: PIK3CA (N = 16), BRAF (N = 6), NRAS (N = 2), CTNNB1 (N = 1), PTEN (N = 1), and ERBB2 (N = 1). A RET-NCOA4 translocation was observed in one of 105 patients (0.9%). PDC models were successfully established from 62 (55.4%) of the 112 samples. To confirm the genomic features of various tumor cells, we compared variant allele frequency results of the primary tumor and progeny PDCs. The Pearson correlation coefficient between the variants from primary tumor cells and PDCs was 0.881. Methods: Between April 2014 and June 2015, 112 patients with CRC who underwent resection of the primary tumor were enrolled in the SMC Oncology Biomarker study. The PDC culture protocol was performed for all eligible patients. All of the primary tumors from the 112 patients who provided written informed consent were genomically sequenced with targeted sequencing. In parallel, PDC establishment was attempted for all sequenced tumors. Conclusions: We have prospectively sequenced a CRC cohort of 105 patients and successfully established 62 PDCs in parallel. Each genomically characterized PDC can be used as a preclinical model, especially for rare genomic alteration events. PMID:26909603
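
    The comparison of variant allele frequencies (VAFs) between primary tumors and their PDCs rests on a Pearson correlation coefficient. As a rough illustration of that statistic only, here is a small sketch; the paired VAF values are invented for the example and are not data from the study, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented VAFs for the same variants in a primary tumor and its PDC.
    primary_vaf = [0.12, 0.35, 0.48, 0.22, 0.51]
    pdc_vaf = [0.15, 0.40, 0.55, 0.18, 0.60]
    print(round(pearson_r(primary_vaf, pdc_vaf), 3))
```

    A value near 1 would indicate that variants keep similar allele fractions after culture; the statistic says nothing about variants that are present in only one of the two samples unless they are entered with a VAF of 0.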

  20. Genomics of Bacterial and Archaeal Viruses: Dynamics within the Prokaryotic Virosphere

    PubMed Central

    Krupovic, Mart; Prangishvili, David; Hendrix, Roger W.; Bamford, Dennis H.

    2011-01-01

    Summary: Prokaryotes, bacteria and archaea, are the most abundant cellular organisms among those sharing the planet Earth with human beings (among others). However, numerous ecological studies have revealed that it is actually prokaryotic viruses that predominate on our planet and outnumber their hosts by at least an order of magnitude. An understanding of how this viral domain is organized and what are the mechanisms governing its evolution is therefore of great interest and importance. The vast majority of characterized prokaryotic viruses belong to the order Caudovirales, double-stranded DNA (dsDNA) bacteriophages with tails. Consequently, these viruses have been studied (and reviewed) extensively from both genomic and functional perspectives. However, albeit numerous, tailed phages represent only a minor fraction of the prokaryotic virus diversity. Therefore, the knowledge which has been generated for this viral system does not offer a comprehensive view of the prokaryotic virosphere. In this review, we discuss all families of bacterial and archaeal viruses that contain more than one characterized member and for which evolutionary conclusions can be attempted by use of comparative genomic analysis. We focus on the molecular mechanisms of their genome evolution as well as on the relationships between different viral groups and plasmids. It becomes clear that evolutionary mechanisms shaping the genomes of prokaryotic viruses vary between different families and depend on the type of the nucleic acid, characteristics of the virion structure, as well as the mode of the life cycle. We also point out that horizontal gene transfer is not equally prevalent in different virus families and is not uniformly unrestricted for diverse viral functions. PMID:22126996

  1. Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple

    PubMed Central

    2012-01-01

    Background: Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results: A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions: This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding community in marker-assisted breeding, and for performing comparative genomic studies in Rosaceae. PMID:23039990
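
    A survey of perfect microsatellites like the one summarized above can be approximated with a regular-expression scan for uninterrupted tandem repeats, followed by a density calculation in SSRs per Mb. The sketch below is only an illustration: the minimum repeat count, the function names, and the toy sequence are assumptions, and the record does not state which SSR-finding tool or thresholds the authors used.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. not 'AA' or 'ATAT')."""
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif == motif[:k] * (n // k):
            return False
    return True


def find_perfect_ssrs(seq, motif_len=2, min_repeats=6):
    """Find perfect (uninterrupted) tandem repeats of a given motif length.

    Returns a list of (start, motif, repeat_count) tuples with 0-based starts.
    `min_repeats` is an assumed threshold, not one taken from the cited study.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if not is_primitive(motif):
            continue  # skip homopolymers and motifs built from shorter repeats
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


def ssr_density_per_mb(n_ssrs, assembly_bp):
    """Number of SSRs per megabase of assembled sequence."""
    return n_ssrs / (assembly_bp / 1_000_000)


if __name__ == "__main__":
    toy = "GG" + "AT" * 6 + "CC"  # toy sequence, not real apple data
    print(find_perfect_ssrs(toy))
    print(ssr_density_per_mb(100, 2_000_000))
```

    Running the scan once per motif length (2 for di-nucleotides, 3 for tri-nucleotides, and so on) and tallying motifs such as AT/TA or AG/GA would give the kind of class frequencies reported in the abstract.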

  2. Comparative genome and transcriptome analyses of the social amoeba Acytostelium subglobosum that accomplishes multicellular development without germ-soma differentiation.

    PubMed

    Urushihara, Hideko; Kuwayama, Hidekazu; Fukuhara, Kensuke; Itoh, Takehiko; Kagoshima, Hiroshi; Shin-I, Tadasu; Toyoda, Atsushi; Ohishi, Kazuyo; Taniguchi, Tateaki; Noguchi, Hideki; Kuroki, Yoko; Hata, Takashi; Uchi, Kyoko; Mohri, Kurato; King, Jason S; Insall, Robert H; Kohara, Yuji; Fujiyama, Asao

    2015-02-14

    Social amoebae are lower eukaryotes that inhabit the soil. They are characterized by the construction of a starvation-induced multicellular fruiting body with a spore ball and supportive stalk. In most species, the stalk is filled with motile stalk cells, as represented by the model organism Dictyostelium discoideum, whose developmental mechanisms have been well characterized. However, in the genus Acytostelium, the stalk is acellular and all aggregated cells become spores. Phylogenetic analyses have shown that it is not an ancestral genus but has lost the ability to undergo cell differentiation. We performed genome and transcriptome analyses of Acytostelium subglobosum and compared our findings to other available dictyostelid genome data. Although A. subglobosum adopts a qualitatively different developmental program from other dictyostelids, its gene repertoire was largely conserved. Yet, families of polyketide synthase and extracellular matrix proteins have not expanded and a serine protease and ABC transporter B family gene, tagA, and a few other developmental genes are missing in the A. subglobosum lineage. Temporal gene expression patterns are astonishingly dissimilar from those of D. discoideum, and only a limited fraction of the ortholog pairs shared the same expression patterns, so that some signaling cascades for development seem to be disabled in A. subglobosum. The absence of the ability to undergo cell differentiation in Acytostelium is accompanied by a small change in coding potential and extensive alterations in gene expression patterns.

  3. Intrastrand triplex DNA repeats in bacteria: a source of genomic instability

    PubMed Central

    Holder, Isabelle T.; Wagner, Stefanie; Xiong, Peiwen; Sinn, Malte; Frickey, Tancred; Meyer, Axel; Hartig, Jörg S.

    2015-01-01

    Repetitive nucleic acid sequences are often prone to form secondary structures distinct from B-DNA. Prominent examples of such structures are DNA triplexes. We observed that certain intrastrand triplex motifs are highly conserved and abundant in prokaryotic genomes. A systematic search of 5246 different prokaryotic plasmids and genomes for intrastrand triplex motifs was conducted and the results summarized in the ITxF database available online at http://bioinformatics.uni-konstanz.de/utils/ITxF/. Next we investigated biophysical and biochemical properties of a particular G/C-rich triplex motif (TM) that occurs in many copies in more than 260 bacterial genomes by CD and nuclear magnetic resonance spectroscopy as well as in vivo footprinting techniques. A characterization of putative properties and functions of these unusually frequent nucleic acid motifs demonstrated that the occurrence of the TM is associated with a high degree of genomic instability. TM-containing genomic loci are significantly more rearranged among closely related Escherichia coli strains compared to control sites. In addition, we found very high frequencies of TM motifs in certain Enterobacteria and Cyanobacteria that were previously described as genetically highly diverse. In conclusion we link intrastrand triplex motifs with the induction of genomic instability. We speculate that the observed instability might be an adaptive feature of these genomes that creates variation for natural selection to act upon. PMID:26450966

  4. Genome Wide Characterization of Short Tandem Repeat Markers in Sweet Orange (Citrus sinensis)

    PubMed Central

    Biswas, Manosh Kumar; Xu, Qiang; Mayer, Christoph; Deng, Xiuxin

    2014-01-01

    Sweet orange (Citrus sinensis) is one of the major cultivated and most-consumed citrus species. With the goal of enhancing the genomic resources in citrus, we surveyed, developed and characterized microsatellite markers in the ≈347 Mb sequence assembly of the sweet orange genome. A total of 50,846 SSRs were identified with a frequency of 146.4 SSRs/Mbp. Dinucleotide repeats are the most frequent repeat class and the highest density of SSRs was found in chromosome 4. SSRs are non-randomly distributed in the genome and most of the SSRs (62.02%) are located in the intergenic regions. We found that AT-rich SSRs are more frequent than GC-rich SSRs. A total number of 21,248 SSR primers were successfully developed, which represents 89 SSR markers per Mb of the genome. A subset of 950 developed SSR primer pairs were synthesized and tested by wet lab experiments on a set of 16 citrus accessions. In total we identified 534 (56.21%) polymorphic SSR markers that will be useful in citrus improvement. The number of amplified alleles ranges from 2 to 12 with an average of 4 alleles per marker and an average PIC value of 0.75. The newly developed sweet orange primer sequences, their in silico PCR products, exact position in the genome assembly and putative function are made publicly available. We present the largest number of SSR markers ever developed for a citrus species. Almost two thirds of the markers are transferable to 16 citrus relatives and may be used for constructing a high density linkage map. In addition, they are valuable for marker-assisted selection studies, population structure analyses and comparative genomic studies of C. sinensis with other citrus related species. Altogether, these markers provide a significant contribution to the citrus research community. PMID:25148383

  5. Genome wide characterization of short tandem repeat markers in sweet orange (Citrus sinensis).

    PubMed

    Biswas, Manosh Kumar; Xu, Qiang; Mayer, Christoph; Deng, Xiuxin

    2014-01-01

    Sweet orange (Citrus sinensis) is one of the major cultivated and most-consumed citrus species. With the goal of enhancing the genomic resources in citrus, we surveyed, developed and characterized microsatellite markers in the ≈347 Mb sequence assembly of the sweet orange genome. A total of 50,846 SSRs were identified with a frequency of 146.4 SSRs/Mbp. Dinucleotide repeats are the most frequent repeat class and the highest density of SSRs was found in chromosome 4. SSRs are non-randomly distributed in the genome and most of the SSRs (62.02%) are located in the intergenic regions. We found that AT-rich SSRs are more frequent than GC-rich SSRs. A total number of 21,248 SSR primers were successfully developed, which represents 89 SSR markers per Mb of the genome. A subset of 950 developed SSR primer pairs were synthesized and tested by wet lab experiments on a set of 16 citrus accessions. In total we identified 534 (56.21%) polymorphic SSR markers that will be useful in citrus improvement. The number of amplified alleles ranges from 2 to 12 with an average of 4 alleles per marker and an average PIC value of 0.75. The newly developed sweet orange primer sequences, their in silico PCR products, exact position in the genome assembly and putative function are made publicly available. We present the largest number of SSR markers ever developed for a citrus species. Almost two thirds of the markers are transferable to 16 citrus relatives and may be used for constructing a high density linkage map. In addition, they are valuable for marker-assisted selection studies, population structure analyses and comparative genomic studies of C. sinensis with other citrus related species. Altogether, these markers provide a significant contribution to the citrus research community.

  6. Comparative Genomics of the Ectomycorrhizal Sister Species Rhizopogon vinicolor and Rhizopogon vesiculosus (Basidiomycota: Boletales) Reveals a Divergence of the Mating Type B Locus

    PubMed Central

    Mujic, Alija Bajro; Kuo, Alan; Tritt, Andrew; Lipzen, Anna; Chen, Cindy; Johnson, Jenifer; Sharma, Aditi; Barry, Kerrie; Grigoriev, Igor V.; Spatafora, Joseph W.

    2017-01-01

    Divergence of breeding system plays an important role in fungal speciation. Ectomycorrhizal fungi, however, pose a challenge for the study of reproductive biology because most cannot be mated under laboratory conditions. To overcome this barrier, we sequenced the draft genomes of the ectomycorrhizal sister species Rhizopogon vinicolor Smith and Zeller and R. vesiculosus Smith and Zeller (Basidiomycota, Boletales)—the first genomes available for Basidiomycota truffles—and characterized gene content and organization surrounding their mating type loci. Both species possess a pair of homeodomain transcription factor homologs at the mating type A-locus as well as pheromone receptor and pheromone precursor homologs at the mating type B-locus. Comparison of Rhizopogon genomes with genomes from Boletales, Agaricales, and Polyporales revealed synteny of the A-locus region within Boletales, but several genomic rearrangements across orders. Our findings suggest correlation between gene content at the B-locus region and breeding system in Boletales with tetrapolar species possessing more diverse gene content than bipolar species. Rhizopogon vinicolor possesses a greater number of B-locus pheromone receptor and precursor genes than R. vesiculosus, as well as a pair of isoprenyl cysteine methyltransferase genes flanking the B-locus compared to a single copy in R. vesiculosus. Examination of dikaryotic single nucleotide polymorphisms within genomes revealed greater heterozygosity in R. vinicolor, consistent with increased rates of outcrossing. Both species possess the components of a heterothallic breeding system with R. vinicolor possessing a B-locus region structure consistent with tetrapolar Boletales and R. vesiculosus possessing a B-locus region structure intermediate between bipolar and tetrapolar Boletales. PMID:28450370

  7. Comparative genome analysis of a large Dutch Legionella pneumophila strain collection identifies five markers highly correlated with clinical strains

    PubMed Central

    2010-01-01

    Background: Discrimination between clinical and environmental strains within many bacterial species is currently underexplored. Genomic analyses have clearly shown the enormous variability in genome composition between different strains of a bacterial species. In this study we have used Legionella pneumophila, the causative agent of Legionnaires' disease, to search for genomic markers related to pathogenicity. During a large surveillance study in The Netherlands, well-characterized patient-derived strains and environmental strains were collected. We have used a mixed-genome microarray to perform comparative-genome analysis of 257 strains from this collection. Results: Microarray analysis indicated that 480 DNA markers (out of 3360 markers in total) showed clear variation in presence between individual strains and these were therefore selected for further analysis. Unsupervised statistical analysis of these markers showed the enormous genomic variation within the species but did not show any correlation with a pathogenic phenotype. We therefore used supervised statistical analysis to identify discriminating markers. Genetic programming was used both to identify predictive markers and to define their interrelationships. A model consisting of five markers was developed that together correctly predicted 100% of the clinical strains and 69% of the environmental strains. Conclusions: A novel approach for identifying predictive markers enabling discrimination between clinical and environmental isolates of L. pneumophila is presented. Out of over 3000 possible markers, five were selected that together enabled correct prediction of all the clinical strains included in this study. This novel approach for identifying predictive markers can be applied to all bacterial species, allowing for better discrimination between strains well equipped to cause human disease and relatively harmless strains. PMID:20630115

  8. Shifts in the evolutionary rate and intensity of purifying selection between two Brassica genomes revealed by analyses of orthologous transposons and relics of a whole genome triplication.

    PubMed

    Zhao, Meixia; Du, Jianchang; Lin, Feng; Tong, Chaobo; Yu, Jingyin; Huang, Shunmou; Wang, Xiaowu; Liu, Shengyi; Ma, Jianxin

    2013-10-01

    Recent sequencing of the Brassica rapa and Brassica oleracea genomes revealed extremely contrasting genomic features such as the abundance and distribution of transposable elements between the two genomes. However, whether and how these structural differentiations may have influenced the evolutionary rates of the two genomes since their split from a common ancestor are unknown. Here, we investigated and compared the rates of nucleotide substitution between two long terminal repeats (LTRs) of individual orthologous LTR-retrotransposons, the rates of synonymous and non-synonymous substitution among triplicated genes retained in both genomes from a shared whole genome triplication event, and the rates of genetic recombination estimated/deduced by the comparison of physical and genetic distances along chromosomes and ratios of solo LTRs to intact elements. Overall, LTR sequences and genic sequences showed more rapid nucleotide substitution in B. rapa than in B. oleracea. Synonymous substitution of triplicated genes retained from a shared whole genome triplication was detected at higher rates in B. rapa than in B. oleracea. Interestingly, non-synonymous substitution was observed at lower rates in the former than in the latter, indicating shifted intensities of purifying selection between the two genomes. In addition to evolutionary asymmetry, orthologous genes differentially regulated and/or disrupted by transposable elements between the two genomes were also characterized. Our analyses suggest that local genomic and epigenomic features, such as recombination rates and chromatin dynamics reshaped by independent proliferation of transposable elements and elimination between the two genomes, are perhaps partially the causes and partially the outcomes of the observed inter-specific asymmetric evolution. © 2013 Purdue University The Plant Journal © 2013 John Wiley & Sons Ltd.

  9. Comparative molecular cytogenetic characterization of seven Deschampsia (Poaceae) species

    PubMed Central

    Bolsheva, Nadezhda L.; Zoshchuk, Svyatoslav A.; Twardovska, Maryana O.; Yurkevich, Olga Yu; Andreev, Igor O.; Samatadze, Tatiana E.; Badaeva, Ekaterina D.; Kunakh, Viktor A.; Muravenko, Olga V.

    2017-01-01

    The genus Deschampsia P. Beauv (Poaceae) comprises a group of widespread polymorphic species. Some of them are highly tolerant to stressful and variable environmental conditions, and D. antarctica is one of only two vascular plants growing in Antarctica. This species is a source of traits useful for selection and a valuable model for studying environmental stress tolerance in plants. Genome diversity and comparative chromosomal phylogeny within the genus have not been studied yet, as karyotypes of most Deschampsia species are poorly investigated. We conducted the first comparative molecular cytogenetic analysis of D. antarctica (Antarctic Peninsula) and related species from various localities (D. cespitosa, D. danthonioides, D. elongata, D. flexuosa (= Avenella flexuosa), D. parvula and D. sukatschewii) by fluorescence in situ hybridization with 45S and 5S rDNA, DAPI-banding and sequential rapid in situ hybridization with genomic DNA of D. antarctica, D. cespitosa, and D. flexuosa. Based on patterns of distribution of the examined markers, chromosomes of the studied species were identified. Within these species, common features as well as species peculiarities in their karyotypic structure and chromosomal distribution of molecular cytogenetic markers were characterized. Different chromosomal rearrangements were detected in D. antarctica, D. flexuosa, D. elongata and D. sukatschewii. In karyotypes of D. antarctica, D. cespitosa, D. elongata and D. sukatschewii, 0–3 B chromosomes possessing distinct DAPI bands were observed. Our findings suggest that the genome evolution of the genus Deschampsia involved polyploidy and also different chromosomal rearrangements. The obtained results will help clarify the relationships within the genus Deschampsia, and can be a basis for further genetic and biotechnological studies as well as for selection of plants tolerant to extreme habitats. PMID:28407010

  10. Comparative Genomic and Phenotypic Characterization of Pathogenic and Non-Pathogenic Strains of Xanthomonas arboricola Reveals Insights into the Infection Process of Bacterial Spot Disease of Stone Fruits

    PubMed Central

    Garita-Cambronero, Jerson; Palacio-Bielsa, Ana; López, María M.

    2016-01-01

    Xanthomonas arboricola pv. pruni is the causal agent of bacterial spot disease of stone fruits, a quarantinable pathogen in several areas worldwide, including the European Union. In order to develop efficient control methods for this disease, it is necessary to improve the understanding of the key determinants associated with host restriction, colonization and the development of pathogenesis. After an initial characterization, by multilocus sequence analysis, of 15 strains of X. arboricola isolated from Prunus, one strain did not group into the pathovar pruni or into other pathovars of this species and therefore it was identified and defined as an X. arboricola pv. pruni look-alike. This non-pathogenic strain and two typical strains of X. arboricola pv. pruni were selected for a whole genome and phenotype comparative analysis in features associated with the pathogenesis process in Xanthomonas. Comparative analysis among these bacterial strains isolated from Prunus spp. and the inclusion of 15 publicly available genome sequences from other pathogenic and non-pathogenic strains of X. arboricola revealed variations in the phenotype associated with variations in the profiles of TonB-dependent transporters, sensors of the two-component regulatory system, methyl accepting chemotaxis proteins, components of the flagella and the type IV pilus, as well as in the repertoire of cell-wall degrading enzymes and the components of the type III secretion system and related effectors. These variations provide a global overview of those mechanisms that could be associated with the development of bacterial spot disease. Additionally, they point to some features that might influence the host specificity and the variable virulence observed in X. arboricola. PMID:27571391

  11. Characterization of a herpes virus isolated from domestic geese in Australia

    USGS Publications Warehouse

    Gough, R.E.; Hansen, W.R.

    2000-01-01

    A herpesvirus (GHV 552/89) associated with high mortality in a flock of domestic geese in Australia was compared with duck virus enteritis (DVE) herpesvirus by cross-protection studies in domestic geese, Muscovy ducks and commercial Pekin ducks. In DVE-vaccinated geese, Muscovy ducks and Pekin ducks, mortality levels of 100, 50 and 0%, respectively, were recorded following challenge with GHV 552/89. Conversely, in geese, Muscovy ducks and Pekin ducks immunized with inactivated GHV 552/89, 100% mortality was observed in the geese and Muscovy ducks, and 80% in the Pekin ducks following challenge with DVE virus. The isolate was also compared with six other avian herpesviruses using cross-neutralization tests in cell cultures. No detectable cross-neutralization occurred with any of the avian herpesviruses tested. Further characterization of GHV 552/89 was undertaken by comparing its genome with strains of DVE herpesvirus using restriction endonuclease analysis of the viral DNA and a polymerase chain reaction (PCR) test. Following digestion with HindIII, the DNA fragment pattern of GHV 552/89 was found to be completely different from the DVE viruses. Similarities were found between the digestion patterns of a UK and a US DVE isolate, but both were distinguishable from a UK vaccine strain. The results of the PCR analysis and comparison using two DVE-specific primer sets did not produce specific amplification products of expected molecular weights (603 and 446 base pairs) from the GHV 552/89 genome. The PCR products derived from the DVE strains were similar to those derived from the DVE control DNA. From the results of this study, it is concluded that the goose herpesvirus GHV 552/89 is antigenically and genomically distinct from DVE herpesvirus.

  12. Comparative molecular cytogenetic characterization of seven Deschampsia (Poaceae) species.

    PubMed

    Amosova, Alexandra V; Bolsheva, Nadezhda L; Zoshchuk, Svyatoslav A; Twardovska, Maryana O; Yurkevich, Olga Yu; Andreev, Igor O; Samatadze, Tatiana E; Badaeva, Ekaterina D; Kunakh, Viktor A; Muravenko, Olga V

    2017-01-01

    The genus Deschampsia P. Beauv (Poaceae) comprises a group of widespread polymorphic species. Some of them are highly tolerant to stressful and variable environmental conditions, and D. antarctica is one of only two vascular plants growing in Antarctica. This species is a source of traits useful for selection and a valuable model for studying environmental stress tolerance in plants. Genome diversity and comparative chromosomal phylogeny within the genus have not been studied yet, as karyotypes of most Deschampsia species are poorly investigated. We conducted the first comparative molecular cytogenetic analysis of D. antarctica (Antarctic Peninsula) and related species from various localities (D. cespitosa, D. danthonioides, D. elongata, D. flexuosa (= Avenella flexuosa), D. parvula and D. sukatschewii) by fluorescence in situ hybridization with 45S and 5S rDNA, DAPI-banding and sequential rapid in situ hybridization with genomic DNA of D. antarctica, D. cespitosa, and D. flexuosa. Based on patterns of distribution of the examined markers, chromosomes of the studied species were identified. Within these species, common features as well as species peculiarities in their karyotypic structure and chromosomal distribution of molecular cytogenetic markers were characterized. Different chromosomal rearrangements were detected in D. antarctica, D. flexuosa, D. elongata and D. sukatschewii. In karyotypes of D. antarctica, D. cespitosa, D. elongata and D. sukatschewii, 0-3 B chromosomes possessing distinct DAPI bands were observed. Our findings suggest that the genome evolution of the genus Deschampsia involved polyploidy and also different chromosomal rearrangements. The obtained results will help clarify the relationships within the genus Deschampsia, and can serve as a basis for further genetic and biotechnological studies as well as for selection of plants tolerant to extreme habitats.

  13. Characterization of genome-wide copy number aberrations in colonic mixed adenoneuroendocrine carcinoma and neuroendocrine carcinoma reveals recurrent amplification of PTGER4 and MYC genes.

    PubMed

    Sinha, Namita; Gaston, Daniel; Manders, Daniel; Goudie, Marissa; Matsuoka, Makoto; Xie, Tao; Huang, Weei-Yuarn

    2018-03-01

    Colonic mixed adenoneuroendocrine carcinoma (MANEC) is an aggressive neoplasm with worse prognosis compared with adenocarcinoma. To gain a better understanding of the molecular features of colonic MANEC, we characterized the genome-wide copy number aberrations of 14 MANECs and 5 neuroendocrine carcinomas using the OncoScan FFPE (Affymetrix, Santa Clara, CA) assay. Compared with 269 colonic adenocarcinomas, 19 of 42 chromosomal arms of MANEC exhibited a similar frequency of major aberrant events as adenocarcinomas, and 13 chromosomal arms exhibited a higher frequency of copy number gains. Among them, the most significant chromosomal arms were 5p (77% versus 13%, P = .000012) and 8q (85% versus 33%, P = .0018). The Genomic Identification of Significant Targets in Cancers algorithm identified 7 peaks that drive the tumorigenesis of MANEC. For all except 5p13.1, the peaks largely overlapped with those of adenocarcinoma. Two tumors exhibited MYC amplification localized in 8q24.21, and 2 tumors exhibited PTGER4 amplification localized in 5p13.1. A total of 8 tumors exhibited high copy number gain of PTGER4 and/or MYC. Whereas the frequency of MYC amplification was similar to adenocarcinoma (10.5% versus 4%, P = .2), the frequency of PTGER4 amplification was higher than adenocarcinoma (10.5% versus 0.3%, P = .01). Our study demonstrates similar, but also distinct, copy number aberrations in MANEC compared with adenocarcinoma and suggests an important role for the MYC pathway of colonic carcinoma with neuroendocrine differentiation. The discovery of recurrent PTGER4 amplification implies a potential of exploring targeted therapy to the prostaglandin synthesis pathways in a subset of these tumors. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Characterization of the Asian citrus psyllid transcriptome

    USDA-ARS's Scientific Manuscript database

    The Asian citrus psyllid (Diaphorina citri Kuwayama) and other psyllids are important agricultural pests that cause extensive economic damage by feeding and as vectors of plant pathogens. No psyllid genomes have been characterized, and little is known about the composition of psyllid genomes or the ...

  15. Genome sequencing of mucosal melanomas reveals that they are driven by distinct mechanisms from cutaneous melanoma.

    PubMed

    Furney, Simon J; Turajlic, Samra; Stamp, Gordon; Nohadani, Mahrokh; Carlisle, Anna; Thomas, J Meirion; Hayes, Andrew; Strauss, Dirk; Gore, Martin; van den Oord, Joost; Larkin, James; Marais, Richard

    2013-07-01

    Mucosal melanoma displays distinct clinical and epidemiological features compared to cutaneous melanoma. Here we used whole genome and whole exome sequencing to characterize the somatic alterations and mutation spectra in the genomes of ten mucosal melanomas. We observed somatic mutation rates that are considerably lower than occur in sun-exposed cutaneous melanoma, but comparable to the rates seen in cancers not associated with exposure to known mutagens. In particular, the mutation signatures are not indicative of ultraviolet light- or tobacco smoke-induced DNA damage. Genes previously reported as mutated in other cancers were also mutated in mucosal melanoma. Notably, there were substantially more copy number and structural variations in mucosal melanoma than have been reported in cutaneous melanoma. Thus, mucosal and cutaneous melanomas are distinct diseases with discrete genetic features. Our data suggest that different mechanisms underlie the genesis of these diseases and that structural variations play a more important role in mucosal than in cutaneous melanomagenesis. Copyright © 2013 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  16. Biotechnological Potential of Cold Adapted Pseudoalteromonas spp. Isolated from ‘Deep Sea’ Sponges

    PubMed Central

    Borchert, Erik; Knobloch, Stephen; Dwyer, Emilie; Flynn, Sinéad; Jackson, Stephen A.; Jóhannsson, Ragnar; Marteinsson, Viggó T.; O’Gara, Fergal; Dobson, Alan D. W.

    2017-01-01

    The marine genus Pseudoalteromonas is known for its versatile biotechnological potential with respect to the production of antimicrobials and enzymes of industrial interest. We have sequenced the genomes of three Pseudoalteromonas sp. strains isolated from different deep sea sponges on the Illumina MiSeq platform. The isolates have been screened for various industrially important enzymes and comparative genomics has been applied to investigate potential relationships between the isolates and their host organisms, while comparing them to free-living Pseudoalteromonas spp. from shallow and deep sea environments. The genomes of the sponge associated Pseudoalteromonas strains contained much lower levels of potential eukaryotic-like proteins which are known to be enriched in symbiotic sponge associated microorganisms, than might be expected for true sponge symbionts. While all the Pseudoalteromonas shared a large distinct subset of genes, nonetheless the number of unique and accessory genes is quite large and defines the pan-genome as open. Enzymatic screens indicate that a vast array of enzyme activities is expressed by the isolates, including β-galactosidase, β-glucosidase, and protease activities. A β-glucosidase gene from one of the Pseudoalteromonas isolates, strain EB27 was heterologously expressed in Escherichia coli and, following biochemical characterization, the recombinant enzyme was found to be cold-adapted, thermolabile, halotolerant, and alkaline active. PMID:28629190

  17. Genus-Wide Comparative Genomics of Malassezia Delineates Its Phylogeny, Physiology, and Niche Adaptation on Human Skin

    PubMed Central

    Wu, Guangxi; Zhao, He; Li, Chenhao; Rajapakse, Menaka Priyadarsani; Wong, Wing Cheong; Xu, Jun; Saunders, Charles W.; Reeder, Nancy L.; Reilman, Raymond A.; Scheynius, Annika; Sun, Sheng; Billmyre, Blake Robert; Li, Wenjun; Averette, Anna Floyd; Mieczkowski, Piotr; Heitman, Joseph; Theelen, Bart; Schröder, Markus S.; De Sessions, Paola Florez; Butler, Geraldine; Maurer-Stroh, Sebastian; Boekhout, Teun; Nagarajan, Niranjan; Dawson, Thomas L.

    2015-01-01

    Malassezia is a unique lipophilic genus in class Malasseziomycetes in Ustilaginomycotina, (Basidiomycota, fungi) that otherwise consists almost exclusively of plant pathogens. Malassezia are typically isolated from warm-blooded animals, are dominant members of the human skin mycobiome and are associated with common skin disorders. To characterize the genetic basis of the unique phenotypes of Malassezia spp., we sequenced the genomes of all 14 accepted species and used comparative genomics against a broad panel of fungal genomes to comprehensively identify distinct features that define the Malassezia gene repertoire: gene gain and loss; selection signatures; and lineage-specific gene family expansions. Our analysis revealed key gene gain events (64) with a single gene conserved across all Malassezia but absent in all other sequenced Basidiomycota. These likely horizontally transferred genes provide intriguing gain-of-function events and prime candidates to explain the emergence of Malassezia. A larger set of genes (741) were lost, with enrichment for glycosyl hydrolases and carbohydrate metabolism, concordant with adaptation to skin’s carbohydrate-deficient environment. Gene family analysis revealed extensive turnover and underlined the importance of secretory lipases, phospholipases, aspartyl proteases, and other peptidases. Combining genomic analysis with a re-evaluation of culture characteristics, we establish the likely lipid-dependence of all Malassezia. Our phylogenetic analysis sheds new light on the relationship between Malassezia and other members of Ustilaginomycotina, as well as phylogenetic lineages within the genus. Overall, our study provides a unique genomic resource for understanding Malassezia niche-specificity and potential virulence, as well as their abundance and distribution in the environment and on human skin. PMID:26539826

  18. Lifestyle Evolution in Cyanobacterial Symbionts of Sponges

    PubMed Central

    Burgsdorf, Ilia; Slaby, Beate M.; Handley, Kim M.; Haber, Markus; Blom, Jochen; Marshall, Christopher W.; Gilbert, Jack A.; Hentschel, Ute

    2015-01-01

    The “Candidatus Synechococcus spongiarum” group includes different clades of cyanobacteria with high 16S rRNA sequence identity (~99%) and is the most abundant and widespread cyanobacterial symbiont of marine sponges. The first draft genome of a “Ca. Synechococcus spongiarum” group member was recently published, providing evidence of genome reduction by loss of genes involved in several nonessential functions. However, “Ca. Synechococcus spongiarum” includes a variety of clades that may differ widely in genomic repertoire and consequently in physiology and symbiotic function. Here, we present three additional draft genomes of “Ca. Synechococcus spongiarum,” each from a different clade. By comparing all four symbiont genomes to those of free-living cyanobacteria, we revealed general adaptations to life inside sponges and specific adaptations of each phylotype. Symbiont genomes shared about half of their total number of coding genes. Common traits of “Ca. Synechococcus spongiarum” members were a high abundance of DNA modification and recombination genes and a reduction in genes involved in inorganic ion transport and metabolism, cell wall biogenesis, and signal transduction mechanisms. Moreover, these symbionts were characterized by a reduced number of antioxidant enzymes and low-weight peptides of photosystem II compared to their free-living relatives. Variability within the “Ca. Synechococcus spongiarum” group was mostly related to immune system features, potential for siderophore-mediated iron transport, and dependency on methionine from external sources. The common absence of genes involved in synthesis of residues, typical of the O antigen of free-living Synechococcus species, suggests a novel mechanism utilized by these symbionts to avoid sponge predation and phage attack. PMID:26037118

  19. Lifestyle Evolution in Cyanobacterial Symbionts of Sponges

    DOE PAGES

    Burgsdorf, Ilia; Slaby, Beate M.; Handley, Kim M.; ...

    2015-06-02

    The “Candidatus Synechococcus spongiarum” group includes different clades of cyanobacteria with high 16S rRNA sequence identity (~99%) and is the most abundant and widespread cyanobacterial symbiont of marine sponges. The first draft genome of a “Ca. Synechococcus spongiarum” group member was recently published, providing evidence of genome reduction by loss of genes involved in several nonessential functions. However, “Ca. Synechococcus spongiarum” includes a variety of clades that may differ widely in genomic repertoire and consequently in physiology and symbiotic function. Here, we present three additional draft genomes of “Ca. Synechococcus spongiarum,” each from a different clade. By comparing all four symbiont genomes to those of free-living cyanobacteria, we revealed general adaptations to life inside sponges and specific adaptations of each phylotype. Symbiont genomes shared about half of their total number of coding genes. Common traits of “Ca. Synechococcus spongiarum” members were a high abundance of DNA modification and recombination genes and a reduction in genes involved in inorganic ion transport and metabolism, cell wall biogenesis, and signal transduction mechanisms. Moreover, these symbionts were characterized by a reduced number of antioxidant enzymes and low-weight peptides of photosystem II compared to their free-living relatives. Variability within the “Ca. Synechococcus spongiarum” group was mostly related to immune system features, potential for siderophore-mediated iron transport, and dependency on methionine from external sources. The common absence of genes involved in synthesis of residues, typical of the O antigen of free-living Synechococcus species, suggests a novel mechanism utilized by these symbionts to avoid sponge predation and phage attack.

  20. Genus-Wide Comparative Genomics of Malassezia Delineates Its Phylogeny, Physiology, and Niche Adaptation on Human Skin.

    PubMed

    Wu, Guangxi; Zhao, He; Li, Chenhao; Rajapakse, Menaka Priyadarsani; Wong, Wing Cheong; Xu, Jun; Saunders, Charles W; Reeder, Nancy L; Reilman, Raymond A; Scheynius, Annika; Sun, Sheng; Billmyre, Blake Robert; Li, Wenjun; Averette, Anna Floyd; Mieczkowski, Piotr; Heitman, Joseph; Theelen, Bart; Schröder, Markus S; De Sessions, Paola Florez; Butler, Geraldine; Maurer-Stroh, Sebastian; Boekhout, Teun; Nagarajan, Niranjan; Dawson, Thomas L

    2015-11-01

    Malassezia is a unique lipophilic genus in class Malasseziomycetes in Ustilaginomycotina, (Basidiomycota, fungi) that otherwise consists almost exclusively of plant pathogens. Malassezia are typically isolated from warm-blooded animals, are dominant members of the human skin mycobiome and are associated with common skin disorders. To characterize the genetic basis of the unique phenotypes of Malassezia spp., we sequenced the genomes of all 14 accepted species and used comparative genomics against a broad panel of fungal genomes to comprehensively identify distinct features that define the Malassezia gene repertoire: gene gain and loss; selection signatures; and lineage-specific gene family expansions. Our analysis revealed key gene gain events (64) with a single gene conserved across all Malassezia but absent in all other sequenced Basidiomycota. These likely horizontally transferred genes provide intriguing gain-of-function events and prime candidates to explain the emergence of Malassezia. A larger set of genes (741) were lost, with enrichment for glycosyl hydrolases and carbohydrate metabolism, concordant with adaptation to skin's carbohydrate-deficient environment. Gene family analysis revealed extensive turnover and underlined the importance of secretory lipases, phospholipases, aspartyl proteases, and other peptidases. Combining genomic analysis with a re-evaluation of culture characteristics, we establish the likely lipid-dependence of all Malassezia. Our phylogenetic analysis sheds new light on the relationship between Malassezia and other members of Ustilaginomycotina, as well as phylogenetic lineages within the genus. Overall, our study provides a unique genomic resource for understanding Malassezia niche-specificity and potential virulence, as well as their abundance and distribution in the environment and on human skin.

  1. Lifestyle Evolution in Cyanobacterial Symbionts of Sponges

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burgsdorf, Ilia; Slaby, Beate M.; Handley, Kim M.

    The “Candidatus Synechococcus spongiarum” group includes different clades of cyanobacteria with high 16S rRNA sequence identity (~99%) and is the most abundant and widespread cyanobacterial symbiont of marine sponges. The first draft genome of a “Ca. Synechococcus spongiarum” group member was recently published, providing evidence of genome reduction by loss of genes involved in several nonessential functions. However, “Ca. Synechococcus spongiarum” includes a variety of clades that may differ widely in genomic repertoire and consequently in physiology and symbiotic function. Here, we present three additional draft genomes of “Ca. Synechococcus spongiarum,” each from a different clade. By comparing all four symbiont genomes to those of free-living cyanobacteria, we revealed general adaptations to life inside sponges and specific adaptations of each phylotype. Symbiont genomes shared about half of their total number of coding genes. Common traits of “Ca. Synechococcus spongiarum” members were a high abundance of DNA modification and recombination genes and a reduction in genes involved in inorganic ion transport and metabolism, cell wall biogenesis, and signal transduction mechanisms. Moreover, these symbionts were characterized by a reduced number of antioxidant enzymes and low-weight peptides of photosystem II compared to their free-living relatives. Variability within the “Ca. Synechococcus spongiarum” group was mostly related to immune system features, potential for siderophore-mediated iron transport, and dependency on methionine from external sources. The common absence of genes involved in synthesis of residues, typical of the O antigen of free-living Synechococcus species, suggests a novel mechanism utilized by these symbionts to avoid sponge predation and phage attack.

  2. The vacuolar protein sorting genes in insects: A comparative genome view.

    PubMed

    Li, Zhaofei; Blissard, Gary

    2015-07-01

    In eukaryotic cells, regulated vesicular trafficking is critical for directing protein transport and for recycling and degradation of membrane lipids and proteins. Through carefully regulated transport vesicles, the endomembrane system performs a large and important array of dynamic cellular functions while maintaining the integrity of the cellular membrane system. Genetic studies in yeast Saccharomyces cerevisiae have identified approximately 50 vacuolar protein sorting (VPS) genes involved in vesicle trafficking, and most of these genes are also characterized in mammals. The VPS proteins form distinct functional complexes, which include complexes known as ESCRT, retromer, CORVET, HOPS, GARP, and PI3K-III. Little is known about the orthologs of VPS proteins in insects. Here, with the newly annotated Manduca sexta genome, we carried out genomic comparative analysis of VPS proteins in yeast, humans, and 13 sequenced insect genomes representing the Orders Hymenoptera, Diptera, Hemiptera, Phthiraptera, Lepidoptera, and Coleoptera. Amino acid sequence alignments and domain/motif structure analyses reveal that most of the components of ESCRT, retromer, CORVET, HOPS, GARP, and PI3K-III are evolutionarily conserved across yeast, insects, and humans. However, in contrast to the VPS gene expansions observed in the human genome, only four VPS genes (VPS13, VPS16, VPS33, and VPS37) were expanded in the six insect Orders. Additionally, VPS2 was expanded only in species from Phthiraptera, Lepidoptera, and Coleoptera. These studies provide a baseline for understanding the evolution of vesicular trafficking across yeast, insect, and human genomes, and also provide a basis for further addressing specific functional roles of VPS proteins in insects. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Comparative Genomic and Morphological Analyses of Listeria Phages Isolated from Farm Environments

    PubMed Central

    Denes, Thomas; Ackermann, Hans-Wolfgang; Moreno Switt, Andrea I.; Wiedmann, Martin; den Bakker, Henk C.

    2014-01-01

    The genus Listeria is ubiquitous in the environment and includes the globally important food-borne pathogen Listeria monocytogenes. While the genomic diversity of Listeria has been well studied, considerably less is known about the genomic and morphological diversity of Listeria bacteriophages. In this study, we sequenced and analyzed the genomes of 14 Listeria phages isolated mostly from New York dairy farm environments as well as one related Enterococcus faecalis phage to obtain information on genome characteristics and diversity. We also examined 12 of the phages by electron microscopy to characterize their morphology. These Listeria phages, based on gene orthology and morphology, together with previously sequenced Listeria phages could be classified into five orthoclusters, including one novel orthocluster. One orthocluster (orthocluster I) consists of large-genome (∼135-kb) myoviruses belonging to the genus “Twort-like viruses,” three orthoclusters (orthoclusters II to IV) contain small-genome (36- to 43-kb) siphoviruses with icosahedral heads, and the novel orthocluster V contains medium-sized-genome (∼66-kb) siphoviruses with elongated heads. A novel orthocluster (orthocluster VI) of E. faecalis phages, with medium-sized genomes (∼56 kb), was identified, which grouped together and shares morphological features with the novel Listeria phage orthocluster V. This new group of phages (i.e., orthoclusters V and VI) is composed of putative lytic phages that may prove to be useful in phage-based applications for biocontrol, detection, and therapeutic purposes. PMID:24837381

  4. Approaches to integrating germline and tumor genomic data in cancer research

    PubMed Central

    Feigelson, Heather Spencer; Goddard, Katrina A.B.; Hollombe, Celine; Tingle, Sharna R.; Gillanders, Elizabeth M.; Mechanic, Leah E.; Nelson, Stefanie A.

    2014-01-01

    Cancer is characterized by a diversity of genetic and epigenetic alterations occurring in both the germline and somatic (tumor) genomes. Hundreds of germline variants associated with cancer risk have been identified, and large amounts of data identifying mutations in the tumor genome that participate in tumorigenesis have been generated. Increasingly, these two genomes are being explored jointly to better understand how cancer risk alleles contribute to carcinogenesis and whether they influence development of specific tumor types or mutation profiles. To understand how data from germline risk studies and tumor genome profiling is being integrated, we reviewed 160 articles describing research that incorporated data from both genomes, published between January 2009 and December 2012, and summarized the current state of the field. We identified three principal types of research questions being addressed using these data: (i) use of tumor data to determine the putative function of germline risk variants; (ii) identification and analysis of relationships between host genetic background and particular tumor mutations or types; and (iii) use of tumor molecular profiling data to reduce genetic heterogeneity or refine phenotypes for germline association studies. We also found descriptive studies that compared germline and tumor genomic variation in a gene or gene family, and papers describing research methods, data sources, or analytical tools. We identified a large set of tools and data resources that can be used to analyze and integrate data from both genomes. Finally, we discuss opportunities and challenges for cancer research that integrates germline and tumor genomics data. PMID:25115441

  5. Characterizing the developmental transcriptome of the oriental fruit fly, Bactrocera dorsalis (Diptera: Tephritidae) through comparative genomic analysis with Drosophila melanogaster utilizing modENCODE datasets

    USDA-ARS's Scientific Manuscript database

    Background The oriental fruit fly, Bactrocera dorsalis, is an important pest of fruit and vegetable crops throughout Asia, and is considered a high risk pest for establishment in the mainland United States. It is a member of the family Tephritidae, which are the most agriculturally important family ...

  6. Microbial Composition and Adaptations in Oligotrophic Inland Seas

    NASA Astrophysics Data System (ADS)

    Coleman, M.; Paver, S.; Anderson, M. R.; Vargas, G.

    2016-02-01

    The Laurentian Great Lakes comprise an interconnected freshwater system with certain areas resembling the oligotrophic open ocean in terms of productivity and nutrient availability. This resemblance creates an opportunity for comparing marine and Great Lake microorganisms to identify signatures of adaptation to low nutrient environments and re-evaluate differences between marine and freshwater microorganisms. We present results from the first comprehensive microbial characterization of all five Great Lakes. We compared community structure, genetic functional potential, and genome properties across the Great Lakes and other aquatic systems. Taxonomic and functional comparisons across lakes yielded three consistent groups: trophically distinct Lake Erie, Lakes Michigan and Huron, and Lakes Superior and Ontario. Lake metagenomic signatures were repeatedly differentiated by the presence of phage sequences and phage-related functional genes. We observed sequence similarity and synteny between contigs assembled from Great Lake metagenomes and genomes of marine organisms, including Nitrosopumilus sp. NF5, Synechococcus sp. RCC307 and Synechococcus phage S-SKS1. Assembly of metagenomic sequences additionally yielded large contigs from poorly characterized taxa. These results begin to fill the gap in our understanding of how nutrients, salinity, and other environmental factors shape microbial structure and function.

  7. A universe of dwarfs and giants: genome size and chromosome evolution in the monocot family Melanthiaceae.

    PubMed

    Pellicer, Jaume; Kelly, Laura J; Leitch, Ilia J; Zomlefer, Wendy B; Fay, Michael F

    2014-03-01

    • Since the occurrence of giant genomes in angiosperms is restricted to just a few lineages, identifying where shifts towards genome obesity have occurred is essential for understanding the evolutionary mechanisms triggering this process. • Genome sizes were assessed using flow cytometry in 79 species and new chromosome numbers were obtained. Phylogenetically based statistical methods were applied to infer ancestral character reconstructions of chromosome numbers and nuclear DNA contents. • Melanthiaceae are the most diverse family in terms of genome size, with C-values ranging more than 230-fold. Our data confirmed that giant genomes are restricted to tribe Parideae, with most extant species in the family characterized by small genomes. Ancestral genome size reconstruction revealed that the most recent common ancestor (MRCA) for the family had a relatively small genome (1C = 5.37 pg). Chromosome losses and polyploidy are recovered as the main evolutionary mechanisms generating chromosome number change. • Genome evolution in Melanthiaceae has been characterized by a trend towards genome size reduction, with just one episode of dramatic DNA accumulation in Parideae. Such extreme contrasting profiles of genome size evolution illustrate the key role of transposable elements and chromosome rearrangements in driving the evolution of plant genomes. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.

  8. Genetic characterization of K13965, a strain of Oak Vale virus from Western Australia.

    PubMed

    Quan, Phenix-Lan; Williams, David T; Johansen, Cheryl A; Jain, Komal; Petrosov, Alexandra; Diviney, Sinead M; Tashmukhamedova, Alla; Hutchison, Stephen K; Tesh, Robert B; Mackenzie, John S; Briese, Thomas; Lipkin, W Ian

    2011-09-01

    K13965, an uncharacterized virus, was isolated in 1993 from Anopheles annulipes mosquitoes collected in the Kimberley region of northern Western Australia. Here, we report its genomic sequence, identify it as a rhabdovirus, and characterize its phylogenetic relationships. The genome comprises a P' (C) and SH protein similar to the recently characterized Tupaia and Durham viruses, and shows overlap between G and L genes. Comparison of K13965 genome sequence to other rhabdoviruses identified K13965 as a strain of the unclassified Australian Oak Vale rhabdovirus, whose complete genome sequence we also determined. Phylogenetic analysis of N and L sequences indicated genetic relationship to a recently proposed Sandjima virus clade, although the Oak Vale virus sequences form a branch separate from the African members of that group. Copyright © 2011 Elsevier B.V. All rights reserved.

  9. Contributing to Tumor Molecular Characterization Projects with a Global Impact | Office of Cancer Genomics

    Cancer.gov

    My name is Nicholas Griner and I am the Scientific Program Manager for the Cancer Genome Characterization Initiative (CGCI) in the Office of Cancer Genomics (OCG). Until recently, I spent most of my scientific career working in a cancer research laboratory. In my postdoctoral training, my research focused on identifying novel pathways that contribute to both prostate and breast cancers and studying proteins within these pathways that may be targeted with cancer drugs.

  10. The (in)complete organelle genome: exploring the use and nonuse of available technologies for characterizing mitochondrial and plastid chromosomes.

    PubMed

    Sanitá Lima, Matheus; Woods, Laura C; Cartwright, Matthew W; Smith, David Roy

    2016-11-01

    Not long ago, scientists paid dearly in time, money and skill for every nucleotide that they sequenced. Today, DNA sequencing technologies epitomize the slogan 'faster, easier, cheaper and more', and in many ways, sequencing an entire genome has become routine, even for the smallest laboratory groups. This is especially true for mitochondrial and plastid genomes. Given their relatively small sizes and high copy numbers per cell, organelle DNAs are currently among the most highly sequenced kind of chromosome. But accurately characterizing an organelle genome and the information it encodes can require much more than DNA sequencing and bioinformatics analyses. Organelle genomes can be surprisingly complex and can exhibit convoluted and unconventional modes of gene expression. Unravelling this complexity can demand a wide assortment of experiments, from pulsed-field gel electrophoresis to Southern and Northern blots to RNA analyses. Here, we show that it is exactly these types of 'complementary' analyses that are often lacking from contemporary organelle genome papers, particularly short 'genome announcement' articles. Consequently, crucial and interesting features of organelle chromosomes are going undescribed, which could ultimately lead to a poor understanding and even a misrepresentation of these genomes and the genes they express. High-throughput sequencing and bioinformatics have made it easy to sequence and assemble entire chromosomes, but they should not be used as a substitute for or at the expense of other types of genomic characterization methods. © 2016 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.

  11. Genome Wide Characterization of Simple Sequence Repeats in Cucumber

    USDA-ARS's Scientific Manuscript database

    The whole genome of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...

  12. Genome-Based Comparison of Clostridioides difficile: Average Amino Acid Identity Analysis of Core Genomes.

    PubMed

    Cabal, Adriana; Jun, Se-Ran; Jenjaroenpun, Piroon; Wanchai, Visanu; Nookaew, Intawat; Wongsurawat, Thidathip; Burgess, Mary J; Kothari, Atul; Wassenaar, Trudy M; Ussery, David W

    2018-02-14

    Infections due to Clostridioides difficile (previously known as Clostridium difficile) are a major problem in hospitals, where cases can be caused by community-acquired strains as well as by nosocomial spread. Whole genome sequences from clinical samples contain a lot of information, but that information needs to be analyzed and compared in such a way that the outcome is useful for clinicians or epidemiologists. Here, we compare 663 publicly available complete genome sequences of C. difficile using average amino acid identity (AAI) scores. This analysis revealed that most of these genomes (640, 96.5%) clearly belong to the same species, while the remaining 23 genomes produce four distinct clusters within the Clostridioides genus. The main C. difficile cluster can be further divided into sub-clusters, depending on the chosen cutoff. We demonstrate that MLST, either based on partial or full gene-length, results in biased estimates of genetic differences and does not capture the true degree of similarity or differences of complete genomes. Presence of genes coding for C. difficile toxins A and B (ToxA/B), as well as the binary C. difficile toxin (CDT), was deduced from their unique PfamA domain architectures. Out of the 663 C. difficile genomes, 535 (80.7%) contained at least one copy of ToxA or ToxB, while these genes were missing from 128 genomes. Although some clusters were enriched for toxin presence, these genes are variably present in a given genetic background. The CDT genes were found in 191 genomes, which were restricted to a few clusters only, and only one cluster lacked the toxin A/B genes consistently. A total of 310 genomes contained ToxA/B without CDT (47%). Further, published metagenomic data from stools were used to assess the presence of C. difficile sequences in blinded cases of C. difficile infection (CDI) and controls, to test if metagenomic analysis is sensitive enough to detect the pathogen, and to establish strain relationships between cases from the same hospital. We conclude that metagenomics can contribute to the identification of CDI and can assist in characterization of the most probable causative strain in CDI patients.

  13. Pan-genome analysis of the emerging foodborne pathogen Cronobacter spp. suggests a species-level bidirectional divergence driven by niche adaptation

    PubMed Central

    2013-01-01

    Background Members of the genus Cronobacter are causes of rare but severe illness in neonates and preterm infants following the ingestion of contaminated infant formula. Seven species have been described and two of the species genomes were subsequently published. In this study, we performed comparative genomics on eight strains of Cronobacter, including six that we sequenced (representing six of the seven species) and two previously published, closed genomes. Results We identified and characterized the features associated with the core and pan genome of the genus Cronobacter in an attempt to understand the evolution of these bacteria and the genetic content of each species. We identified 84 genomic regions that are present in two or more Cronobacter genomes, along with 45 unique genomic regions. Many potentially horizontally transferred genes, such as lysogenic prophages, were also identified. Most notable among these were several type six secretion system gene clusters, transposons that carried tellurium, copper and/or silver resistance genes, and a novel integrative conjugative element. Conclusions Cronobacter have diverged into two clusters, one consisting of C. dublinensis and C. muytjensii (Cdub-Cmuy) and the other comprised of C. sakazakii, C. malonaticus, C. universalis, and C. turicensis (Csak-Cmal-Cuni-Ctur), from the most recent common ancestral species. While several genetic determinants for plant-association and human virulence could be found in the core genome of Cronobacter, the four Cdub-Cmuy clade genomes contained several accessory genomic regions important for survival in a plant-associated environmental niche, while the Csak-Cmal-Cuni-Ctur clade genomes harbored numerous virulence-related genetic traits. PMID:23724777

  14. Characterization of a Stable, Metronidazole-Resistant Clostridium difficile Clinical Isolate

    PubMed Central

    Lynch, Tarah; Chong, Patrick; Zhang, Jason; Hizon, Romeo; Du, Tim; Graham, Morag R.; Beniac, Daniel R.; Booth, Timothy F.; Kibsey, Pamela; Miller, Mark; Gravel, Denise; Mulvey, Michael R.

    2013-01-01

    Background Clostridium difficile are Gram-positive, spore-forming anaerobic bacteria that are the leading cause of healthcare-associated diarrhea, usually associated with antibiotic usage. Metronidazole is currently the first-line treatment for mild to moderate C. difficile diarrhea; however, recurrence occurs at rates of 15–35%. There are few reports of C. difficile metronidazole resistance in the literature, and when observed, the phenotype has been transient and lost after storage or exposure of the bacteria to freeze/thaw cycles. Owing to the unstable nature of the resistance phenotype in the laboratory, clinical significance and understanding of the resistance mechanisms is lacking. Methodology/Principal Findings Genotypic and phenotypic characterization was performed on a metronidazole-resistant clinical isolate of C. difficile. Whole-genome sequencing was used to identify potential genetic contributions to the phenotypic variation observed with molecular and bacteriological techniques. Phenotypic observations of the metronidazole-resistant strain revealed aberrant growth in broth and elongated cell morphology relative to a metronidazole-susceptible, wild type NAP1 strain. Comparative genomic analysis revealed single nucleotide polymorphism (SNP) level variation within genes affecting core metabolic pathways such as electron transport, iron utilization and energy production. Conclusions/Significance This is the first characterization of stable metronidazole resistance in a C. difficile isolate. The study provides an in-depth genomic and phenotypic analysis of this strain and provides a foundation for future studies to elucidate mechanisms conferring metronidazole resistance in C. difficile that have not been previously described. PMID:23349739

  15. The Cancer Genome Atlas (TCGA): The next stage - TCGA

    Cancer.gov

    The Cancer Genome Atlas (TCGA), the NIH research program that has helped set the standards for characterizing the genomic underpinnings of dozens of cancers on a large scale, is moving to its next phase.

  16. Endometrial and acute myeloid leukemia cancer genomes characterized

    Cancer.gov

    Two studies from The Cancer Genome Atlas (TCGA) program reveal details about the genomic landscapes of acute myeloid leukemia (AML) and endometrial cancer. Both provide new insights into the molecular underpinnings of these cancers.

  17. Population Sciences, Translational Research and the Opportunities and Challenges for Genomics to Reduce the Burden of Cancer in the 21st Century

    PubMed Central

    Khoury, Muin J.; Clauser, Steven B.; Freedman, Andrew N.; Gillanders, Elizabeth M.; Glasgow, Russ E.; Klein, William M. P.; Schully, Sheri D.

    2011-01-01

    Advances in genomics and related fields are promising tools for risk assessment, early detection, and targeted therapies across the entire cancer care continuum. In this commentary, we submit that this promise cannot be fulfilled without an enhanced translational genomics research agenda firmly rooted in the population sciences. Population sciences include multiple disciplines that are needed throughout the translational research continuum. For example, epidemiologic studies are needed not only to accelerate genomic discoveries and new biological insights into cancer etiology and pathogenesis, but to characterize and critically evaluate these discoveries in well defined populations for their potential for cancer prediction, prevention and response to treatments. Behavioral, social and communication sciences are needed to explore genomic-modulated responses to old and new behavioral interventions, adherence to therapies, decision-making across the continuum, and effective use in health care. Implementation science, health services, outcomes research, comparative effectiveness research and regulatory science are needed for moving validated genomic applications into practice and for measuring their effectiveness, cost effectiveness and unintended consequences. Knowledge synthesis, evidence reviews and economic modeling of the effects of promising genomic applications will facilitate policy decisions, and evidence-based recommendations. Several independent and multidisciplinary panels have recently made specific recommendations for enhanced research and policy infrastructure to inform clinical and population research for moving genomic innovations into the cancer care continuum. An enhanced translational genomics and population sciences agenda is urgently needed to fulfill the promise of genomics in reducing the burden of cancer. PMID:21795499

  18. SIDR: simultaneous isolation and parallel sequencing of genomic DNA and total RNA from single cells.

    PubMed

    Han, Kyung Yeon; Kim, Kyu-Tae; Joung, Je-Gun; Son, Dae-Soon; Kim, Yeon Jeong; Jo, Areum; Jeon, Hyo-Jeong; Moon, Hui-Sung; Yoo, Chang Eun; Chung, Woosung; Eum, Hye Hyeon; Kim, Sangmin; Kim, Hong Kwan; Lee, Jeong Eon; Ahn, Myung-Ju; Lee, Hae-Ock; Park, Donghyun; Park, Woong-Yang

    2018-01-01

    Simultaneous sequencing of the genome and transcriptome at the single-cell level is a powerful tool for characterizing genomic and transcriptomic variation and revealing correlative relationships. However, it remains technically challenging to analyze both the genome and transcriptome in the same cell. Here, we report a novel method for simultaneous isolation of genomic DNA and total RNA (SIDR) from single cells, achieving high recovery rates with minimal cross-contamination, as is crucial for accurate description and integration of the single-cell genome and transcriptome. For reliable and efficient separation of genomic DNA and total RNA from single cells, the method uses hypotonic lysis to preserve nuclear lamina integrity and subsequently captures the cell lysate using antibody-conjugated magnetic microbeads. Evaluating the performance of this method using real-time PCR demonstrated that it efficiently recovered genomic DNA and total RNA. Thorough data quality assessments showed that DNA and RNA simultaneously fractionated by the SIDR method were suitable for genome and transcriptome sequencing analysis at the single-cell level. The integration of single-cell genome and transcriptome sequencing by SIDR (SIDR-seq) showed that genetic alterations, such as copy-number and single-nucleotide variations, were more accurately captured by single-cell SIDR-seq compared with conventional single-cell RNA-seq, although copy-number variations positively correlated with the corresponding gene expression levels. These results suggest that SIDR-seq is potentially a powerful tool to reveal genetic heterogeneity and phenotypic information inferred from gene expression patterns at the single-cell level. © 2018 Han et al.; Published by Cold Spring Harbor Laboratory Press.

  19. SIDR: simultaneous isolation and parallel sequencing of genomic DNA and total RNA from single cells

    PubMed Central

    Han, Kyung Yeon; Kim, Kyu-Tae; Joung, Je-Gun; Son, Dae-Soon; Kim, Yeon Jeong; Jo, Areum; Jeon, Hyo-Jeong; Moon, Hui-Sung; Yoo, Chang Eun; Chung, Woosung; Eum, Hye Hyeon; Kim, Sangmin; Kim, Hong Kwan; Lee, Jeong Eon; Ahn, Myung-Ju; Lee, Hae-Ock; Park, Donghyun; Park, Woong-Yang

    2018-01-01

    Simultaneous sequencing of the genome and transcriptome at the single-cell level is a powerful tool for characterizing genomic and transcriptomic variation and revealing correlative relationships. However, it remains technically challenging to analyze both the genome and transcriptome in the same cell. Here, we report a novel method for simultaneous isolation of genomic DNA and total RNA (SIDR) from single cells, achieving high recovery rates with minimal cross-contamination, as is crucial for accurate description and integration of the single-cell genome and transcriptome. For reliable and efficient separation of genomic DNA and total RNA from single cells, the method uses hypotonic lysis to preserve nuclear lamina integrity and subsequently captures the cell lysate using antibody-conjugated magnetic microbeads. Evaluating the performance of this method using real-time PCR demonstrated that it efficiently recovered genomic DNA and total RNA. Thorough data quality assessments showed that DNA and RNA simultaneously fractionated by the SIDR method were suitable for genome and transcriptome sequencing analysis at the single-cell level. The integration of single-cell genome and transcriptome sequencing by SIDR (SIDR-seq) showed that genetic alterations, such as copy-number and single-nucleotide variations, were more accurately captured by single-cell SIDR-seq compared with conventional single-cell RNA-seq, although copy-number variations positively correlated with the corresponding gene expression levels. These results suggest that SIDR-seq is potentially a powerful tool to reveal genetic heterogeneity and phenotypic information inferred from gene expression patterns at the single-cell level. PMID:29208629

  20. Genomic characterization of a core set of the USDA-NPGS Ethiopian sorghum germplasm collection

    USDA-ARS's Scientific Manuscript database

    The USDA Agriculture Research Service National Plant Germplasm System (NPGS) preserves the largest sorghum germplasm collection in the world, which includes 7,217 accessions from the center of diversity in Ethiopia. The characterization of this exotic germplasm at a genome-wide scale will improve co...
