Sample records for bacterial genome segmentations

  1. Assessing the Robustness of Complete Bacterial Genome Segmentations

    NASA Astrophysics Data System (ADS)

    Devillers, Hugo; Chiapello, Hélène; Schbath, Sophie; El Karoui, Meriem

    Comparison of closely related bacterial genomes has revealed the presence of highly conserved sequences forming a "backbone" that is interrupted by numerous, less conserved, DNA fragments. Segmentation of bacterial genomes into backbone and variable regions is particularly useful to investigate bacterial genome evolution. Several software tools have been designed to compare complete bacterial chromosomes and a few online databases store pre-computed genome comparisons. However, very few statistical methods are available to evaluate the reliability of these software tools and to compare the results obtained with them. To fill this gap, we have developed two local scores to measure the robustness of bacterial genome segmentations. Our method uses a simulation procedure based on random perturbations of the compared genomes. The scores presented in this paper are simple to implement and our results show that they make it easy to discriminate between robust and non-robust bacterial genome segmentations when using aligners such as MAUVE and MGA.

  2. MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level.

    PubMed

    Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem

    2008-11-27

    The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.
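
    The backbone/variable partition described in this record can be illustrated with a small sketch. It assumes a single linear coordinate system, half-open 0-based intervals, and a hypothetical helper name (`variable_segments`); none of this is taken from MOSAIC itself, and circular chromosomes or multi-genome coordinates would need more care.

```python
from typing import List, Tuple

# Half-open [start, end) interval in 0-based coordinates (an assumption of this sketch).
Interval = Tuple[int, int]


def variable_segments(backbone: List[Interval], genome_length: int) -> List[Interval]:
    """Return the regions of one genome not covered by any backbone interval."""
    variable = []
    cursor = 0  # first position not yet assigned to a segment
    for start, end in sorted(backbone):
        if start > cursor:
            # Gap between the previous backbone segment and this one.
            variable.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < genome_length:
        # Trailing region after the last backbone segment.
        variable.append((cursor, genome_length))
    return variable


# Hypothetical backbone coordinates on a 9,000 bp genome.
backbone = [(0, 1200), (1500, 4000), (4000, 5200), (6100, 8000)]
print(variable_segments(backbone, 9000))
```

    Treating variable segments as the complement of the backbone keeps the two sets disjoint and exhaustive, which is the property the record relies on for nucleotide-level analysis.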

  3. Segmental Duplications and Copy-Number Variation in the Human Genome

    PubMed Central

    Sharp, Andrew J.; Locke, Devin P.; McGrath, Sean D.; Cheng, Ze; Bailey, Jeffrey A.; Vallente, Rhea U.; Pertz, Lisa M.; Clark, Royden A.; Schwartz, Stuart; Segraves, Rick; Oseroff, Vanessa V.; Albertson, Donna G.; Pinkel, Daniel; Eichler, Evan E.

    2005-01-01

    The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic

  4. Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

    PubMed

    Christen, Matthias; Deutsch, Samuel; Christen, Beat

    2015-08-21

    Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that the Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.

  5. Markov models of genome segmentation

    NASA Astrophysics Data System (ADS)

    Thakur, Vivek; Azad, Rajeev K.; Ramaswamy, Ram

    2007-01-01

    We introduce Markov models for segmentation of symbolic sequences, extending a segmentation procedure based on the Jensen-Shannon divergence that has been introduced earlier. Higher-order Markov models are more sensitive to the details of local patterns and in application to genome analysis, this makes it possible to segment a sequence at positions that are biologically meaningful. We show the advantage of higher-order Markov-model-based segmentation procedures in detecting compositional inhomogeneity in chimeric DNA sequences constructed from genomes of diverse species, and in application to the E. coli K12 genome, boundaries of genomic islands, cryptic prophages, and horizontally acquired regions are accurately identified.
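
    The Jensen-Shannon segmentation idea that this record extends can be sketched in its simplest, zeroth-order form: choose the split point that maximizes the length-weighted Jensen-Shannon divergence between the base compositions of the two sides. This is an illustrative simplification, not the higher-order Markov procedure of the paper; the function names and the `min_len` parameter are assumptions.

```python
import math
from collections import Counter
from typing import Optional, Tuple


def entropy(counts: Counter, total: int) -> float:
    """Shannon entropy (bits) of the symbol frequencies in `counts`."""
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def best_split(seq: str, min_len: int = 10) -> Tuple[Optional[int], float]:
    """Find the split position maximizing the weighted Jensen-Shannon divergence.

    Returns (position, divergence); position is None if no split scores above zero.
    Zeroth-order model: only single-symbol composition is compared.
    """
    n = len(seq)
    total_counts = Counter(seq)
    h_total = entropy(total_counts, n)
    left: Counter = Counter()
    right = Counter(total_counts)
    best_pos, best_jsd = None, 0.0
    for i in range(1, n):
        # Move symbol i-1 from the right part to the left part.
        symbol = seq[i - 1]
        left[symbol] += 1
        right[symbol] -= 1
        if i < min_len or n - i < min_len:
            continue
        jsd = (h_total
               - (i / n) * entropy(left, i)
               - ((n - i) / n) * entropy(right, n - i))
        if jsd > best_jsd:
            best_pos, best_jsd = i, jsd
    return best_pos, best_jsd


# Toy chimeric sequence: an AT-only half joined to a GC-only half.
pos, jsd = best_split("AT" * 100 + "GC" * 100)
print(pos, jsd)
```

    On the toy sequence the split lands at the junction (position 200). A full segmentation would apply the split recursively and stop when the divergence is no longer statistically significant, and a Markov-model variant would compare k-mer transition frequencies rather than single-symbol counts.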

  6. Bacterial Genome Instability

    PubMed Central

    Darmon, Elise

    2014-01-01

    SUMMARY Bacterial genomes are remarkably stable from one generation to the next but are plastic on an evolutionary time scale, substantially shaped by horizontal gene transfer, genome rearrangement, and the activities of mobile DNA elements. This implies the existence of a delicate balance between the maintenance of genome stability and the tolerance of genome instability. In this review, we describe the specialized genetic elements and the endogenous processes that contribute to genome instability. We then discuss the consequences of genome instability at the physiological level, where cells have harnessed instability to mediate phase and antigenic variation, and at the evolutionary level, where horizontal gene transfer has played an important role. Indeed, this ability to share DNA sequences has played a major part in the evolution of life on Earth. The evolutionary plasticity of bacterial genomes, coupled with the vast numbers of bacteria on the planet, substantially limits our ability to control disease. PMID:24600039

  7. Dynamics of Genome Rearrangement in Bacterial Populations

    PubMed Central

    Darling, Aaron E.; Miklós, István; Ragan, Mark A.

    2008-01-01

    Genome structure variation has profound impacts on phenotype in organisms ranging from microbes to humans, yet little is known about how natural selection acts on genome arrangement. Pathogenic bacteria such as Yersinia pestis, which causes bubonic and pneumonic plague, often exhibit a high degree of genomic rearrangement. The recent availability of several Yersinia genomes offers an unprecedented opportunity to study the evolution of genome structure and arrangement. We introduce a set of statistical methods to study patterns of rearrangement in circular chromosomes and apply them to the Yersinia. We constructed a multiple alignment of eight Yersinia genomes using Mauve software to identify 78 conserved segments that are internally free from genome rearrangement. Based on the alignment, we applied Bayesian statistical methods to infer the phylogenetic inversion history of Yersinia. The sampling of genome arrangement reconstructions contains seven parsimonious tree topologies, each having different histories of 79 inversions. Topologies with a greater number of inversions also exist, but were sampled less frequently. The inversion phylogenies agree with results suggested by SNP patterns. We then analyzed reconstructed inversion histories to identify patterns of rearrangement. We confirm an over-representation of “symmetric inversions”—inversions with endpoints that are equally distant from the origin of chromosomal replication. Ancestral genome arrangements demonstrate moderate preference for replichore balance in Yersinia. We found that all inversions are shorter than expected under a neutral model, whereas inversions acting within a single replichore are much shorter than expected. We also found evidence for a canonical configuration of the origin and terminus of replication. Finally, breakpoint reuse analysis reveals that inversions with endpoints proximal to the origin of DNA replication are nearly three times more frequent. Our findings represent the

  8. Insights from 20 years of bacterial genome sequencing

    DOE PAGES

    Land, Miriam L.; Hauser, Loren; Jun, Se-Ran; ...

    2015-02-27

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about

  9. Insights from 20 years of bacterial genome sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Land, Miriam L.; Hauser, Loren; Jun, Se-Ran

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about

  10. Gene calling and bacterial genome annotation with BG7.

    PubMed

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related to biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. This tool is tolerant of sequence errors, maintaining its capabilities for the annotation of highly fragmented genomes or for annotating mixed sequences coming from several genomes (such as those obtained from metagenomic samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  11. Techniques for Large-Scale Bacterial Genome Manipulation and Characterization of the Mutants with Respect to In Silico Metabolic Reconstructions.

    PubMed

    diCenzo, George C; Finan, Turlough M

    2018-01-01

    The rate at which all genes within a bacterial genome can be identified far exceeds the ability to characterize these genes. To assist in associating genes with cellular functions, a large-scale bacterial genome deletion approach can be employed to rapidly screen tens to thousands of genes for desired phenotypes. Here, we provide a detailed protocol for the generation of deletions of large segments of bacterial genomes that relies on the activity of a site-specific recombinase. In this procedure, two recombinase recognition target sequences are introduced into known positions of a bacterial genome through single cross-over plasmid integration. Subsequent expression of the site-specific recombinase mediates recombination between the two target sequences, resulting in the excision of the intervening region and its loss from the genome. We further illustrate how this deletion system can be readily adapted to function as a large-scale in vivo cloning procedure, in which the region excised from the genome is captured as a replicative plasmid. We next provide a procedure for the metabolic analysis of bacterial large-scale genome deletion mutants using the Biolog Phenotype MicroArray™ system. Finally, a pipeline is described, and a sample Matlab script is provided, for the integration of the obtained data with a draft metabolic reconstruction for the refinement of the reactions and gene-protein-reaction relationships in a metabolic reconstruction.

  12. A tick-borne segmented RNA virus contains genome segments derived from unsegmented viral ancestors

    PubMed Central

    Qin, Xin-Cheng; Shi, Mang; Tian, Jun-Hua; Lin, Xian-Dan; Gao, Dong-Ya; He, Jin-Rong; Wang, Jian-Bo; Li, Ci-Xiu; Kang, Yan-Jun; Yu, Bin; Zhou, Dun-Jin; Xu, Jianguo; Plyusnin, Alexander; Holmes, Edward C.; Zhang, Yong-Zhen

    2014-01-01

    Although segmented and unsegmented RNA viruses are commonplace, the evolutionary links between these two very different forms of genome organization are unclear. We report the discovery and characterization of a tick-borne virus—Jingmen tick virus (JMTV)—that reveals an unexpected connection between segmented and unsegmented RNA viruses. The JMTV genome comprises four segments, two of which are related to the nonstructural protein genes of the genus Flavivirus (family Flaviviridae), whereas the remaining segments are unique to this virus, have no known homologs, and contain a number of features indicative of structural protein genes. Remarkably, homology searching revealed that sequences related to JMTV were present in the cDNA library from Toxocara canis (dog roundworm; Nematoda), and that shared strong sequence and structural resemblances. Epidemiological studies showed that JMTV is distributed in tick populations across China, especially Rhipicephalus and Haemaphysalis spp., and experiences frequent host-switching and genomic reassortment. To our knowledge, JMTV is the first example of a segmented RNA virus with a genome derived in part from unsegmented viral ancestors. PMID:24753611

  13. The Divided Bacterial Genome: Structure, Function, and Evolution.

    PubMed

    diCenzo, George C; Finan, Turlough M

    2017-09-01

    Approximately 10% of bacterial genomes are split between two or more large DNA fragments, a genome architecture referred to as a multipartite genome. This multipartite organization is found in many important organisms, including plant symbionts, such as the nitrogen-fixing rhizobia, and plant, animal, and human pathogens, including the genera Brucella, Vibrio, and Burkholderia. The availability of many complete bacterial genome sequences means that we can now examine on a broad scale the characteristics of the different types of DNA molecules in a genome. Recent work has begun to shed light on the unique properties of each class of replicon, the unique functional role of chromosomal and nonchromosomal DNA molecules, and how the exploitation of novel niches may have driven the evolution of the multipartite genome. The aims of this review are to (i) outline the literature regarding bacterial genomes that are divided into multiple fragments, (ii) provide a meta-analysis of completed bacterial genomes from 1,708 species as a way of reviewing the abundant information present in these genome sequences, and (iii) provide an encompassing model to explain the evolution and function of the multipartite genome structure. This review covers, among other topics, salient genome terminology; mechanisms of multipartite genome formation; the phylogenetic distribution of multipartite genomes; how each part of a genome differs with respect to genomic signatures, genetic variability, and gene functional annotation; how each DNA molecule may interact; as well as the costs and benefits of this genome structure. Copyright © 2017 American Society for Microbiology.

  14. Single-Molecule FISH Reveals Non-selective Packaging of Rift Valley Fever Virus Genome Segments

    PubMed Central

    Wichgers Schreur, Paul J.; Kortekaas, Jeroen

    2016-01-01

    The bunyavirus genome comprises a small (S), medium (M), and large (L) RNA segment of negative polarity. Although genome segmentation confers evolutionary advantages by enabling genome reassortment events with related viruses, genome segmentation also complicates genome replication and packaging. Accumulating evidence suggests that genomes of viruses with eight or more genome segments are incorporated into virions by highly selective processes. Remarkably, little is known about the genome packaging process of the tri-segmented bunyaviruses. Here, we evaluated, by single-molecule RNA fluorescence in situ hybridization (FISH), the intracellular spatio-temporal distribution and replication kinetics of the Rift Valley fever virus (RVFV) genome and determined the segment composition of mature virions. The results reveal that the RVFV genome segments start to replicate near the site of infection before spreading and replicating throughout the cytoplasm followed by translocation to the virion assembly site at the Golgi network. Although the average intracellular S, M and L genome segments approached a 1:1:1 ratio, major differences in genome segment ratios were observed among cells. We also observed a significant number of cells lacking evidence of M-segment replication. Analysis of two-segmented replicons and four-segmented viruses subsequently confirmed the previous notion that Golgi recruitment is mediated by the Gn glycoprotein. The absence of colocalization of the different segments in the cytoplasm and the successful rescue of a tri-segmented variant with a codon shuffled M-segment suggested that inter-segment interactions are unlikely to drive the copackaging of the different segments into a single virion. The latter was confirmed by direct visualization of RNPs inside mature virions which showed that the majority of virions lack one or more genome segments. Altogether, this study suggests that RVFV genome packaging is a non-selective process. PMID:27548280

  15. A Primer on Infectious Disease Bacterial Genomics

    PubMed Central

    Petkau, Aaron; Knox, Natalie; Graham, Morag; Van Domselaar, Gary

    2016-01-01

    SUMMARY The number of large-scale genomics projects is increasing due to the availability of affordable high-throughput sequencing (HTS) technologies. The use of HTS for bacterial infectious disease research is attractive because one whole-genome sequencing (WGS) run can replace multiple assays for bacterial typing, molecular epidemiology investigations, and more in-depth pathogenomic studies. The computational resources and bioinformatics expertise required to accommodate and analyze the large amounts of data pose new challenges for researchers embarking on genomics projects for the first time. Here, we present a comprehensive overview of a bacterial genomics project from beginning to end, with a particular focus on the planning and computational requirements for HTS data, and provide a general understanding of the analytical concepts to develop a workflow that will meet the objectives and goals of HTS projects. PMID:28590251

  16. Genome-based approaches to develop vaccines against bacterial pathogens.

    PubMed

    Serruto, Davide; Serino, Laura; Masignani, Vega; Pizza, Mariagrazia

    2009-05-26

    Bacterial infectious diseases remain the single most important threat to health worldwide. Although conventional vaccinology approaches were successful in conferring protection against several diseases, they failed to provide efficacious solutions against many others. The advent of whole-genome sequencing changed the way to think about vaccine development, enabling the targeting of possible vaccine candidates starting from the genomic information of a single bacterial isolate, with a process named reverse vaccinology. As the genomic era progressed, reverse vaccinology has evolved with a pan-genome approach and multi-strain genome analysis became fundamental for the design of universal vaccines. This review describes the applications of genome-based approaches in the development of new vaccines against bacterial pathogens.

  17. Correlation between genome reduction and bacterial growth.

    PubMed

    Kurokawa, Masaomi; Seno, Shigeto; Matsuda, Hideo; Ying, Bei-Wen

    2016-12-01

    Genome reduction by removing dispensable genomic sequences in bacteria is commonly used in both fundamental and applied studies to determine the minimal genetic requirements for a living system or to develop highly efficient bioreactors. Nevertheless, whether and how the accumulative loss of dispensable genomic sequences disturbs bacterial growth remains unclear. To investigate the relationship between genome reduction and growth, a series of Escherichia coli strains carrying genomes reduced in a stepwise manner were used. Intensive growth analyses revealed that the accumulation of multiple genomic deletions caused decreases in the exponential growth rate and the saturated cell density in a deletion-length-dependent manner as well as gradual changes in the patterns of growth dynamics, regardless of the growth media. Accordingly, a perspective growth model linking genome evolution to genome engineering was proposed. This study provides the first demonstration of a quantitative connection between genomic sequence and bacterial growth, indicating that growth rate is potentially associated with dispensable genomic sequences. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  18. Exploration of sequence space as the basis of viral RNA genome segmentation.

    PubMed

    Moreno, Elena; Ojosnegros, Samuel; García-Arriaza, Juan; Escarmís, Cristina; Domingo, Esteban; Perales, Celia

    2014-05-06

    The mechanisms of viral RNA genome segmentation are unknown. On extensive passage of foot-and-mouth disease virus in baby hamster kidney-21 cells, the virus accumulated multiple point mutations and underwent a transition akin to genome segmentation. The standard single RNA genome molecule was replaced by genomes harboring internal in-frame deletions affecting the L- or capsid-coding region. These genomes were infectious and killed cells by complementation. Here we show that the point mutations in the nonstructural protein-coding region (P2, P3) that accumulated in the standard genome before segmentation increased the relative fitness of the segmented version relative to the standard genome. Fitness increase was documented by intracellular expression of virus-coded proteins and infectious progeny production by RNAs with the internal deletions placed in the sequence context of the parental and evolved genome. The complementation activity involved several viral proteins, one of them being the leader proteinase L. Thus, a history of genetic drift with accumulation of point mutations was needed to allow a major variation in the structure of a viral genome. Thus, exploration of sequence space by a viral genome (in this case an unsegmented RNA) can reach a point of the space in which a totally different genome structure (in this case, a segmented RNA) is favored over the form that performed the exploration.

  19. Genomic features of bacterial adaptation to plants

    PubMed Central

    Levy, Asaf; Gonzalez, Isai Salas; Mittelviefhaus, Maximilian; Clingenpeel, Scott; Paredes, Sur Herrera; Miao, Jiamin; Wang, Kunru; Devescovi, Giulia; Stillman, Kyra; Monteiro, Freddy; Alvarez, Bryan Rangel; Lundberg, Derek S.; Lu, Tse-Yuan; Lebeis, Sarah; Jin, Zhao; McDonald, Meredith; Klein, Andrew P.; Feltcher, Meghan E.; del Rio, Tijana Glavina; Grant, Sarah R.; Doty, Sharon L.; Ley, Ruth E.; Zhao, Bingyu; Venturi, Vittorio; Pelletier, Dale A.; Vorholt, Julia A.; Tringe, Susannah G.; Woyke, Tanja; Dangl, Jeffery L.

    2017-01-01

    Plants intimately associate with diverse bacteria. Plant-associated (PA) bacteria have ostensibly evolved genes enabling adaptation to the plant environment. However, the identities of such genes are mostly unknown and their functions are poorly characterized. We sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3837 bacterial genomes to identify thousands of PA gene clusters. Genomes of PA bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant associated genomes. We experimentally validated candidates from two sets of PA genes, one involved in plant colonization, the other serving in microbe-microbe competition between PA bacteria. We also identified 64 PA protein domains that potentially mimic plant domains; some are shared with PA fungi and oomycetes. This work expands the genome-based understanding of plant-microbe interactions and provides leads for efficient and sustainable agriculture through microbiome engineering. PMID:29255260

  20. Harnessing CRISPR-Cas systems for bacterial genome editing.

    PubMed

    Selle, Kurt; Barrangou, Rodolphe

    2015-04-01

    Manipulation of genomic sequences facilitates the identification and characterization of key genetic determinants in the investigation of biological processes. Genome editing via clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) constitutes a next-generation method for programmable and high-throughput functional genomics. CRISPR-Cas systems are readily reprogrammed to induce sequence-specific DNA breaks at target loci, resulting in fixed mutations via host-dependent DNA repair mechanisms. Although bacterial genome editing is a relatively unexplored and underrepresented application of CRISPR-Cas systems, recent studies provide valuable insights for the widespread future implementation of this technology. This review summarizes recent progress in bacterial genome editing and identifies fundamental genetic and phenotypic outcomes of CRISPR targeting in bacteria, in the context of tool development, genome homeostasis, and DNA repair. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Genome engineering and gene expression control for bacterial strain development.

    PubMed

    Song, Chan Woo; Lee, Joungmin; Lee, Sang Yup

    2015-01-01

    In recent years, a number of techniques and tools have been developed for genome engineering and gene expression control to achieve desired phenotypes of various bacteria. Here we review and discuss the recent advances in bacterial genome manipulation and gene expression control techniques, and their actual uses with accompanying examples. Genome engineering has been commonly performed based on homologous recombination. During such genome manipulation, the counterselection systems employing SacB or nucleases have mainly been used for the efficient selection of desired engineered strains. The recombineering technology enables simple and more rapid manipulation of the bacterial genome. The group II intron-mediated genome engineering technology is another option for some bacteria that are difficult to be engineered by homologous recombination. Due to the increasing demands on high-throughput screening of bacterial strains having the desired phenotypes, several multiplex genome engineering techniques have recently been developed and validated in some bacteria. Another approach to achieve desired bacterial phenotypes is the repression of target gene expression without the modification of genome sequences. This can be performed by expressing antisense RNA, small regulatory RNA, or CRISPR RNA to repress target gene expression at the transcriptional or translational level. All of these techniques allow efficient and rapid development and screening of bacterial strains having desired phenotypes, and more advanced techniques are expected to be seen. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Genomic features of bacterial adaptation to plants

    DOE PAGES

    Levy, Asaf; Salas Gonzalez, Isai; Mittelviefhaus, Maximilian; ...

    2017-12-18

    Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. In this study, we sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe–microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. In conclusion, this work expands the genome-based understanding of plant–microbe interactions and provides potential leads for efficient and sustainable agriculture through microbiome engineering.

  3. Genomic features of bacterial adaptation to plants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Levy, Asaf; Salas Gonzalez, Isai; Mittelviefhaus, Maximilian

    Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. In this study, we sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe–microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. In conclusion, this work expands the genome-based understanding of plant–microbe interactions and provides potential leads for efficient and sustainable agriculture through microbiome engineering.

  4. Creation of Rift Valley Fever Viruses with Four-Segmented Genomes Reveals Flexibility in Bunyavirus Genome Packaging

    PubMed Central

    Oreshkova, Nadia; Moormann, Rob J. M.; Kortekaas, Jeroen

    2014-01-01

    ABSTRACT Bunyavirus genomes comprise a small (S), a medium (M), and a large (L) RNA segment of negative polarity. Although the untranslated regions have been shown to comprise signals required for transcription, replication, and encapsidation, the mechanisms that drive the packaging of at least one S, M, and L segment into a single virion to generate infectious virus are largely unknown. One of the most important members of the Bunyaviridae family that causes devastating disease in ruminants and occasionally humans is the Rift Valley fever virus (RVFV). We studied the flexibility of RVFV genome packaging by splitting the glycoprotein precursor gene, encoding the (NSm)GnGc polyprotein, into two individual genes encoding either (NSm)Gn or Gc. Using reverse genetics, six viruses with a segmented glycoprotein precursor gene were rescued, varying from a virus comprising two S-type segments in the absence of an M-type segment to a virus consisting of four segments (RVFV-4s), of which three are M-type. Despite that all virus variants were able to grow in mammalian cell lines, they were unable to spread efficiently in cells of mosquito origin. Moreover, in vivo studies demonstrated that RVFV-4s is unable to cause disseminated infection and disease in mice, even in the presence of the main virulence factor NSs, but induced a protective immune response against a lethal challenge with wild-type virus. In summary, splitting bunyavirus glycoprotein precursor genes provides new opportunities to study bunyavirus genome packaging and offers new methods to develop next-generation live-attenuated bunyavirus vaccines. IMPORTANCE Rift Valley fever virus (RVFV) causes devastating disease in ruminants and occasionally humans. Virions capable of productive infection comprise at least one copy of the small (S), medium (M), and large (L) RNA genome segments. The M segment encodes a glycoprotein precursor (GPC) protein that is cotranslationally cleaved into Gn and Gc, which are required for

  5. Creation of Rift Valley fever viruses with four-segmented genomes reveals flexibility in bunyavirus genome packaging.

    PubMed

    Wichgers Schreur, Paul J; Oreshkova, Nadia; Moormann, Rob J M; Kortekaas, Jeroen

    2014-09-01

    Bunyavirus genomes comprise a small (S), a medium (M), and a large (L) RNA segment of negative polarity. Although the untranslated regions have been shown to comprise signals required for transcription, replication, and encapsidation, the mechanisms that drive the packaging of at least one S, M, and L segment into a single virion to generate infectious virus are largely unknown. One of the most important members of the Bunyaviridae family that causes devastating disease in ruminants and occasionally humans is the Rift Valley fever virus (RVFV). We studied the flexibility of RVFV genome packaging by splitting the glycoprotein precursor gene, encoding the (NSm)GnGc polyprotein, into two individual genes encoding either (NSm)Gn or Gc. Using reverse genetics, six viruses with a segmented glycoprotein precursor gene were rescued, varying from a virus comprising two S-type segments in the absence of an M-type segment to a virus consisting of four segments (RVFV-4s), of which three are M-type. Despite that all virus variants were able to grow in mammalian cell lines, they were unable to spread efficiently in cells of mosquito origin. Moreover, in vivo studies demonstrated that RVFV-4s is unable to cause disseminated infection and disease in mice, even in the presence of the main virulence factor NSs, but induced a protective immune response against a lethal challenge with wild-type virus. In summary, splitting bunyavirus glycoprotein precursor genes provides new opportunities to study bunyavirus genome packaging and offers new methods to develop next-generation live-attenuated bunyavirus vaccines. Rift Valley fever virus (RVFV) causes devastating disease in ruminants and occasionally humans. Virions capable of productive infection comprise at least one copy of the small (S), medium (M), and large (L) RNA genome segments. The M segment encodes a glycoprotein precursor (GPC) protein that is cotranslationally cleaved into Gn and Gc, which are required for virus entry and

  6. Kullback Leibler divergence in complete bacterial and phage genomes

    PubMed Central

    Akhter, Sajia; Kashef, Mona T.; Ibrahim, Eslam S.; Bailey, Barbara

    2017-01-01

    The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback–Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses. PMID:29204318

  7. Kullback Leibler divergence in complete bacterial and phage genomes.

    PubMed

    Akhter, Sajia; Aziz, Ramy K; Kashef, Mona T; Ibrahim, Eslam S; Bailey, Barbara; Edwards, Robert A

    2017-01-01

    The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback-Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses.
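    The metric described in this record can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the pseudocount, the log base (bits), and the use of an unweighted mean over genomes are all assumptions made here for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_frequencies(proteins, pseudocount=1.0):
    """Relative amino acid frequencies over all proteins of one genome.

    The pseudocount (an assumption of this sketch) keeps every frequency
    positive so the logarithm below is always defined.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(p[a] * math.log2(p[a] / q[a]) for a in AMINO_ACIDS if p[a] > 0)


def divergence_from_mean(genomes):
    """Map genome name -> KL divergence from the mean amino acid profile.

    `genomes` maps a genome name to a list of protein sequences. The mean
    profile is the unweighted average of the per-genome profiles.
    """
    freqs = {name: aa_frequencies(prots) for name, prots in genomes.items()}
    mean = {a: sum(f[a] for f in freqs.values()) / len(freqs) for a in AMINO_ACIDS}
    return {name: kl_divergence(f, mean) for name, f in freqs.items()}
```

    A genome whose proteins use amino acids very unevenly scores higher than one close to the average profile, which is the kind of skew the abstract associates with endosymbionts and parasites.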

  8. Heuristic Bayesian segmentation for discovery of coexpressed genes within genomic regions.

    PubMed

    Pehkonen, Petri; Wong, Garry; Törönen, Petri

    2010-01-01

    Segmentation aims to separate homogeneous areas from the sequential data, and plays a central role in data mining. It has applications ranging from finance to molecular biology, where bioinformatics tasks such as genome data analysis are active application fields. In this paper, we present a novel application of segmentation in locating genomic regions with coexpressed genes. We aim at automated discovery of such regions without requirement for user-given parameters. In order to perform the segmentation within a reasonable time, we use heuristics. Most of the heuristic segmentation algorithms require some decision on the number of segments. This is usually accomplished by using asymptotic model selection methods like the Bayesian information criterion. Such methods are based on some simplification, which can limit their usage. In this paper, we propose a Bayesian model selection to choose the most proper result from heuristic segmentation. Our Bayesian model presents a simple prior for the segmentation solutions with various segment numbers and a modified Dirichlet prior for modeling multinomial data. We show with various artificial data sets in our benchmark system that our model selection criterion has the best overall performance. The application of our method in yeast cell-cycle gene expression data reveals potential active and passive regions of the genome.

  9. Use of Optical Mapping in Bacterial Genome Finishing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kumar, Dibyendu

    2010-06-03

    Dibyendu Kumar from the University of Florida discusses whole-genome optical mapping to help validate bacterial genome assemblies on June 3, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  10. Microbial minimalism: genome reduction in bacterial pathogens.

    PubMed

    Moran, Nancy A

    2002-03-08

    When bacterial lineages make the transition from free-living or facultatively parasitic life cycles to permanent associations with hosts, they undergo a major loss of genes and DNA. Complete genome sequences are providing an understanding of how extreme genome reduction affects evolutionary directions and metabolic capabilities of obligate pathogens and symbionts.

  11. A world without bacterial meningitis: how genomic epidemiology can inform vaccination strategy.

    PubMed

    Rodrigues, Charlene M C; Maiden, Martin C J

    2018-01-01

    Bacterial meningitis remains an important cause of global morbidity and mortality. Although effective vaccinations exist and are being increasingly used worldwide, bacterial diversity threatens their impact and the ultimate goal of eliminating the disease. Through genomic epidemiology, we can appreciate bacterial population structure and its consequences for transmission dynamics, virulence, antimicrobial resistance, and development of new vaccines. Here, we review what we have learned through genomic epidemiological studies, following the rapid implementation of whole genome sequencing that can help to optimise preventative strategies for bacterial meningitis.

  12. Assignment of simian rotavirus SA11 temperature-sensitive mutant groups B and E to genome segments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gombold, J.L.; Estes, M.K.; Ramig, R.F.

    1985-05-01

    Recombinant (reassortant) viruses were selected from crosses between temperature-sensitive (ts) mutants of simian rotavirus SA11 and wild-type human rotavirus Wa. The double-stranded genome RNAs of the reassortants were examined by electrophoresis in Tris-glycine-buffered polyacrylamide gels and by dot hybridization with a cloned DNA probe for genome segment 2. Analysis of replacements of genome segments in the reassortants allowed construction of a map correlating genome segments providing functions interchangeable between SA11 and Wa. The reassortants revealed a functional correspondence in order of increasing electrophoretic mobility of genome segments. Analysis of the parental origin of genome segments in ts+ SA11/Wa reassortants derived from the crosses SA11 tsB(339) X Wa and SA11 tsE(1400) X Wa revealed that the group B lesion of tsB(339) was located on genome segment 3 and the group E lesion of tsE(1400) was on segment 8.

  13. Comparative Genomic Analyses of the Bacterial Phosphotransferase System

    PubMed Central

    Barabote, Ravi D.; Saier, Milton H.

    2005-01-01

    We report analyses of 202 fully sequenced genomes for homologues of known protein constituents of the bacterial phosphoenolpyruvate-dependent phosphotransferase system (PTS). These included 174 bacterial, 19 archaeal, and 9 eukaryotic genomes. Homologues of PTS proteins were not identified in archaea or eukaryotes, showing that the horizontal transfer of genes encoding PTS proteins has not occurred between the three domains of life. Of the 174 bacterial genomes (136 bacterial species) analyzed, 30 diverse species have no PTS homologues, and 29 species have cytoplasmic PTS phosphoryl transfer protein homologues but lack recognizable PTS permeases. These soluble homologues presumably function in regulation. The remaining 77 species possess all PTS proteins required for the transport and phosphorylation of at least one sugar via the PTS. Up to 3.2% of the genes in a bacterium encode PTS proteins. These homologues were analyzed for family association, range of protein types, domain organization, and organismal distribution. Different strains of a single bacterial species often possess strikingly different complements of PTS proteins. Types of PTS protein domain fusions were analyzed, showing that certain types of domain fusions are common, while others are rare or prohibited. Select PTS proteins were analyzed from different phylogenetic standpoints, showing that PTS protein phylogeny often differs from organismal phylogeny. The results document the frequent gain and loss of PTS protein-encoding genes and suggest that the lateral transfer of these genes within the bacterial domain has played an important role in bacterial evolution. Our studies provide insight into the development of complex multicomponent enzyme systems and lead to predictions regarding the types of protein-protein interactions that promote efficient PTS-mediated phosphoryl transfer. PMID:16339738

  14. Bacterial genome reduction using the progressive clustering of deletions via yeast sexual cycling

    DOE PAGES

    Suzuki, Yo; Assad-Garcia, Nacyra; Kostylev, Maxim; ...

    2015-02-05

    The availability of genetically tractable organisms with simple genomes is critical for the rapid, systems-level understanding of basic biological processes. Mycoplasma bacteria, with the smallest known genomes among free-living cellular organisms, are ideal models for this purpose, but the natural versions of these cells have genome complexities still too great to offer a comprehensive view of a fundamental life form. Here in this paper we describe an efficient method for reducing genomes from these organisms by identifying individually deletable regions using transposon mutagenesis and progressively clustering deleted genomic segments using meiotic recombination between the bacterial genomes harbored in yeast. Mycoplasmal genomes subjected to this process and transplanted into recipient cells yielded two mycoplasma strains. The first simultaneously lacked eight singly deletable regions of the genome, representing a total of 91 genes and ~10% of the original genome. The second strain lacked seven of the eight regions, representing 84 genes. Growth assay data revealed an absence of genetic interactions among the 91 genes under tested conditions. Despite predicted effects of the deletions on sugar metabolism and the proteome, growth rates were unaffected by the gene deletions in the seven-deletion strain. These results support the feasibility of using single-gene disruption data to design and construct viable genomes lacking multiple genes, paving the way toward genome minimization. The progressive clustering method is expected to be effective for the reorganization of any mega-sized DNA molecules cloned in yeast, facilitating the construction of designer genomes in microbes as well as genomic fragments for genetic engineering of higher eukaryotes.

  15. Bacterial genome reduction using the progressive clustering of deletions via yeast sexual cycling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Suzuki, Yo; Assad-Garcia, Nacyra; Kostylev, Maxim

    The availability of genetically tractable organisms with simple genomes is critical for the rapid, systems-level understanding of basic biological processes. Mycoplasma bacteria, with the smallest known genomes among free-living cellular organisms, are ideal models for this purpose, but the natural versions of these cells have genome complexities still too great to offer a comprehensive view of a fundamental life form. Here in this paper we describe an efficient method for reducing genomes from these organisms by identifying individually deletable regions using transposon mutagenesis and progressively clustering deleted genomic segments using meiotic recombination between the bacterial genomes harbored in yeast. Mycoplasmal genomes subjected to this process and transplanted into recipient cells yielded two mycoplasma strains. The first simultaneously lacked eight singly deletable regions of the genome, representing a total of 91 genes and ~10% of the original genome. The second strain lacked seven of the eight regions, representing 84 genes. Growth assay data revealed an absence of genetic interactions among the 91 genes under tested conditions. Despite predicted effects of the deletions on sugar metabolism and the proteome, growth rates were unaffected by the gene deletions in the seven-deletion strain. These results support the feasibility of using single-gene disruption data to design and construct viable genomes lacking multiple genes, paving the way toward genome minimization. The progressive clustering method is expected to be effective for the reorganization of any mega-sized DNA molecules cloned in yeast, facilitating the construction of designer genomes in microbes as well as genomic fragments for genetic engineering of higher eukaryotes.

  16. MIPS bacterial genomes functional annotation benchmark dataset.

    PubMed

    Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Werner

    2005-05-15

    Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab

  17. Structural constraints in the packaging of bluetongue virus genomic segments

    PubMed Central

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C.

    2014-01-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by biochemical data analysis suggested that a conformational motif formed by interaction of the 5′ and 3′ ends of the molecule was necessary and sufficient for packaging. A similar structural signal was also identified in S8 of BTV serotype 1. Furthermore, the same conformational analysis of secondary structures for positive-sense ssRNAs was used to generate a chimeric segment that maintained the putative packaging motif but contained unrelated internal sequences. This chimeric segment was packaged successfully, confirming that the motif identified directs the correct packaging of the segment. PMID:24980574

  18. Sequence analysis of the PIP5K locus in Eimeria maxima provides further evidence for eimerian genome plasticity and segmental organization.

    PubMed

    Song, B K; Pan, M Z; Lau, Y L; Wan, K L

    2014-07-29

    Commercial flocks infected by Eimeria species parasites, including Eimeria maxima, have an increased risk of developing clinical or subclinical coccidiosis, an intestinal enteritis associated with increased mortality rates in poultry. Currently, infection control is largely based on chemotherapy or live vaccines; however, drug resistance is common and vaccines are relatively expensive. The development of new cost-effective intervention measures will benefit from unraveling the complex genetic mechanisms that underlie host-parasite interactions, including the identification and characterization of genes encoding proteins such as phosphatidylinositol 4-phosphate 5-kinase (PIP5K). We previously identified a PIP5K coding sequence within the E. maxima genome. In this study, we analyzed two bacterial artificial chromosome clones presenting a ~145-kb E. maxima (Weybridge strain) genomic region spanning the PIP5K gene locus. Sequence analysis revealed that ~95% of the simple sequence repeats detected were located within regions comparable to the previously described feature-rich segments of the Eimeria tenella genome. Comparative sequence analysis with the orthologous E. maxima (Houghton strain) region revealed a moderate level of conserved synteny. Unique segmental organizations and telomere-like repeats were also observed in both genomes. A number of incomplete transposable elements were detected and further scrutiny of these elements in both orthologous segments revealed interesting nesting events, which may play a role in facilitating genome plasticity in E. maxima. The current analysis provides more detailed information about the genome organization of E. maxima and may help to reveal genotypic differences that are important for expression of traits related to pathogenicity and virulence.

  19. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    PubMed Central

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  20. Xylella genomics and bacterial pathogenicity to plants.

    PubMed

    Dow, J M; Daniels, M J

    2000-12-01

    Xylella fastidiosa, a pathogen of citrus, is the first plant pathogenic bacterium for which the complete genome sequence has been published. Inspection of the sequence reveals high relatedness to many genes of other pathogens, notably Xanthomonas campestris. Based on this, we suggest that Xylella possesses certain easily testable properties that contribute to pathogenicity. We also present some general considerations for deriving information on pathogenicity from bacterial genomics. Copyright 2000 John Wiley & Sons, Ltd.

  1. Comparing genomes with rearrangements and segmental duplications.

    PubMed

    Shao, Mingfu; Moret, Bernard M E

    2015-06-15

    Large-scale evolutionary events such as genomic rearrangements and segmental duplications form an important part of the evolution of genomes and are widely studied from both biological and computational perspectives. A basic computational problem is to infer these events in the evolutionary history for given modern genomes, a task for which many algorithms have been proposed under various constraints. Algorithms that can handle both rearrangements and content-modifying events such as duplications and losses remain few and limited in their applicability. We study the comparison of two genomes under a model including general rearrangements (through double-cut-and-join) and segmental duplications. We formulate the comparison as an optimization problem and describe an exact algorithm to solve it by using an integer linear program. We also devise a sufficient condition and an efficient algorithm to identify optimal substructures, which can simplify the problem while preserving optimality. Using the optimal substructures with the integer linear program (ILP) formulation yields a practical and exact algorithm to solve the problem. We then apply our algorithm to assign in-paralogs and orthologs (a necessary step in handling duplications) and compare its performance with that of the state-of-the-art method MSOAR, using both simulations and real data. On simulated datasets, our method outperforms MSOAR by a significant margin, and on five well-annotated species, MSOAR achieves high accuracy, yet our method performs slightly better on each of the 10 pairwise comparisons. http://lcbb.epfl.ch/softwares/coser. © The Author 2015. Published by Oxford University Press.

  2. Microplitis demolitor bracovirus genome segments vary in abundance and are individually packaged in virions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beck, Markus H.; Inman, Ross B.; Strand, Michael R.

    2007-03-01

    Polydnaviruses (PDVs) are distinguished by their unique association with parasitoid wasps and their segmented, double-stranded (ds) DNA genomes that are non-equimolar in abundance. Relatively little is actually known, however, about genome packaging or segment abundance of these viruses. Here, we conducted electron microscopy (EM) and real-time polymerase chain reaction (PCR) studies to characterize packaging and segment abundance of Microplitis demolitor bracovirus (MdBV). Like other PDVs, MdBV replicates in the ovaries of females where virions accumulate to form a suspension called calyx fluid. Wasps then inject a quantity of calyx fluid when ovipositing into hosts. The MdBV genome consists of 15 segments that range from 3.6 (segment A) to 34.3 kb (segment O). EM analysis indicated that MdBV virions contain a single nucleocapsid that encapsidates one circular DNA of variable size. We developed a semi-quantitative real-time PCR assay using SYBR Green I. This assay indicated that five (J, O, H, N and B) segments of the MdBV genome accounted for more than 60% of the viral DNAs in calyx fluid. Estimates of relative segment abundance using our real-time PCR assay were also very similar to DNA size distributions determined from micrographs. Analysis of parasitized Pseudoplusia includens larvae indicated that copy number of MdBV segments C, B and J varied between hosts but their relative abundance within a host was virtually identical to their abundance in calyx fluid. Among-tissue assays indicated that each viral segment was most abundant in hemocytes and least abundant in salivary glands. However, the relative abundance of each segment to one another was similar in all tissues. We also found no clear relationship between MdBV segment and transcript abundance in hemocytes and fat body.

  3. Defining the Estimated Core Genome of Bacterial Populations Using a Bayesian Decision Model

    PubMed Central

    van Tonder, Andries J.; Mistry, Shilan; Bray, James E.; Hill, Dorothea M. C.; Cody, Alison J.; Farmer, Chris L.; Klugman, Keith P.; von Gottberg, Anne; Bentley, Stephen D.; Parkhill, Julian; Jolley, Keith A.; Maiden, Martin C. J.; Brueggemann, Angela B.

    2014-01-01

    The bacterial core genome is of intense interest and the volume of whole genome sequence data in the public domain available to investigate it has increased dramatically. The aim of our study was to develop a model to estimate the bacterial core genome from next-generation whole genome sequencing data and use this model to identify novel genes associated with important biological functions. Five bacterial datasets were analysed, comprising 2096 genomes in total. We developed a Bayesian decision model to estimate the number of core genes, calculated pairwise evolutionary distances (p-distances) based on nucleotide sequence diversity, and plotted the median p-distance for each core gene relative to its genome location. We designed visually-informative genome diagrams to depict areas of interest in genomes. Case studies demonstrated how the model could identify areas for further study, e.g. 25% of the core genes with higher sequence diversity in the Campylobacter jejuni and Neisseria meningitidis genomes encoded hypothetical proteins. The core gene with the highest p-distance value in C. jejuni was annotated in the reference genome as a putative hydrolase, but further work revealed that it shared sequence homology with beta-lactamase/metallo-beta-lactamases (enzymes that provide resistance to a range of broad-spectrum antibiotics) and thioredoxin reductase genes (which reduce oxidative stress and are essential for DNA replication) in other C. jejuni genomes. Our Bayesian model of estimating the core genome is principled, easy to use and can be applied to large genome datasets. This study also highlighted the lack of knowledge currently available for many core genes in bacterial genomes of significant global public health importance. PMID:25144616
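
    The p-distance step described in this abstract can be illustrated with a minimal sketch. It assumes that the alleles of one core gene are already aligned to equal length; the function names and the toy alleles are hypothetical and are not taken from the study, whose Bayesian decision model is not reproduced here.

```python
from itertools import combinations
from statistics import median


def p_distance(seq_a, seq_b):
    """Proportion of compared aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: sites with a gap or an ambiguous base in either
    # sequence are skipped rather than counted as differences.
    compared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not compared:
        raise ValueError("no comparable sites")
    differences = sum(a != b for a, b in compared)
    return differences / len(compared)


def median_p_distance(alleles):
    """Median pairwise p-distance over all pairs of alleles of one gene."""
    return median(p_distance(a, b) for a, b in combinations(alleles, 2))


# Hypothetical aligned alleles of a single core gene.
toy_alleles = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
toy_median = median_p_distance(toy_alleles)
```

    Plotting one such median per core gene against the gene's position in a reference genome would give the kind of diversity-by-location view the abstract mentions.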

  4. Characterizing the bacterial microbiota in different gastrointestinal tract segments of the Bactrian camel.

    PubMed

    He, Jing; Yi, Li; Hai, Le; Ming, Liang; Gao, Wanting; Ji, Rimutu

    2018-01-12

    The bacterial community plays important roles in the gastrointestinal tracts (GITs) of animals. However, our understanding of the microbial communities in the GIT of Bactrian camels remains limited. Here, we describe the bacterial communities from eight different GIT segments (rumen, reticulum, abomasum, duodenum, ileum, jejunum, caecum, colon) and faeces determined from 11 Bactrian camels using 16S rRNA gene amplicon sequencing. Twenty-seven bacterial phyla were found in the GIT, with Firmicutes, Verrucomicrobia and Bacteroidetes predominating. However, there were significant differences in microbial community composition between segments of the GIT. In particular, a greater proportion of Akkermansia and Unclassified Ruminococcaceae were found in the large intestine and faecal samples, while more Unclassified Clostridiales and Unclassified Bacteroidales were present in the forestomach and small intestine. Comparative analysis of the microbiota from different GIT segments revealed that the microbial profile in the large intestine was like that in faeces. We also predicted the metagenomic profiles for the different GIT regions. In the forestomach, there was enrichment associated with replication and repair and amino acid metabolism, while carbohydrate metabolism was enriched in the large intestine and faeces. These results provide profound insights into the GIT microbiota of Bactrian camels.

  5. Comparative genomics of the marine bacterial genus Glaciecola reveals the high degree of genomic diversity and genomic characteristic for cold adaptation.

    PubMed

    Qin, Qi-Long; Xie, Bin-Bin; Yu, Yong; Shu, Yan-Li; Rong, Jin-Cheng; Zhang, Yan-Jiao; Zhao, Dian-Li; Chen, Xiu-Lan; Zhang, Xi-Ying; Chen, Bo; Zhou, Bai-Cheng; Zhang, Yu-Zhong

    2014-06-01

    To what extent the genomes of different species belonging to one genus can be diverse, and the relationship between genomic differentiation and environmental factors, remain unclear for oceanic bacteria. With many new bacterial genera and species being isolated from marine environments, this question warrants attention. In this study, we sequenced all the type strains of the published species of Glaciecola, a recently defined cold-adapted genus with species from diverse marine locations, to study the genomic diversity and cold-adaptation strategy in this genus. The genome size diverged widely from 3.08 to 5.96 Mb, which can be explained by massive gene gain and loss events. Horizontal gene transfer and new gene emergence contributed substantially to the genome size expansion. The genus Glaciecola had an open pan-genome. Comparative genomic research indicated that species of the genus Glaciecola had high diversity in genome size, gene content and genetic relatedness. This may be prevalent in marine bacterial genera considering the dynamic and complex environments of the ocean. Species of Glaciecola had some common genomic features related to cold adaptation, which enable them to thrive and play a role in biogeochemical cycles in cold marine environments.

  6. Identification of a precursor genomic segment that provided a sequence unique to glycophorin B and E genes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Onda, M.; Kudo, S.; Fukuda, M.

    Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located at the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE genes arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.

  7. Phages and the Evolution of Bacterial Pathogens: from Genomic Rearrangements to Lysogenic Conversion

    PubMed Central

    Brüssow, Harald; Canchaya, Carlos; Hardt, Wolf-Dietrich

    2004-01-01

    Comparative genomics demonstrated that the chromosomes from bacteria and their viruses (bacteriophages) are coevolving. This process is most evident for bacterial pathogens where the majority contain prophages or phage remnants integrated into the bacterial DNA. Many prophages from bacterial pathogens encode virulence factors. Two situations can be distinguished: Vibrio cholerae, Shiga toxin-producing Escherichia coli, Corynebacterium diphtheriae, and Clostridium botulinum depend on a specific prophage-encoded toxin for causing a specific disease, whereas Staphylococcus aureus, Streptococcus pyogenes, and Salmonella enterica serovar Typhimurium harbor a multitude of prophages and each phage-encoded virulence or fitness factor makes an incremental contribution to the fitness of the lysogen. These prophages behave like “swarms” of related prophages. Prophage diversification seems to be fueled by the frequent transfer of phage material by recombination with superinfecting phages, resident prophages, or occasional acquisition of other mobile DNA elements or bacterial chromosomal genes. Prophages also contribute to the diversification of the bacterial genome architecture. In many cases, they actually represent a large fraction of the strain-specific DNA sequences. In addition, they can serve as anchoring points for genome inversions. The current review presents the available genomics and biological data on prophages from bacterial pathogens in an evolutionary framework. PMID:15353570

  8. One Bacterial Cell, One Complete Genome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200 to 900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  9. Bacterial genome engineering and synthetic biology: combating pathogens.

    PubMed

    Krishnamurthy, Malathy; Moore, Richard T; Rajamani, Sathish; Panchal, Rekha G

    2016-11-04

    The emergence and prevalence of multidrug resistant (MDR) pathogenic bacteria poses a serious threat to human and animal health globally. Nosocomial infections and common ailments such as pneumonia, wound, urinary tract, and bloodstream infections are becoming more challenging to treat due to the rapid spread of MDR pathogenic bacteria. According to recent reports by the World Health Organization (WHO) and Centers for Disease Control and Prevention (CDC), there is an unprecedented increase in the occurrence of MDR infections worldwide. The rise in these infections has generated an economic strain worldwide, prompting the WHO to endorse a global action plan to improve awareness and understanding of antimicrobial resistance. This health crisis necessitates an immediate action to target the underlying mechanisms of drug resistance in bacteria. The advent of new bacterial genome engineering and synthetic biology (SB) tools is providing promising diagnostic and treatment plans to monitor and treat widespread recalcitrant bacterial infections. Key advances in genetic engineering approaches can successfully aid in targeting and editing pathogenic bacterial genomes for understanding and mitigating drug resistance mechanisms. In this review, we discuss the application of specific genome engineering and SB methods such as recombineering, clustered regularly interspaced short palindromic repeats (CRISPR), and bacterial cell-cell signaling mechanisms for pathogen targeting. The utility of these tools in developing antibacterial strategies such as novel antibiotic production, phage therapy, diagnostics and vaccine production to name a few, are also highlighted. The prevalent use of antibiotics and the spread of MDR bacteria raise the prospect of a post-antibiotic era, which underscores the need for developing novel therapeutics to target MDR pathogens. The development of enabling SB technologies offers promising solutions to deliver safe and effective antibacterial therapies.

  10. Computational Analysis of Uncharacterized Proteins of Environmental Bacterial Genome

    NASA Astrophysics Data System (ADS)

    Coxe, K. J.; Kumar, M.

    2017-12-01

    Betaproteobacteria strain CB is a gram-negative bacterium in the phylum Proteobacteria that is found naturally in soil and water. In this complex environment, bacteria play a key role in efficiently eliminating organic material and other pollutants from wastewater. To investigate the process of pollutant removal from wastewater using bacteria, it is important to characterize the proteins encoded by the bacterial genome. Our study combines a number of bioinformatics tools to predict the function of unassigned proteins in the bacterial genome. The genome of Betaproteobacteria strain CB encodes 2,112 proteins, of which 508 have unknown function and are termed uncharacterized proteins (UPs). The localization of the UPs within the cell was determined and the structure of 38 UPs was accurately predicted. These UPs were predicted to belong to various classes of proteins such as enzymes, transporters, binding proteins, signal peptides, transmembrane proteins and other proteins. The outcome of this work will help to better understand the mechanisms of wastewater treatment.

  11. IonGAP: integrative bacterial genome analysis for Ion Torrent sequence data.

    PubMed

    Baez-Ortega, Adrian; Lorenzo-Diaz, Fabian; Hernandez, Mariano; Gonzalez-Vila, Carlos Ignacio; Roda-Garcia, Jose Luis; Colebrook, Marcos; Flores, Carlos

    2015-09-01

    We introduce IonGAP, a publicly available Web platform designed for the analysis of whole bacterial genomes using Ion Torrent sequence data. Besides assembly, it integrates a variety of comparative genomics, annotation and bacterial classification routines, based on the widely used FASTQ, BAM and SRA file formats. Benchmarking with different datasets evidenced that IonGAP is a fast, powerful and simple-to-use bioinformatics tool. By releasing this platform, we aim to translate low-cost bacterial genome analysis for microbiological prevention and control in healthcare, agroalimentary and pharmaceutical industry applications. IonGAP is hosted by the ITER's Teide-HPC supercomputer and is freely available on the Web for non-commercial use at http://iongap.hpc.iter.es. Contact: mcolesan@ull.edu.es or cflores@ull.edu.es. Supplementary data are available at Bioinformatics online.

  12. Segmental duplications: evolution and impact among the current Lepidoptera genomes.

    PubMed

    Zhao, Qian; Ma, Dongna; Vasseur, Liette; You, Minsheng

    2017-07-06

    Structural variation among genomes is now viewed to be as important as single nucleotide polymorphisms in influencing the phenotype and evolution of a species. Segmental duplication (SD) is defined as segments of DNA with homologous sequence. Here, we performed a systematic analysis of segmental duplications (SDs) among five lepidopteran reference genomes (Plutella xylostella, Danaus plexippus, Bombyx mori, Manduca sexta and Heliconius melpomene) to understand their potential impact on the evolution of these species. We find that the SD content differed substantially among species, ranging from 1.2% of the genome in B. mori to 15.2% in H. melpomene. Most SDs formed very high identity (similarity higher than 90%) blocks but had very few large blocks. Comparative analysis showed that most of the SDs arose after the divergence of each lineage and we found that P. xylostella and H. melpomene showed more duplications than other species, suggesting they might be able to tolerate extensive levels of variation in their genomes. Conserved ancestral and species-specific SD events were assessed, revealing multiple examples of the gain, loss or maintenance of SDs over time. SD content analysis showed that most of the genes embedded in SD regions belonged to species-specific SDs ("Unique" SDs). Functional analysis of these genes suggested their potential roles in lineage-specific evolution. SDs and flanking regions often contained transposable elements (TEs) and this association suggested some involvement in SD formation. Further comparison of gene expression levels between SDs and non-SDs showed that the expression level of genes embedded in SDs was significantly lower, suggesting that structural changes in the genomes are involved in gene expression differences in species. The results showed that most of the SDs were "unique SDs", which originated after species formation. Functional analysis suggested that SDs might play different roles in different species.

  13. Segmenting the human genome based on states of neutral genetic divergence.

    PubMed

    Kuruppumullage Don, Prabhani; Ananda, Guruprasad; Chiaromonte, Francesca; Makova, Kateryna D

    2013-09-03

    Many studies have demonstrated that divergence levels generated by different mutation types vary and covary across the human genome. To improve our still-incomplete understanding of the mechanistic basis of this phenomenon, we analyze several mutation types simultaneously, anchoring their variation to specific regions of the genome. Using hidden Markov models on insertion, deletion, nucleotide substitution, and microsatellite divergence estimates inferred from human-orangutan alignments of neutrally evolving genomic sequences, we segment the human genome into regions corresponding to different divergence states--each uniquely characterized by specific combinations of divergence levels. We then parsed the mutagenic contributions of various biochemical processes associating divergence states with a broad range of genomic landscape features. We find that high divergence states inhabit guanine- and cytosine (GC)-rich, highly recombining subtelomeric regions; low divergence states cover inner parts of autosomes; chromosome X forms its own state with lowest divergence; and a state of elevated microsatellite mutability is interspersed across the genome. These general trends are mirrored in human diversity data from the 1000 Genomes Project, and departures from them highlight the evolutionary history of primate chromosomes. We also find that genes and noncoding functional marks [annotations from the Encyclopedia of DNA Elements (ENCODE)] are concentrated in high divergence states. Our results provide a powerful tool for biomedical data analysis: segmentations can be used to screen personal genome variants--including those associated with cancer and other diseases--and to improve computational predictions of noncoding functional elements.
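
    The idea of segmenting a genome into hidden states can be illustrated with a toy example. The sketch below is a plain Viterbi decoder for a two-state hidden Markov model over discretized windows; it is not the multivariate model used in the study, and the state names, symbols and probabilities are hypothetical values chosen only for illustration. It assumes a non-empty observation sequence and strictly positive probabilities.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a discrete HMM (log space)."""
    # best[t][s]: log-probability of the best path ending in state s at t.
    best = [{
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            prev = max(
                states,
                key=lambda p: best[-1][p] + math.log(trans_p[p][s]),
            )
            scores[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][obs]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical model: windows are labelled "l" (low) or "h" (high divergence).
STATES = ("low", "high")
START = {"low": 0.5, "high": 0.5}
TRANS = {"low": {"low": 0.9, "high": 0.1},
         "high": {"low": 0.1, "high": 0.9}}
EMIT = {"low": {"l": 0.8, "h": 0.2},
        "high": {"l": 0.2, "h": 0.8}}

segmentation = viterbi("lllhhh", STATES, START, TRANS, EMIT)
```

    With these toy parameters a sustained run of high-divergence windows is assigned to the "high" state, whereas a single isolated high window is absorbed into the surrounding "low" segment, because switching states twice costs more than one unlikely emission.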

  14. The Influenza A Virus PB2, PA, NP, and M Segments Play a Pivotal Role during Genome Packaging

    PubMed Central

    Gao, Qinshan; Chou, Yi-Ying; Doğanay, Sultan; Vafabakhsh, Reza; Ha, Taekjip

    2012-01-01

    The genomes of influenza A viruses consist of eight negative-strand RNA segments. Recent studies suggest that influenza viruses are able to specifically package their segmented genomes into the progeny virions. Segment-specific packaging signals of influenza virus RNAs (vRNAs) are located in the 5′ and 3′ noncoding regions, as well as in the terminal regions, of the open reading frames. How these packaging signals function during genome packaging remains unclear. Previously, we generated a 7-segmented virus in which the hemagglutinin (HA) and neuraminidase (NA) segments of the influenza A/Puerto Rico/8/34 virus were replaced by a chimeric influenza C virus hemagglutinin/esterase/fusion (HEF) segment carrying the HA packaging sequences. The robust growth of the HEF virus suggested that the NA segment is not required for the packaging of other segments. In this study, in order to determine the roles of the other seven segments during influenza A virus genome assembly, we continued to use this HEF virus as a tool and analyzed the effects of replacing the packaging sequences of other segments with those of the NA segment. Our results showed that deleting the packaging signals of the PB1, HA, or NS segment had no effect on the growth of the HEF virus, while growth was greatly impaired when the packaging sequence of the PB2, PA, nucleoprotein (NP), or matrix (M) segment was removed. These results indicate that the PB2, PA, NP, and M segments play a more important role than the remaining four vRNAs during the genome-packaging process. PMID:22532680

  15. Sugar Lego: gene composition of bacterial carbohydrate metabolism genomic loci.

    PubMed

    Kaznadzey, Anna; Shelyakin, Pavel; Gelfand, Mikhail S

    2017-11-25

    Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to the same metabolic pathway are often co-localized in the chromosome, but it is not a strict rule. Gene co-localization is linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to carbohydrate metabolism. We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized in bacterial genomes with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class ranges from zero to eight, demonstrating varying preferences of respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. Moreover, in most cases such co-localization does not originate from local duplication events. Overall, we describe a complex web formed by evolutionary relationships of bacterial

  16. Segmentation, Splitting, and Classification of Overlapping Bacteria in Microscope Images for Automatic Bacterial Vaginosis Diagnosis.

    PubMed

    Song, Youyi; He, Liang; Zhou, Feng; Chen, Siping; Ni, Dong; Lei, Baiying; Wang, Tianfu

    2017-07-01

    Quantitative analysis of bacterial morphotypes in microscope images plays a vital role in the diagnosis of bacterial vaginosis (BV) based on the Nugent score criterion. However, there are two main challenges for this task: 1) It is difficult to identify the bacterial regions because of their varied appearance, faint boundaries, heterogeneous shapes, low contrast with the background, and small size relative to the image. 2) Numerous bacteria overlap each other, which hinders accurate analysis of individual bacteria. To overcome these challenges, we propose an automatic method to diagnose BV by quantitative analysis of bacterial morphotypes, which consists of three steps: bacterial region segmentation, overlapping bacteria splitting, and bacterial morphotype classification. Specifically, we first segment the bacterial regions via saliency cut, which simultaneously evaluates the global contrast and spatial weighted coherence. A Markov random field model is then applied for high-quality unsupervised segmentation of small objects. We then decompose overlapping bacteria clumps into markers, and associate a pixel with markers to identify evidence for eventual individual bacterium splitting. Next, we extract morphotype features from each bacterium to learn the descriptors and to characterize the types of bacteria using an Adaptive Boosting machine learning framework. Finally, BV diagnosis is implemented based on the Nugent score criterion. Experiments demonstrate that our proposed method achieves high accuracy and computational efficiency for BV diagnosis.

  17. Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations

    PubMed Central

    Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A.; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C.; Reimer, Aleisha R.; Graham, Morag R.; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter

    2017-01-01

    The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. PMID:29048594
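
    The general idea of finding sequence content that is common to target genomes and absent from non-targets can be sketched with exact k-mer sets. This is a simplified illustration only, not Neptune's implementation: it ignores mismatches, reverse complements and the probabilistic models mentioned in the abstract, and every name and parameter below (including min_target_fraction) is hypothetical.

```python
def kmers(sequence, k):
    """Set of all length-k substrings of a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(targets, non_targets, k, min_target_fraction=1.0):
    """k-mers found in enough targets and in none of the non-targets."""
    # Count in how many target genomes each k-mer occurs at least once.
    counts = {}
    for genome in targets:
        for kmer in kmers(genome, k):
            counts[kmer] = counts.get(kmer, 0) + 1
    # Collect every k-mer seen in any non-target genome.
    excluded = set()
    for genome in non_targets:
        excluded |= kmers(genome, k)
    needed = min_target_fraction * len(targets)
    return {
        kmer for kmer, n in counts.items()
        if n >= needed and kmer not in excluded
    }


# Toy sequences standing in for genome assemblies.
toy_targets = ["AAACGTTT", "CCACGTGG"]
toy_non_targets = ["AAATTTCC"]
toy_signature = signature_kmers(toy_targets, toy_non_targets, k=4)
```

    Lowering min_target_fraction in this sketch relaxes "present in all targets" to "present in most targets", which loosely mirrors the abstract's notion of sequences being sufficiently common rather than universal.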

  18. Recombination-Driven Genome Evolution and Stability of Bacterial Species.

    PubMed

    Dixit, Purushottam D; Pang, Tin Yau; Maslov, Sergei

    2017-09-01

    While bacteria divide clonally, horizontal gene transfer followed by homologous recombination is now recognized as an important contributor to their evolution. However, the details of how the competition between clonality and recombination shapes genome diversity remain poorly understood. Using a computational model, we find two principal regimes in bacterial evolution and identify two composite parameters that dictate the evolutionary fate of bacterial species. In the divergent regime, characterized by either a low recombination frequency or strict barriers to recombination, cohesion due to recombination is not sufficient to overcome the mutational drift. As a consequence, the divergence between pairs of genomes in the population steadily increases in the course of their evolution. The species lacks genetic coherence with sexually isolated clonal subpopulations continuously formed and dissolved. In contrast, in the metastable regime, characterized by a high recombination frequency combined with low barriers to recombination, genomes continuously recombine with the rest of the population. The population remains genetically cohesive and temporally stable. Notably, the transition between these two regimes can be affected by relatively small changes in evolutionary parameters. Using the Multi Locus Sequence Typing (MLST) data, we classify a number of bacterial species to be either the divergent or the metastable type. Generalizations of our framework to include selection, ecologically structured populations, and horizontal gene transfer of nonhomologous regions are discussed as well.

  19. Creation of a Recombinant Rift Valley Fever Virus with a Two-Segmented Genome

    PubMed Central

    Brennan, Benjamin; Welch, Stephen R.; McLees, Angela; Elliott, Richard M.

    2011-01-01

    Rift Valley fever virus (RVFV; family Bunyaviridae) is a clinically important, mosquito-borne pathogen of both livestock and humans, which is found mainly in sub-Saharan Africa and the Arabian Peninsula. RVFV has a trisegmented single-stranded RNA (ssRNA) genome. The L and M segments are negative sense and encode the L protein (viral polymerase) on the L segment and the virion glycoproteins Gn and Gc as well as two other proteins, NSm and 78K, on the M segment. The S segment uses an ambisense coding strategy to express the nucleocapsid protein, N, and the nonstructural protein, NSs. Both the NSs and NSm proteins are dispensable for virus growth in tissue culture. Using reverse genetics, we generated a recombinant virus, designated r2segMP12, containing a two-segmented genome in which the NSs coding sequence was replaced with that for the Gn and Gc precursor. Thus, r2segMP12 lacks an M segment, and although it was attenuated in comparison to the three-segmented parental virus in both mammalian and insect cell cultures, it was genetically stable over multiple passages. We further show that the virus can stably maintain an M-like RNA segment encoding the enhanced green fluorescent protein gene. The implications of these findings for RVFV genome packaging and the potential to develop multivalent live-attenuated vaccines are discussed. PMID:21795328

  20. Detection and correction of false segmental duplications caused by genome mis-assembly

    PubMed Central

    2010-01-01

    Diploid genomes with divergent chromosomes present special problems for assembly software as two copies of especially polymorphic regions may be mistakenly constructed, creating the appearance of a recent segmental duplication. We developed a method for identifying such false duplications and applied it to four vertebrate genomes. For each genome, we corrected mis-assemblies, improved estimates of the amount of duplicated sequence, and recovered polymorphisms between the sequenced chromosomes. PMID:20219098

  1. Analysis of gene expression levels in individual bacterial cells without image segmentation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kwak, In Hae; Son, Minjun; Hagen, Stephen J., E-mail: sjhagen@ufl.edu

    2012-05-11

    Highlights: We present a method for extracting gene expression data from images of bacterial cells. The method does not employ cell segmentation and does not require high magnification. Fluorescence and phase contrast images of the cells are correlated through the physics of phase contrast. We demonstrate the method by characterizing noisy expression of comX in Streptococcus mutans. Abstract: Studies of stochasticity in gene expression typically make use of fluorescent protein reporters, which permit the measurement of expression levels within individual cells by fluorescence microscopy. Analysis of such microscopy images is almost invariably based on a segmentation algorithm, where the image of a cell or cluster is analyzed mathematically to delineate individual cell boundaries. However segmentation can be ineffective for studying bacterial cells or clusters, especially at lower magnification, where outlines of individual cells are poorly resolved. Here we demonstrate an alternative method for analyzing such images without segmentation. The method employs a comparison between the pixel brightness in phase contrast vs fluorescence microscopy images. By fitting the correlation between phase contrast and fluorescence intensity to a physical model, we obtain well-defined estimates for the different levels of gene expression that are present in the cell or cluster. The method reveals the boundaries of the individual cells, even if the source images lack the resolution to show these boundaries clearly.

  2. Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis

    PubMed Central

    Facey, Paul D.; Méric, Guillaume; Hitchings, Matthew D.; Pachebat, Justin A.; Hegarty, Matt J.; Chen, Xiaorui; Morgan, Laura V.A.; Hoeppner, James E.; Whitten, Miranda M.A.; Kirk, William D.J.; Dyson, Paul J.; Sheppard, Sam K.; Sol, Ricardo Del

    2015-01-01

    Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. PMID:26185096

  3. bcgTree: automatized phylogenetic tree building from bacterial core genomes.

    PubMed

    Ankenbrand, Markus J; Keller, Alexander

    2016-10-01

    The need for multi-gene analyses in scientific fields such as phylogenetics and DNA barcoding has increased in recent years. In particular, these approaches are increasingly important for differentiating bacterial species, where reliance on the standard 16S rDNA marker can result in poor resolution. Additionally, the assembly of bacterial genomes has become a standard task due to advances in next-generation sequencing technologies. We created a bioinformatic pipeline, bcgTree, which uses assembled bacterial genomes, either from databases or from the user's own sequencing results, to reconstruct their phylogenetic history. The pipeline automatically extracts 107 essential single-copy core genes, found in a majority of bacteria, using hidden Markov models and performs a partitioned maximum-likelihood analysis. Here, we describe the workflow of bcgTree and, as a proof-of-concept, its usefulness in resolving the phylogeny of 293 publicly available bacterial strains of the genus Lactobacillus. We also evaluate its performance in both low- and high-level taxonomy test sets. The tool is freely available at GitHub (https://github.com/iimog/bcgTree) and our institutional homepage (http://www.dna-analytics.biozentrum.uni-wuerzburg.de).

  4. Alignment-free detection of horizontal gene transfer between closely related bacterial genomes.

    PubMed

    Domazet-Lošo, Mirjana; Haubold, Bernhard

    2011-09-01

    Bacterial epidemics are often caused by strains that have acquired their increased virulence through horizontal gene transfer. Due to this association with disease, the detection of horizontal gene transfer continues to receive attention from microbiologists and bioinformaticians alike. Most software for detecting transfer events is based on alignments of sets of genes or of entire genomes. But despite great advances in the design of algorithms and computer programs, genome alignment remains computationally challenging. We have therefore developed an alignment-free algorithm for rapidly detecting horizontal gene transfer between closely related bacterial genomes. Our implementation of this algorithm is called alfy for "ALignment Free local homologY" and is freely available from http://guanine.evolbio.mpg.de/alfy/. In this comment we demonstrate the application of alfy to the genomes of Staphylococcus aureus. We also argue that, contrary to popular belief and in spite of increasing computer speed, algorithmic optimization is becoming more, not less, important if genome data continues to accumulate at the present rate.

  5. Chemically synthesized silver nanoparticles as cell lysis agent for bacterial genomic DNA isolation

    NASA Astrophysics Data System (ADS)

    Goswami, Gunajit; Boruah, Himangshu; Gautom, Trishnamoni; Jyoti Hazarika, Dibya; Barooah, Madhumita; Boro, Robin Chandra

    2017-12-01

    Silver nanoparticles (AgNPs) have seen a recent spurt of use in varied fields of science. In this paper, we showed a novel application of AgNP as a promising microbial cell-lysis agent for genomic DNA isolation. We utilized chemically synthesized AgNPs for lysing bacterial cells to isolate their genomic DNA. The AgNPs efficiently lysed bacterial cells to yield good quality DNA that could be subsequently used for several molecular biology works.

  6. Bacterial genomes in epidemiology—present and future

    PubMed Central

    Croucher, Nicholas J.; Harris, Simon R.; Grad, Yonatan H.; Hanage, William P.

    2013-01-01

    Sequence data are well established in the reconstruction of the phylogenetic and demographic scenarios that have given rise to outbreaks of viral pathogens. The application of similar methods to bacteria has been hindered in the main by the lack of high-resolution nucleotide sequence data from quality samples. Developing and already available genomic methods have greatly increased the amount of data that can be used to characterize an isolate and its relationship to others. However, differences in sequencing platforms and data analysis mean that these enhanced data come with a cost in terms of portability: results from one laboratory may not be directly comparable with those from another. Moreover, genomic data for many bacteria bear the mark of a history including extensive recombination, which has the potential to greatly confound phylogenetic and coalescent analyses. Here, we discuss the exacting requirements of genomic epidemiology, and means by which the distorting signal of recombination can be minimized to permit the leverage of growing datasets of genomic data from bacterial pathogens. PMID:23382424

  7. The FUN of identifying gene function in bacterial pathogens; insights from Salmonella functional genomics.

    PubMed

    Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D

    2013-10-01

    The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.

  8. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    PubMed Central

    Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-01-01

    While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490

  9. Modeling the integration of bacterial rRNA fragments into the human cancer genome.

    PubMed

    Sieber, Karsten B; Gajer, Pawel; Dunning Hotopp, Julie C

    2016-03-21

    Cancer is a disease driven by the accumulation of genomic alterations, including the integration of exogenous DNA into the human somatic genome. We previously identified in silico evidence of DNA fragments from a Pseudomonas-like bacteria integrating into the 5'-UTR of four proto-oncogenes in stomach cancer sequencing data. The functional and biological consequences of these bacterial DNA integrations remain unknown. Modeling of these integrations suggests that the previously identified sequences cover most of the sequence flanking the junction between the bacterial and human DNA. Further examination of these reads reveals that these integrations are rich in guanine nucleotides and the integrated bacterial DNA may have complex transcript secondary structures. The models presented here lay the foundation for future experiments to test if bacterial DNA integrations alter the transcription of the human genes.

  10. Finishing bacterial genome assemblies with Mix.

    PubMed

    Soueidan, Hayssam; Maurier, Florence; Groppi, Alexis; Sirand-Pugnet, Pascal; Tardy, Florence; Citti, Christine; Dupuy, Virginie; Nikolski, Macha

    2013-01-01

    Among challenges that hamper reaping the benefits of genome assembly are both unfinished assemblies and the ensuing experimental costs. First, numerous software solutions for genome de novo assembly are available, each having its advantages and drawbacks, without clear guidelines as to how to choose among them. Second, these solutions produce draft assemblies that often require a resource-intensive finishing phase. In this paper we address these two aspects by developing Mix, a tool that mixes two or more draft assemblies without relying on a reference genome, with the goal of reducing contig fragmentation and thus speeding up genome finishing. The proposed algorithm builds an extension graph where vertices represent extremities of contigs and edges represent existing alignments between these extremities. These alignment edges are used for contig extension. The resulting output assembly corresponds to a set of paths in the extension graph that maximizes the cumulative contig length. We evaluate the performance of Mix on bacterial NGS data from the GAGE-B study and apply it to newly sequenced Mycoplasma genomes. Resulting final assemblies demonstrate a significant improvement in the overall assembly quality. In particular, Mix is consistent by providing better overall quality results even when the choice is guided solely by standard assembly statistics, as is the case for de novo projects. Mix is implemented in Python and is available at https://github.com/cbib/MIX; novel data for our Mycoplasma study are available at http://services.cbib.u-bordeaux2.fr/mix/.

  11. RNA structural constraints in the evolution of the influenza A virus genome NP segment

    PubMed Central

    Gultyaev, Alexander P; Tsyganov-Bodounov, Anton; Spronken, Monique IJ; van der Kooij, Sander; Fouchier, Ron AM; Olsthoorn, René CL

    2014-01-01

    Conserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length, including protein-coding regions. Calculations of mutual information values at the paired nucleotide positions demonstrate that these structures impose considerable constraints on the virus genome evolution. Functional importance of a pseudoknot structure, predicted in the NP packaging signal region, was confirmed by plaque assays of the mutant viruses with disrupted structure and those with restored folding using compensatory substitutions. Possible functions of the conserved RNA folding patterns in the influenza A virus genome are discussed. PMID:25180940

  12. Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis.

    PubMed

    Facey, Paul D; Méric, Guillaume; Hitchings, Matthew D; Pachebat, Justin A; Hegarty, Matt J; Chen, Xiaorui; Morgan, Laura V A; Hoeppner, James E; Whitten, Miranda M A; Kirk, William D J; Dyson, Paul J; Sheppard, Sam K; Del Sol, Ricardo

    2015-07-15

    Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. © The Author(s) 2015.

  13. Comparative genomics of Mortierella elongata and its bacterial endosymbiont Mycoavidus cysteinexigens

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uehling, J.; Gryganskyi, A.; Hameed, K.

    Endosymbiosis of bacteria by eukaryotes is a defining feature of cellular evolution. In addition to well-known bacterial origins for mitochondria and chloroplasts, multiple origins of bacterial endosymbiosis are known within the cells of diverse animals, plants and fungi. Early-diverging lineages of terrestrial fungi harbor endosymbiotic bacteria belonging to the Burkholderiaceae. Furthermore, we sequenced the metagenome of the soil-inhabiting fungus Mortierella elongata and assembled the complete circular chromosome of its endosymbiont, Mycoavidus cysteinexigens, which we place within a lineage of endofungal symbionts that are sister clade to Burkholderia. The genome of M. elongata strain AG77 features a core set of primary metabolic pathways for degradation of simple carbohydrates and lipid biosynthesis, while the M. cysteinexigens (AG77) genome is reduced in size and function. Experiments using antibiotics to cure the endobacterium from the host demonstrate that the fungal host metabolism is highly modulated by presence/absence of M. cysteinexigens. In independent comparative phylogenomic analyses of fungal and bacterial genomes we find that they are consistent with an ancient origin for M. elongata-M. cysteinexigens symbiosis, most likely over 350 million years ago and concomitant with the terrestrialization of Earth and diversification of land fungi and plants.

  14. Comparative genomics of Mortierella elongata and its bacterial endosymbiont Mycoavidus cysteinexigens

    DOE PAGES

    Uehling, J.; Gryganskyi, A.; Hameed, K.; ...

    2017-01-11

    Endosymbiosis of bacteria by eukaryotes is a defining feature of cellular evolution. In addition to well-known bacterial origins for mitochondria and chloroplasts, multiple origins of bacterial endosymbiosis are known within the cells of diverse animals, plants and fungi. Early-diverging lineages of terrestrial fungi harbor endosymbiotic bacteria belonging to the Burkholderiaceae. Furthermore, we sequenced the metagenome of the soil-inhabiting fungus Mortierella elongata and assembled the complete circular chromosome of its endosymbiont, Mycoavidus cysteinexigens, which we place within a lineage of endofungal symbionts that are sister clade to Burkholderia. The genome of M. elongata strain AG77 features a core set of primary metabolic pathways for degradation of simple carbohydrates and lipid biosynthesis, while the M. cysteinexigens (AG77) genome is reduced in size and function. Experiments using antibiotics to cure the endobacterium from the host demonstrate that the fungal host metabolism is highly modulated by presence/absence of M. cysteinexigens. In independent comparative phylogenomic analyses of fungal and bacterial genomes we find that they are consistent with an ancient origin for M. elongata-M. cysteinexigens symbiosis, most likely over 350 million years ago and concomitant with the terrestrialization of Earth and diversification of land fungi and plants.

  15. The Extent of Genome Flux and Its Role in the Differentiation of Bacterial Lineages

    PubMed Central

    Nowell, Reuben W.; Green, Sarah; Laue, Bridget E.; Sharp, Paul M.

    2014-01-01

    Horizontal gene transfer (HGT) and gene loss are key processes in bacterial evolution. However, the role of gene gain and loss in the emergence and maintenance of ecologically differentiated bacterial populations remains an open question. Here, we use whole-genome sequence data to quantify gene gain and loss for 27 lineages of the plant-associated bacterium Pseudomonas syringae. We apply an extensive error-control procedure that accounts for errors in draft genome data and greatly improves the accuracy of patterns of gene occurrence among these genomes. We demonstrate a history of extensive genome fluctuation for this species and show that individual lineages could have acquired thousands of genes in the same period in which a 1% amino acid divergence accrues in the core genome. Elucidating the dynamics of genome fluctuation reveals the rapid turnover of gained genes, such that the majority of recently gained genes are quickly lost. Despite high observed rates of fluctuation, a phylogeny inferred from patterns of gene occurrence is similar to a phylogeny based on amino acid replacements within the core genome. Furthermore, the core genome phylogeny suggests that P. syringae should be considered a number of distinct species, with levels of divergence at least equivalent to those between recognized bacterial species. Gained genes are transferred from a variety of sources, reflecting the depth and diversity of the potential gene pool available via HGT. Overall, our results provide further insights into the evolutionary dynamics of genome fluctuation and implicate HGT as a major factor contributing to the diversification of P. syringae lineages. PMID:24923323

  16. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance.

    PubMed

    Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

    2016-01-01

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available.

  17. Whole-Genome Sequencing and Concordance Between Antimicrobial Susceptibility Genotypes and Phenotypes of Bacterial Isolates Associated with Bovine Respiratory Disease

    PubMed Central

    Owen, Joseph R.; Noyes, Noelle; Young, Amy E.; Prince, Daniel J.; Blanchard, Patricia C.; Lehenbauer, Terry W.; Aly, Sharif S.; Davis, Jessica H.; O’Rourke, Sean M.; Abdo, Zaid; Belk, Keith; Miller, Michael R.; Morley, Paul; Van Eenennaam, Alison L.

    2017-01-01

    Extended laboratory culture and antimicrobial susceptibility testing timelines hinder rapid species identification and susceptibility profiling of bacterial pathogens associated with bovine respiratory disease, the most prevalent cause of cattle mortality in the United States. Whole-genome sequencing offers a culture-independent alternative to current bacterial identification methods, but requires a library of bacterial reference genomes for comparison. To contribute new bacterial genome assemblies and evaluate genetic diversity and variation in antimicrobial resistance genotypes, whole-genome sequencing was performed on bovine respiratory disease–associated bacterial isolates (Histophilus somni, Mycoplasma bovis, Mannheimia haemolytica, and Pasteurella multocida) from dairy and beef cattle. One hundred genomically distinct assemblies were added to the NCBI database, doubling the available genomic sequences for these four species. Computer-based methods identified 11 predicted antimicrobial resistance genes in three species, with none being detected in M. bovis. While computer-based analysis can identify antibiotic resistance genes within whole-genome sequences (genotype), it may not predict the actual antimicrobial resistance observed in a living organism (phenotype). Antimicrobial susceptibility testing on 64 H. somni, M. haemolytica, and P. multocida isolates had an overall concordance rate between genotype and phenotypic resistance to the associated class of antimicrobials of 72.7% (P < 0.001), showing substantial discordance. Concordance rates varied greatly among different antimicrobial, antibiotic resistance gene, and bacterial species combinations. This suggests that antimicrobial susceptibility phenotypes are needed to complement genomically predicted antibiotic resistance gene genotypes to better understand how the presence of antibiotic resistance genes within a given bacterial species could potentially impact optimal bovine respiratory disease

  18. Recommendations for the classification of group A rotaviruses using all 11 genomic RNA segments.

    PubMed

    Matthijnssens, Jelle; Ciarlet, Max; Rahman, Mustafizur; Attoui, Houssam; Bányai, Krisztián; Estes, Mary K; Gentsch, Jon R; Iturriza-Gómara, Miren; Kirkwood, Carl D; Martella, Vito; Mertens, Peter P C; Nakagomi, Osamu; Patton, John T; Ruggeri, Franco M; Saif, Linda J; Santos, Norma; Steyer, Andrej; Taniguchi, Koki; Desselberger, Ulrich; Van Ranst, Marc

    2008-01-01

    Recently, a classification system was proposed for rotaviruses in which all the 11 genomic RNA segments are used (Matthijnssens et al. in J Virol 82:3204-3219, 2008). Based on nucleotide identity cut-off percentages, different genotypes were defined for each genome segment. A nomenclature for the comparison of complete rotavirus genomes was considered in which the notations Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx are used for the VP7-VP4-VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 encoding genes, respectively. This classification system is an extension of the previously applied genotype-based system which made use of the rotavirus gene segments encoding VP4, VP7, VP6, and NSP4. In order to assign rotavirus strains to one of the established genotypes or a new genotype, a standard procedure is proposed in this report. As more human and animal rotavirus genomes will be completely sequenced, new genotypes for each of the 11 gene segments may be identified. A Rotavirus Classification Working Group (RCWG) including specialists in molecular virology, infectious diseases, epidemiology, and public health was formed, which can assist in the appropriate delineation of new genotypes, thus avoiding duplications and helping minimize errors. Scientists discovering a potentially new rotavirus genotype for any of the 11 gene segments are invited to send the novel sequence to the RCWG, where the sequence will be analyzed, and a new nomenclature will be advised as appropriate. The RCWG will update the list of classified strains regularly and make this accessible on a website. Close collaboration with the Study Group Reoviridae of the International Committee on the Taxonomy of Viruses will be maintained.
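    The notation described in this abstract pairs each genotype letter with one encoding gene, in a fixed order. A minimal Python sketch of that pairing follows; the function name `parse_constellation` and the validation rules are illustrative assumptions, not part of the RCWG procedure, and the example string is used only as sample input.

```python
import re

# Genotype letters and the genes they describe, in the order given in the
# abstract: Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx.
SEGMENT_GENES = [
    ("G", "VP7"), ("P", "VP4"), ("I", "VP6"), ("R", "VP1"), ("C", "VP2"),
    ("M", "VP3"), ("A", "NSP1"), ("N", "NSP2"), ("T", "NSP3"), ("E", "NSP4"),
    ("H", "NSP5/6"),
]


def parse_constellation(text):
    """Map a string such as 'G1-P[8]-I1-...' to a {gene: genotype} dict.

    Hypothetical helper: it only checks the shape of the notation and does
    not decide whether a genotype number is an established one.
    """
    parts = text.strip().split("-")
    if len(parts) != len(SEGMENT_GENES):
        raise ValueError(f"expected 11 genotypes, got {len(parts)}")
    result = {}
    for part, (letter, gene) in zip(parts, SEGMENT_GENES):
        # The P genotype number is written in square brackets, e.g. P[8].
        pattern = r"P\[\d+\]" if letter == "P" else rf"{letter}\d+"
        if re.fullmatch(pattern, part) is None:
            raise ValueError(f"{part!r} is not a valid {letter} genotype")
        result[gene] = part
    return result


if __name__ == "__main__":
    example = "G1-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1"  # sample input only
    for gene, genotype in parse_constellation(example).items():
        print(f"{gene}: {genotype}")
```

    Keeping the letter-to-gene order in a single list means the parser and any report that prints the constellation stay consistent with each other.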

  19. A compositional segmentation of the human mitochondrial genome is related to heterogeneities in the guanine mutation rate

    PubMed Central

    Samuels, David C.; Boys, Richard J.; Henderson, Daniel A.; Chinnery, Patrick F.

    2003-01-01

    We applied a hidden Markov model segmentation method to the human mitochondrial genome to identify patterns in the sequence, to compare these patterns to the gene structure of mtDNA and to see whether these patterns reveal additional characteristics important for our understanding of genome evolution, structure and function. Our analysis identified three segmentation categories based upon the sequence transition probabilities. Category 2 segments corresponded to the tRNA and rRNA genes, with a greater strand-symmetry in these segments. Category 1 and 3 segments covered the protein-coding genes and almost all of the non-coding D-loop. Compared to category 1, the mtDNA segments assigned to category 3 had much lower guanine abundance. A comparison to two independent databases of mitochondrial mutations and polymorphisms showed that the high substitution rate of guanine in human mtDNA is largest in the category 3 segments. Analysis of synonymous mutations showed the same pattern. This suggests that this heterogeneity in the mutation rate is partly independent of respiratory chain function and is a direct property of the genome sequence itself. This has important implications for our understanding of mtDNA evolution and its use as a ‘molecular clock’ to determine the rate of population and species divergence. PMID:14530452

  20. Whole-Genome Sequencing and Concordance Between Antimicrobial Susceptibility Genotypes and Phenotypes of Bacterial Isolates Associated with Bovine Respiratory Disease.

    PubMed

    Owen, Joseph R; Noyes, Noelle; Young, Amy E; Prince, Daniel J; Blanchard, Patricia C; Lehenbauer, Terry W; Aly, Sharif S; Davis, Jessica H; O'Rourke, Sean M; Abdo, Zaid; Belk, Keith; Miller, Michael R; Morley, Paul; Van Eenennaam, Alison L

    2017-09-07

    Extended laboratory culture and antimicrobial susceptibility testing timelines hinder rapid species identification and susceptibility profiling of bacterial pathogens associated with bovine respiratory disease, the most prevalent cause of cattle mortality in the United States. Whole-genome sequencing offers a culture-independent alternative to current bacterial identification methods, but requires a library of bacterial reference genomes for comparison. To contribute new bacterial genome assemblies and evaluate genetic diversity and variation in antimicrobial resistance genotypes, whole-genome sequencing was performed on bovine respiratory disease-associated bacterial isolates (Histophilus somni, Mycoplasma bovis, Mannheimia haemolytica, and Pasteurella multocida) from dairy and beef cattle. One hundred genomically distinct assemblies were added to the NCBI database, doubling the available genomic sequences for these four species. Computer-based methods identified 11 predicted antimicrobial resistance genes in three species, with none being detected in M. bovis. While computer-based analysis can identify antibiotic resistance genes within whole-genome sequences (genotype), it may not predict the actual antimicrobial resistance observed in a living organism (phenotype). Antimicrobial susceptibility testing on 64 H. somni, M. haemolytica, and P. multocida isolates had an overall concordance rate between genotype and phenotypic resistance to the associated class of antimicrobials of 72.7% (P < 0.001), showing substantial discordance. Concordance rates varied greatly among different antimicrobial, antibiotic resistance gene, and bacterial species combinations. This suggests that antimicrobial susceptibility phenotypes are needed to complement genomically predicted antibiotic resistance gene genotypes to better understand how the presence of antibiotic resistance genes within a given bacterial species could potentially impact optimal bovine respiratory disease

  1. Self-organizing approach for meta-genomes.

    PubMed

    Zhu, Jianfeng; Zheng, Wei-Mou

    2014-12-01

    We extend the self-organizing approach for annotation of a bacterial genome to analyze the raw sequencing data of the human gut metagenome without sequence assembling. The original approach divides the genomic sequence of a bacterium into non-overlapping segments of equal length and assigns to each segment one of seven 'phases', among which one is for the noncoding regions, three for the direct coding regions to indicate the three possible codon positions of the segment starting site, and three for the reverse coding regions. The noncoding phase and the six coding phases are described by two frequency tables of the 64 triplet types or 'codon usages'. A set of codon usages can be used to update the phase assignment and vice versa. An iteration after an initialization leads to a convergent phase assignment to give an annotation of the genome. In the extension of the approach to a metagenome, we consider a mixture model of a number of categories described by different codon usages. The Illumina Genome Analyzer sequencing data of the total DNA from faecal samples are then examined to understand the diversity of the human gut microbiome. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Encyclopedia of bacterial gene circuits whose presence or absence correlate with pathogenicity--a large-scale system analysis of decoded bacterial genomes.

    PubMed

    Shestov, Maksim; Ontañón, Santiago; Tozeren, Aydin

    2015-10-13

    Bacterial infections comprise a global health challenge as the incidence of antibiotic resistance increases. Pathogenic potential of bacteria has been shown to be context dependent, varying in response to environment and even within the strains of the same genus. We used the KEGG repository and extensive literature searches to identify among the 2527 bacterial genomes in the literature those implicated as pathogenic to the host, including those which show pathogenicity in a context-dependent manner. Using data on the gene contents of these genomes, we identified sets of genes highly abundant in pathogenic but relatively absent in commensal strains and vice versa. In addition, we carried out genome comparison within a genus for the seventeen largest genera in our genome collection. We projected the resultant lists of ortholog genes onto KEGG bacterial pathways to identify clusters and circuits, which can be linked to either pathogenicity or synergy. Gene circuits relatively abundant in nonpathogenic bacteria often mediated biosynthesis of antibiotics. Other synergy-linked circuits reduced drug-induced toxicity. Pathogen-abundant gene circuits included modules in one-carbon folate, two-component system, type-3 secretion system, and peptidoglycan biosynthesis. Antibiotic-resistant bacterial strains possessed genes modulating phagocytosis, vesicle trafficking, cytoskeletal reorganization, and regulation of the inflammatory response. Our study also identified bacterial genera containing a circuit, elements of which were previously linked to Alzheimer's disease. The present study produces, for the first time, a signature in the form of a robust list of gene circuitry whose presence or absence could potentially define the pathogenicity of a microbiome. An extensive literature search substantiated the majority of the commensal and pathogenic circuitry in our predicted list. Scanning microbiome libraries for these circuitry motifs will provide further insights into the complex

  3. SuperSegger: robust image segmentation, analysis and lineage tracking of bacterial cells.

    PubMed

    Stylianidou, Stella; Brennan, Connor; Nissen, Silas B; Kuwada, Nathan J; Wiggins, Paul A

    2016-11-01

    Many quantitative cell biology questions require fast yet reliable automated image segmentation to identify and link cells from frame-to-frame, and characterize the cell morphology and fluorescence. We present SuperSegger, an automated MATLAB-based image processing package well-suited to quantitative analysis of high-throughput live-cell fluorescence microscopy of bacterial cells. SuperSegger incorporates machine-learning algorithms to optimize cellular boundaries and automated error resolution to reliably link cells from frame-to-frame. Unlike existing packages, it can reliably segment microcolonies with many cells, facilitating the analysis of cell-cycle dynamics in bacteria as well as cell-contact mediated phenomena. This package has a range of built-in capabilities for characterizing bacterial cells, including the identification of cell division events, mother, daughter and neighbouring cells, and computing statistics on cellular fluorescence, the location and intensity of fluorescent foci. SuperSegger provides a variety of postprocessing data visualization tools for single cell and population level analysis, such as histograms, kymographs, frame mosaics, movies and consensus images. Finally, we demonstrate the power of the package by analyzing lag phase growth with single cell resolution. © 2016 John Wiley & Sons Ltd.

  4. Identification and analysis of integrons and cassette arrays in bacterial genomes

    PubMed Central

    Touchon, Marie; Néron, Bertrand; Rocha, Eduardo PC

    2016-01-01

    Integrons recombine gene arrays and favor the spread of antibiotic resistance. Their broader roles in bacterial adaptation remain mysterious, partly due to lack of computational tools. We made a program, IntegronFinder, to identify integrons with high accuracy and sensitivity. IntegronFinder is available as a standalone program and as a web application. It searches for attC sites using covariance models, for integron-integrases using HMM profiles, and for other features (promoters, attI site) using pattern matching. We searched for integrons, integron-integrases lacking attC sites, and clusters of attC sites lacking a neighboring integron-integrase in bacterial genomes. All these elements are especially frequent in genomes of intermediate size. They are missing in some key phyla, such as α-Proteobacteria, which might reflect selection against cell lineages that acquire integrons. The similarity between attC sites is proportional to the number of cassettes in the integron, and is particularly low in clusters of attC sites lacking integron-integrases. The latter are unexpectedly abundant in genomes lacking integron-integrases or their remains, and have a large novel pool of cassettes lacking homologs in the databases. They might represent an evolutionary step between the acquisition of genes within integrons and their stabilization in the new genome. PMID:27130947

  5. The Consequences of Reconfiguring the Ambisense S Genome Segment of Rift Valley Fever Virus on Viral Replication in Mammalian and Mosquito Cells and for Genome Packaging

    PubMed Central

    Elliott, Richard M.

    2014-01-01

    Rift Valley fever virus (RVFV, family Bunyaviridae) is a mosquito-borne pathogen of both livestock and humans, found primarily in Sub-Saharan Africa and the Arabian Peninsula. The viral genome comprises two negative-sense (L and M segments) and one ambisense (S segment) RNAs that encode seven proteins. The S segment encodes the nucleocapsid (N) protein in the negative-sense and a nonstructural (NSs) protein in the positive-sense, though NSs cannot be translated directly from the S segment but rather from a specific subgenomic mRNA. Using reverse genetics we generated a virus, designated rMP12:S-Swap, in which the N protein is expressed from the NSs locus and NSs from the N locus within the genomic S RNA. In cells infected with rMP12:S-Swap NSs is expressed at higher levels with respect to N than in cells infected with the parental rMP12 virus. Despite NSs being the main interferon antagonist and determinant of virulence, growth of rMP12:S-Swap was attenuated in mammalian cells and gave a small plaque phenotype. The increased abundance of the NSs protein did not lead to faster inhibition of host cell protein synthesis or host cell transcription in infected mammalian cells. In cultured mosquito cells, however, infection with rMP12:S-Swap resulted in cell death rather than establishment of persistence as seen with rMP12. Finally, altering the composition of the S segment led to a differential packaging ratio of genomic to antigenomic RNA into rMP12:S-Swap virions. Our results highlight the plasticity of the RVFV genome and provide a useful experimental tool to investigate further the packaging mechanism of the segmented genome. PMID:24550727

  6. SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information

    PubMed Central

    2014-01-01

    Background: The recent introduction of the Pacific Biosciences RS single molecule sequencing technology has opened new doors to scaffolding genome assemblies in a cost-effective manner. The long read sequence information is promised to enhance the quality of incomplete and inaccurate draft assemblies constructed from Next Generation Sequencing (NGS) data. Results: Here we propose a novel hybrid assembly methodology that aims to scaffold pre-assembled contigs in an iterative manner using PacBio RS long read information as a backbone. On a test set comprising six bacterial draft genomes, assembled using either a single Illumina MiSeq or Roche 454 library, we show that even a 50× coverage of uncorrected PacBio RS long reads is sufficient to drastically reduce the number of contigs. Comparisons to the AHA scaffolder indicate our strategy is better able to produce (nearly) complete bacterial genomes. Conclusions: The current work describes our SSPACE-LongRead software, which is designed to upgrade incomplete draft genomes using single molecule sequences. We conclude that the recent advances of the PacBio sequencing technology and chemistry, in combination with the limited computational resources required to run our program, allow genomes to be scaffolded in a fast and reliable manner. PMID:24950923

  7. Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

    PubMed

    Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

    2015-01-01

    Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease and speed and at a substantially reduced cost per nucleotide, and hence per genome. More than 3,000 bacterial genomes have been sequenced and are available with finished status. Publicly available genomes can be readily downloaded; however, there are challenges in verifying the specific supporting data contained within the download and in identifying errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.

  8. The bacterial species definition in the genomic era

    PubMed Central

    Konstantinidis, Konstantinos T; Ramette, Alban; Tiedje, James M

    2006-01-01

    The bacterial species definition, despite its eminent practical significance for identification, diagnosis, quarantine and diversity surveys, remains a very difficult issue to advance. Genomics now offers novel insights into intra-species diversity and the potential for emergence of a more soundly based system. Although we share the excitement, we argue that it is premature for a universal change to the definition because current knowledge is based on too few phylogenetic groups and too few samples of natural populations. Our analysis of five important bacterial groups suggests, however, that more stringent standards for species may be justifiable when a solid understanding of gene content and ecological distinctiveness becomes available. Our analysis also reveals what is actually encompassed in a species according to the current standards, in terms of whole-genome sequence and gene-content diversity, and shows that this does not correspond to coherent clusters for the environmental Burkholderia and Shewanella genera examined. In contrast, the obligatory pathogens, which have a very restricted ecological niche, do exhibit clusters. Therefore, the idea of biologically meaningful clusters of diversity that applies to most eukaryotes may not be universally applicable in the microbial world, or if such clusters exist, they may be found at different levels of distinction. PMID:17062412

  9. Genome-wide identification of bacterial plant colonization genes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cole, Benjamin J.; Feltcher, Meghan E.; Waters, Robert J.

    Diverse soil-resident bacteria can contribute to plant growth and health, but the molecular mechanisms enabling them to effectively colonize their plant hosts remain poorly understood. We used randomly barcoded transposon mutagenesis sequencing (RB-TnSeq) in Pseudomonas simiae, a model root-colonizing bacterium, to establish a genome-wide map of bacterial genes required for colonization of the Arabidopsis thaliana root system. We identified 115 genes (2% of all P. simiae genes) with functions that are required for maximal competitive colonization of the root system. Among the genes we identified were some with obvious colonization-related roles in motility and carbon metabolism, as well as 44 other genes that had no or vague functional predictions. Independent validation assays of individual genes confirmed colonization functions for 20 of 22 (91%) cases tested. To further characterize genes identified by our screen, we compared the functional contributions of P. simiae genes to growth in 90 distinct in vitro conditions by RB-TnSeq, highlighting specific metabolic functions associated with root colonization genes. Here, our analysis of bacterial genes by sequence-driven saturation mutagenesis revealed a genome-wide map of the genetic determinants of plant root colonization and offers a starting point for targeted improvement of the colonization capabilities of plant-beneficial microbes.

  10. Genome-wide identification of bacterial plant colonization genes

    DOE PAGES

    Cole, Benjamin J.; Feltcher, Meghan E.; Waters, Robert J.; ...

    2017-09-22

    Diverse soil-resident bacteria can contribute to plant growth and health, but the molecular mechanisms enabling them to effectively colonize their plant hosts remain poorly understood. We used randomly barcoded transposon mutagenesis sequencing (RB-TnSeq) in Pseudomonas simiae, a model root-colonizing bacterium, to establish a genome-wide map of bacterial genes required for colonization of the Arabidopsis thaliana root system. We identified 115 genes (2% of all P. simiae genes) with functions that are required for maximal competitive colonization of the root system. Among the genes we identified were some with obvious colonization-related roles in motility and carbon metabolism, as well as 44 other genes that had no or vague functional predictions. Independent validation assays of individual genes confirmed colonization functions for 20 of 22 (91%) cases tested. To further characterize genes identified by our screen, we compared the functional contributions of P. simiae genes to growth in 90 distinct in vitro conditions by RB-TnSeq, highlighting specific metabolic functions associated with root colonization genes. Here, our analysis of bacterial genes by sequence-driven saturation mutagenesis revealed a genome-wide map of the genetic determinants of plant root colonization and offers a starting point for targeted improvement of the colonization capabilities of plant-beneficial microbes.

  11. Unique core genomes of the bacterial family vibrionaceae: insights into niche adaptation and speciation.

    PubMed

    Kahlke, Tim; Goesmann, Alexander; Hjerde, Erik; Willassen, Nils Peder; Haugen, Peik

    2012-05-10

    The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data become available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, which share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups. The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common, especially for large groups of isolates. The subsequent annotation of unique core genes that are present in genophyletic groups indicates a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin. The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist. These unique core genes encode conserved metabolic functions that can shed light on the

  12. Defense islands in bacterial and archaeal genomes and prediction of novel defense systems.

    PubMed

    Makarova, Kira S; Wolf, Yuri I; Snir, Sagi; Koonin, Eugene V

    2011-11-01

    The arms race between cellular life forms and viruses is a major driving force of evolution. A substantial fraction of bacterial and archaeal genomes is dedicated to antivirus defense. We analyzed the distribution of defense genes and typical mobilome components (such as viral and transposon genes) in bacterial and archaeal genomes and demonstrated statistically significant clustering of antivirus defense systems and mobile genes and elements in genomic islands. The defense islands are enriched in putative operons and contain numerous overrepresented gene families. A detailed sequence analysis of the proteins encoded by genes in these families shows that many of them are diverged variants of known defense system components, whereas others show features, such as characteristic operonic organization, that are suggestive of novel defense systems. Thus, genomic islands provide abundant material for the experimental study of bacterial and archaeal antivirus defense. Except for the CRISPR-Cas systems, different classes of defense systems, in particular toxin-antitoxin and restriction-modification systems, show nonrandom clustering in defense islands. It remains unclear to what extent these associations reflect functional cooperation between different defense systems and to what extent the islands are genomic "sinks" that accumulate diverse nonessential genes, particularly those acquired via horizontal gene transfer. The characteristics of defense islands resemble those of mobilome islands. Defense and mobilome genes are nonrandomly associated in islands, suggesting nonadaptive evolution of the islands via a preferential attachment-like mechanism underpinned by the addictive properties of defense systems such as toxins-antitoxins and an important role of horizontal mobility in the evolution of these islands.

  13. Perspectives on the Transition From Bacterial Phytopathogen Genomics Studies to Applications Enhancing Disease Management: From Promise to Practice.

    PubMed

    Sundin, George W; Wang, Nian; Charkowski, Amy O; Castiblanco, Luisa F; Jia, Hongge; Zhao, Youfu

    2016-10-01

    The advent of genomics has advanced science into a new era, providing a plethora of "toys" for researchers in many related and disparate fields. Genomics has also spawned many new fields, including proteomics and metabolomics, furthering our ability to gain a more comprehensive view of individual organisms and of interacting organisms. Genomic information of both bacterial pathogens and their hosts has provided the critical starting point in understanding the molecular bases of how pathogens disrupt host cells to cause disease. In addition, knowledge of the complete genome sequence of the pathogen provides a potentially broad slate of targets for the development of novel virulence inhibitors that are desperately needed for disease management. Regarding plant bacterial pathogens and disease management, the potential for utilizing genomics resources in the development of durable resistance is enhanced because of developing technologies that enable targeted modification of the host. Here, we summarize the role of genomics studies in furthering efforts to manage bacterial plant diseases and highlight novel genomics-enabled strategies heading down this path.

  14. Impacts of Chromatin States and Long-Range Genomic Segments on Aging and DNA Methylation

    PubMed Central

    Sun, Dan; Yi, Soojin V.

    2015-01-01

    Understanding the fundamental dynamics of epigenome variation during normal aging is critical for elucidating key epigenetic alterations that affect development, cell differentiation and diseases. Advances in the field of aging and DNA methylation strongly support the aging epigenetic drift model. Although this model aligns with previous studies, the role of other epigenetic marks, such as histone modification, as well as the impact of sampling specific CpGs, must be evaluated. Ultimately, it is crucial to investigate how all CpGs in the human genome change their methylation with aging in their specific genomic and epigenomic contexts. Here, we analyze whole genome bisulfite sequencing DNA methylation maps of brain frontal cortex from individuals of diverse ages. Comparisons with blood data reveal tissue-specific patterns of epigenetic drift. By integrating chromatin state information, divergent degrees and directions of aging-associated methylation in different genomic regions are revealed. Whole genome bisulfite sequencing data also open a new door to investigate whether adjacent CpG sites exhibit coordinated DNA methylation changes with aging. We identified significant ‘aging-segments’, which are clusters of nearby CpGs that respond to aging by similar DNA methylation changes. These segments not only capture previously identified aging-CpGs but also include specific functional categories of genes with implications on epigenetic regulation of aging. For example, genes associated with development are highly enriched in positive aging segments, which are gradually hyper-methylated with aging. On the other hand, regions that are gradually hypo-methylated with aging (‘negative aging segments’) in the brain harbor genes involved in metabolism and protein ubiquitination. Given the importance of protein ubiquitination in proteome homeostasis of aging brains and neurodegenerative disorders, our finding suggests the significance of epigenetic regulation of this

  15. Modeling the relaxation of internal DNA segments during genome mapping in nanochannels.

    PubMed

    Jain, Aashish; Sheats, Julian; Reifenberger, Jeffrey G; Cao, Han; Dorfman, Kevin D

    2016-09-01

    We have developed a multi-scale model describing the dynamics of internal segments of DNA in nanochannels used for genome mapping. In addition to the channel geometry, the model takes as its inputs the DNA properties in free solution (persistence length, effective width, molecular weight, and segmental hydrodynamic radius) and buffer properties (temperature and viscosity). Using pruned-enriched Rosenbluth simulations of a discrete wormlike chain model with circa 10 base pair resolution and a numerical solution for the hydrodynamic interactions in confinement, we convert these experimentally available inputs into the necessary parameters for a one-dimensional, Rouse-like model of the confined chain. The resulting coarse-grained model resolves the DNA at a length scale of approximately 6 kilobase pairs in the absence of any global hairpin folds, and is readily studied using a normal-mode analysis or Brownian dynamics simulations. The Rouse-like model successfully reproduces both the trends and order of magnitude of the relaxation time of the distance between labeled segments of DNA obtained in experiments. The model also provides insights that are not readily accessible from experiments, such as the role of the molecular weight of the DNA and location of the labeled segments that impact the statistical models used to construct genome maps from data acquired in nanochannels. The multi-scale approach used here, while focused towards a technologically relevant scenario, is readily adapted to other channel sizes and polymers.

  16. Insights from genomic comparisons of genetically monomorphic bacterial pathogens

    PubMed Central

    Achtman, Mark

    2012-01-01

    Some of the most deadly bacterial diseases, including leprosy, anthrax and plague, are caused by bacterial lineages with extremely low levels of genetic diversity, the so-called ‘genetically monomorphic bacteria’. It has only become possible to analyse the population genetics of such bacteria since the recent advent of high-throughput comparative genomics. The genomes of genetically monomorphic lineages contain very few polymorphic sites, which often reflect unambiguous clonal genealogies. Some genetically monomorphic lineages have evolved in the last decades, e.g. antibiotic-resistant Staphylococcus aureus, whereas others have evolved over several millennia, e.g. the cause of plague, Yersinia pestis. Based on recent results, it is now possible to reconstruct the sources and the history of pandemic waves of plague by a combined analysis of phylogeographic signals in Y. pestis plus polymorphisms found in ancient DNA. Different from historical accounts based exclusively on human disease, Y. pestis evolved in China, or the vicinity, and has spread globally on multiple occasions. These routes of transmission can be reconstructed from the genealogy, most precisely for the most recent pandemic that was spread from Hong Kong in multiple independent waves in 1894. PMID:22312053

  17. Identification and analysis of integrons and cassette arrays in bacterial genomes.

    PubMed

    Cury, Jean; Jové, Thomas; Touchon, Marie; Néron, Bertrand; Rocha, Eduardo Pc

    2016-06-02

    Integrons recombine gene arrays and favor the spread of antibiotic resistance. Their broader roles in bacterial adaptation remain mysterious, partly due to lack of computational tools. We made a program - IntegronFinder - to identify integrons with high accuracy and sensitivity. IntegronFinder is available as a standalone program and as a web application. It searches for attC sites using covariance models, for integron-integrases using HMM profiles, and for other features (promoters, attI site) using pattern matching. We searched for integrons, integron-integrases lacking attC sites, and clusters of attC sites lacking a neighboring integron-integrase in bacterial genomes. All these elements are especially frequent in genomes of intermediate size. They are missing in some key phyla, such as α-Proteobacteria, which might reflect selection against cell lineages that acquire integrons. The similarity between attC sites is proportional to the number of cassettes in the integron, and is particularly low in clusters of attC sites lacking integron-integrases. The latter are unexpectedly abundant in genomes lacking integron-integrases or their remains, and have a large novel pool of cassettes lacking homologs in the databases. They might represent an evolutionary step between the acquisition of genes within integrons and their stabilization in the new genome. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. First Complete Squash leaf curl China virus Genomic Segment DNA-A Sequence from East Timor

    PubMed Central

    Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

    2017-01-01

    We present here the first complete Squash leaf curl China virus (SLCCV) genomic segment DNA-A sequence from East Timor. It was isolated from a pumpkin plant. When compared with 15 complete SLCCV DNA-A genome sequences from other world regions, it most resembled the Malaysian isolate MC1 sequence. PMID:28619789

  19. SuperPhy: predictive genomics for the bacterial pathogen Escherichia coli.

    PubMed

    Whiteside, Matthew D; Laing, Chad R; Manji, Akiff; Kruczkiewicz, Peter; Taboada, Eduardo N; Gannon, Victor P J

    2016-04-12

    Predictive genomics is the translation of raw genome sequence data into a phenotypic assessment of the organism. For bacterial pathogens, these phenotypes can range from environmental survivability to the severity of human disease. Significant progress has been made in the development of generic tools for genomic analyses that are broadly applicable to all microorganisms; however, a fundamental missing component is the ability to analyze genomic data in the context of organism-specific phenotypic knowledge, which has been accumulated from decades of research and can provide a meaningful interpretation of genome sequence data. In this study, we present SuperPhy, an online predictive genomics platform (http://lfz.corefacility.ca/superphy/) for Escherichia coli. The platform integrates the analytical tools and genome sequence data for all publicly available E. coli genomes and facilitates the upload of new genome sequences from users under public or private settings. SuperPhy provides real-time analyses of thousands of genome sequences with results that are understandable and useful to a wide community, including those in the fields of clinical medicine, epidemiology, ecology, and evolution. SuperPhy includes: 1) identification of virulence and antimicrobial resistance determinants; 2) statistical associations between genotypes, biomarkers, geospatial distribution, host, source, and phylogenetic clade; 3) identification of biomarkers for groups of genomes based on the presence/absence of specific genomic regions and single-nucleotide polymorphisms; and 4) in silico Shiga-toxin subtyping. SuperPhy is a predictive genomics platform that attempts to provide an essential link between the vast amounts of genome information currently being generated and phenotypic knowledge in an organism-specific context.

  20. Evaluation of a Phylogenetic Marker Based on Genomic Segment B of Infectious Bursal Disease Virus: Facilitating a Feasible Incorporation of this Segment to the Molecular Epidemiology Studies for this Viral Agent.

    PubMed

    Alfonso-Morales, Abdulahi; Rios, Liliam; Martínez-Pérez, Orlando; Dolz, Roser; Valle, Rosa; Perera, Carmen L; Bertran, Kateri; Frías, Maria T; Ganges, Llilianne; Díaz de Arce, Heidy; Majó, Natàlia; Núñez, José I; Pérez, Lester J

    2015-01-01

    Infectious bursal disease (IBD) is a highly contagious and acute viral disease, which has caused high mortality rates in birds and considerable economic losses in different parts of the world for more than two decades, and it still represents a considerable threat to poultry. The current study was designed to rigorously measure the reliability of a phylogenetic marker included in segment B. This marker can facilitate molecular epidemiology studies, incorporating this segment of the viral genome, to better explain the links between emergence, spreading and maintenance of the very virulent IBD virus (vvIBDV) strains worldwide. Sequences of the segment B gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank Database; Cuban sequences were obtained in the current work. A phylogenetic marker named B-marker was assessed by different phylogenetic principles such as saturation of substitution, phylogenetic noise and high consistency. This last parameter is based on the ability of B-marker to reconstruct the same topology as the complete segment B of the viral genome. From the results obtained from B-marker, demographic history for both main lineages of IBDV regarding segment B was performed by Bayesian skyline plot analysis. Phylogenetic analysis for both segments of the IBDV genome was also performed, revealing the presence of a natural reassortant strain with segment A from vvIBDV strains and segment B from non-vvIBDV strains within the Cuban IBDV population. This study contributes to a better understanding of the emergence of vvIBDV strains, describing molecular epidemiology of IBDV using the state-of-the-art methodology concerning phylogenetic reconstruction. This study also revealed the presence of a novel natural reassorted strain as a possible manifestation of change in the genetic structure and stability of the vvIBDV strains. Therefore, it highlights the need to obtain information about both genome segments of IBDV for molecular

  1. Analysis of gene expression levels in individual bacterial cells without image segmentation.

    PubMed

    Kwak, In Hae; Son, Minjun; Hagen, Stephen J

    2012-05-11

    Studies of stochasticity in gene expression typically make use of fluorescent protein reporters, which permit the measurement of expression levels within individual cells by fluorescence microscopy. Analysis of such microscopy images is almost invariably based on a segmentation algorithm, where the image of a cell or cluster is analyzed mathematically to delineate individual cell boundaries. However, segmentation can be ineffective for studying bacterial cells or clusters, especially at lower magnification, where outlines of individual cells are poorly resolved. Here we demonstrate an alternative method for analyzing such images without segmentation. The method employs a comparison between the pixel brightness in phase contrast vs fluorescence microscopy images. By fitting the correlation between phase contrast and fluorescence intensity to a physical model, we obtain well-defined estimates for the different levels of gene expression that are present in the cell or cluster. The method reveals the boundaries of the individual cells, even if the source images lack the resolution to show these boundaries clearly. Copyright © 2012 Elsevier Inc. All rights reserved.

  2. Family-specific scaling laws in bacterial genomes.

    PubMed

    De Lazzari, Eleonora; Grilli, Jacopo; Maslov, Sergei; Cosentino Lagomarsino, Marco

    2017-07-27

    Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
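
    As an illustration of the scaling laws described in this record, a power law of the form family_size = a * genome_size^b can be estimated by ordinary least squares on log-transformed counts. The sketch below is a generic log-log regression, not the authors' procedure; the function name, the variable names, and the choice to skip genomes with a zero count are all assumptions made for the example.

```python
import math


def fit_power_law(genome_sizes, family_sizes):
    """Fit family_size = a * genome_size**b by least squares in log-log space.

    Hypothetical helper for illustration only. Genomes where either value
    is zero are skipped, because the logarithm is undefined there.
    """
    pairs = [
        (math.log(n), math.log(f))
        for n, f in zip(genome_sizes, family_sizes)
        if n > 0 and f > 0
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two genomes with non-zero counts")

    xs, ys = zip(*pairs)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all genomes have the same size; exponent undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    exponent = sxy / sxx                       # slope in log-log space
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


# Synthetic counts that follow an exact quadratic law (made-up numbers).
sizes = [1000, 2000, 4000, 8000]
counts = [1e-5 * n ** 2 for n in sizes]
a, b = fit_power_law(sizes, counts)
```

    On real data the fitted exponent would differ between families, which is the family-specific behavior the record refers to; dropping zero counts biases the fit for rare families, so a count-based model may be preferable there.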

  3. Standard operating procedure for calculating genome-to-genome distances based on high-scoring segment pairs.

    PubMed

    Auch, Alexander F; Klenk, Hans-Peter; Göker, Markus

    2010-01-28

    DNA-DNA hybridization (DDH) is a widely applied wet-lab technique to obtain an estimate of the overall similarity between the genomes of two organisms. Basing the species concept for prokaryotes ultimately on DDH was chosen by microbiologists as a pragmatic approach for deciding on the recognition of novel species, and it also allowed a relatively high degree of standardization compared to other areas of taxonomy. However, DDH is tedious and error-prone and, first and foremost, cannot be used to incrementally establish a comparative database. Recent studies have shown that in-silico methods for the comparison of genome sequences can be used to replace DDH. Considering the ongoing rapid technological progress of sequencing methods, genome-based prokaryote taxonomy is coming into reach. However, calculating distances between genomes is dependent on multiple choices for software and program settings. We here provide an overview of the modifications that can be applied to distance methods based on high-scoring segment pairs (HSPs) or maximally unique matches (MUMs) and that need to be documented. General recommendations on determining HSPs using BLAST or other algorithms are also provided. As a reference implementation, we introduce the GGDC web server (http://ggdc.gbdp.org).

  4. Group-theoretic models of the inversion process in bacterial genomes.

    PubMed

    Egri-Nagy, Attila; Gebhardt, Volker; Tanaka, Mark M; Francis, Andrew R

    2014-07-01

    The variation in genome arrangements among bacterial taxa is largely due to the process of inversion. Recent studies indicate that not all inversions are equally probable, suggesting, for instance, that shorter inversions are more frequent than longer, and those that move the terminus of replication are less probable than those that do not. Current methods for establishing the inversion distance between two bacterial genomes are unable to incorporate such information. In this paper we suggest a group-theoretic framework that in principle can take these constraints into account. In particular, we show that by lifting the problem from circular permutations to the affine symmetric group, the inversion distance can be found in polynomial time for a model in which inversions are restricted to acting on two regions. This requires the proof of new results in group theory, and suggests a vein of new combinatorial problems concerning permutation groups on which group theorists will be needed to collaborate with biologists. We apply the new method to inferring distances and phylogenies for published Yersinia pestis data.

  5. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    PubMed

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .

  6. Assessment of Recombination in the S-segment Genome of Crimean-Congo Hemorrhagic Fever Virus in Iran.

    PubMed

    Chinikar, Sadegh; Shah-Hosseini, Nariman; Bouzari, Saeid; Shokrgozar, Mohammad Ali; Mostafavi, Ehsan; Jalali, Tahmineh; Khakifirouz, Sahar; Groschup, Martin H; Niedrig, Matthias

    2016-03-01

    Crimean-Congo Hemorrhagic Fever Virus (CCHFV) belongs to the genus Nairovirus and family Bunyaviridae. The main aim of this study was to investigate the extent of recombination in the S-segment genome of CCHFV in Iran. Samples were isolated from Iranian patients and those available in GenBank, and analyzed by phylogenetic and bootscan methods. Through comparison of the phylogenetic trees based on full length sequences and partial fragments in the S-segment genome of CCHFV, a genetic switch was evident, due to a recombination event. Moreover, evidence of multiple recombination events was detected in query isolates when bootscan analysis was performed with SimPlot software. Switching of different genomic regions between different strains by recombination could contribute to CCHFV diversification and evolution. The occurrence of recombination in CCHFV has a critical impact on epidemiological investigations and vaccine design.

  7. Microbial Genomics: The Expanding Universe of Bacterial Defense Systems.

    PubMed

    Forsberg, Kevin J; Malik, Harmit S

    2018-04-23

    Bacteria protect themselves against infection using multiple defensive systems that move by horizontal gene transfer and accumulate in genomic 'defense islands'. A recent study exploited these features to uncover ten novel defense systems, substantially expanding the catalog of bacterial defense systems and predicting the discovery of many more. Copyright © 2018 Elsevier Ltd. All rights reserved.

  8. GFinisher: a new strategy to refine and finish bacterial genome assemblies

    NASA Astrophysics Data System (ADS)

    Guizelini, Dieval; Raittz, Roberto T.; Cruz, Leonardo M.; Souza, Emanuel M.; Steffens, Maria B. R.; Pedrosa, Fabio O.

    2016-10-01

    Despite developments in DNA sequencing technology that have improved the number and the length of reads, the process of reconstructing complete genome sequences, the so-called genome assembly, is still complex. Only 13% of the prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented in contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly process of bacterial genomes. The biological patterns observed in genomic sequences and the application of a priori information can allow the identification of misassembled regions, and the reorganization and improvement of the overall de novo genome assembly. GFinisher starts by generating a Fuzzy GC skew graph for each contig in an assembly and then breaks the contigs at critical points in order to reassemble and close them using jFGap. This has been successfully applied to data sets from 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve the assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.
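
    For readers unfamiliar with GC skew, the quantity underlying the graphs mentioned in this record is conventionally defined per window as (G - C) / (G + C). The sketch below computes that plain windowed skew and reports where its sign flips between adjacent windows. It is not GFinisher's Fuzzy GC skew, whose details the abstract does not give; the function names, the default window size, and the use of sign flips as candidate "critical points" are assumptions for illustration.

```python
def gc_skew(seq, window=1000, step=None):
    """Return the GC skew, (G - C) / (G + C), for each full window of seq.

    Hypothetical helper: windows that contain no G or C get a skew of 0.0,
    and a trailing partial window is ignored.
    """
    step = step or window
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_skew(skews):
    """Running sum of window skews, the curve usually plotted along a contig."""
    total = 0.0
    curve = []
    for value in skews:
        total += value
        curve.append(total)
    return curve


def sign_changes(skews):
    """Indices of windows whose skew sign differs from the previous window."""
    return [i for i in range(1, len(skews)) if skews[i - 1] * skews[i] < 0]


# Toy contig: a G-rich half followed by a C-rich half.
contig = "G" * 10 + "C" * 10
skews = gc_skew(contig, window=10)
flips = sign_changes(skews)
```

    In a correctly assembled circular chromosome the skew typically changes sign near the replication origin and terminus, so an unexpected flip inside a contig is the kind of pattern that could prompt a closer look at the assembly.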

  9. GFinisher: a new strategy to refine and finish bacterial genome assemblies.

    PubMed

    Guizelini, Dieval; Raittz, Roberto T; Cruz, Leonardo M; Souza, Emanuel M; Steffens, Maria B R; Pedrosa, Fabio O

    2016-10-10

    Despite developments in DNA sequencing technology that have improved the number and the length of reads, the process of reconstructing complete genome sequences, the so-called genome assembly, is still complex. Only 13% of the prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented in contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly process of bacterial genomes. The biological patterns observed in genomic sequences and the application of a priori information can allow the identification of misassembled regions, and the reorganization and improvement of the overall de novo genome assembly. GFinisher starts by generating a Fuzzy GC skew graph for each contig in an assembly and then breaks the contigs at critical points in order to reassemble and close them using jFGap. This has been successfully applied to data sets from 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve the assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.

  10. Evidence of codon usage in the nearest neighbor spacing distribution of bases in bacterial genomes

    NASA Astrophysics Data System (ADS)

    Higareda, M. F.; Geiger, O.; Mendoza, L.; Méndez-Sánchez, R. A.

    2012-02-01

    Statistical analysis of whole genomic sequences usually assumes a homogeneous nucleotide density throughout the genome, an assumption that has been proved incorrect for several organisms since the nucleotide density is only locally homogeneous. To avoid giving a single numerical value to this variable property, we propose the use of spectral statistics, which characterizes the density of nucleotides as a function of position in the genome. We show that the cumulative density of bases in bacterial genomes can be separated into an average (or secular) part plus a fluctuating part. Bacterial genomes can be divided into two groups according to the qualitative description of their secular part: linear and piecewise linear. These two groups of genomes show different properties when their nucleotide spacing distribution is studied. To analyze statistically genomes that have a variable nucleotide density, unfolding is necessary, i.e., separating the secular part from the fluctuations. The unfolding allows an adequate comparison with the statistical properties of other genomes. With this methodology, four genomes were analyzed: Burkholderia, Bacillus, Clostridium and Corynebacterium. Interestingly, the nearest neighbor spacing distributions or detrended distance distributions are very similar for species within the same genus but they are very different for species from different genera. This difference can be attributed to the difference in the codon usage.
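
    The nearest neighbor spacing of a base is the distance between consecutive occurrences of that base along the sequence. The sketch below collects those spacings and rescales them to unit mean. Dividing by the global mean is only a stand-in for unfolding in the simplest case, where the secular part of the cumulative density is linear; a piecewise linear secular part would need a local rescaling that is not shown here. The function names are hypothetical and this is not the authors' code.

```python
def base_spacings(seq, base):
    """Distances between consecutive occurrences of base in seq."""
    target = base.upper()
    positions = [i for i, b in enumerate(seq.upper()) if b == target]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def normalized_spacings(spacings):
    """Rescale spacings to unit mean (assumes a linear secular part)."""
    if not spacings:
        raise ValueError("need at least two occurrences of the base")
    mean = sum(spacings) / len(spacings)
    return [s / mean for s in spacings]


# Toy sequence: 'A' occurs at positions 0, 4, 5 and 8.
raw = base_spacings("ACGTAACCA", "A")
unit_mean = normalized_spacings(raw)
```

    A histogram of the rescaled spacings is what would then be compared between genomes.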

  11. Development and validation of an rDNA operon based primer walking strategy applicable to de novo bacterial genome finishing

    PubMed Central

    Eastman, Alexander W.; Yuan, Ze-Chun

    2015-01-01

    Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, all located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provide a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects. PMID

  12. Assessment of Recombination in the S-segment Genome of Crimean-Congo Hemorrhagic Fever Virus in Iran

    PubMed Central

    Chinikar, Sadegh; Shah-Hosseini, Nariman; Bouzari, Saeid; Shokrgozar, Mohammad Ali; Mostafavi, Ehsan; Jalali, Tahmineh; Khakifirouz, Sahar; Groschup, Martin H; Niedrig, Matthias

    2016-01-01

    Background: Crimean-Congo Hemorrhagic Fever Virus (CCHFV) belongs to the genus Nairovirus and family Bunyaviridae. The main aim of this study was to investigate the extent of recombination in the S-segment genome of CCHFV in Iran. Methods: Samples were isolated from Iranian patients and those available in GenBank, and analyzed by phylogenetic and bootscan methods. Results: Through comparison of the phylogenetic trees based on full length sequences and partial fragments in the S-segment genome of CCHFV, a genetic switch was evident, due to a recombination event. Moreover, evidence of multiple recombination events was detected in query isolates when bootscan analysis was performed with SimPlot software. Conclusion: Switching of different genomic regions between different strains by recombination could contribute to CCHFV diversification and evolution. The occurrence of recombination in CCHFV has a critical impact on epidemiological investigations and vaccine design. PMID:27047968

  13. Reconstruction of a Bacterial Genome from DNA Cassettes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Christopher Dupont; John Glass; Laura Sheahan

    2011-12-31

    This basic research program comprised two major areas: (1) acquisition and analysis of marine microbial metagenomic data and development of genomic analysis tools for broad, external community use; (2) development of a minimal bacterial genome. Our Marine Metagenomic Diversity effort generated and analyzed shotgun sequencing data from microbial communities sampled from over 250 sites around the world. About 40% of the 26 Gbp of sequence data has been made publicly available to date with a complete release anticipated in six months. Our results and those mining the deposited data have revealed a vast diversity of genes coding for critical metabolic processes whose phylogenetic and geographic distributions will enable a deeper understanding of carbon and nutrient cycling, microbial ecology, and rapid rate evolutionary processes such as horizontal gene transfer by viruses and plasmids. A global assembly of the generated dataset resulted in a massive set (5 Gbp) of genome fragments that provide context to the majority of the generated data that originated from uncultivated organisms. Our Synthetic Biology team has made significant progress towards the goal of synthesizing a minimal mycoplasma genome that will have all of the machinery for independent life. This project, once completed, will provide fundamentally new knowledge about requirements for microbial life and help to lay a basic research foundation for developing microbiological approaches to bioenergy.

  14. Universal and idiosyncratic characteristic lengths in bacterial genomes

    NASA Astrophysics Data System (ADS)

    Junier, Ivan; Frémont, Paul; Rivoire, Olivier

    2018-05-01

    In condensed matter physics, simplified descriptions are obtained by coarse-graining the features of a system at a certain characteristic length, defined as the typical length beyond which some properties are no longer correlated. From a physics standpoint, in vitro DNA has thus a characteristic length of 300 base pairs (bp), the Kuhn length of the molecule beyond which correlations in its orientations are typically lost. From a biology standpoint, in vivo DNA has a characteristic length of 1000 bp, the typical length of genes. Since bacteria live in very different physico-chemical conditions and since their genomes lack translational invariance, whether larger, universal characteristic lengths exist is a non-trivial question. Here, we examine this problem by leveraging the large number of fully sequenced genomes available in public databases. By analyzing GC content correlations and the evolutionary conservation of gene contexts (synteny) in hundreds of bacterial chromosomes, we conclude that a fundamental characteristic length around 10–20 kb can be defined. This characteristic length reflects elementary structures involved in the coordination of gene expression, which are present all along the genome of nearly all bacteria. Technically, reaching this conclusion required us to implement methods that are insensitive to the presence of large idiosyncratic genomic features, which may co-exist along these fundamental universal structures.

  15. SIMBA: a web tool for managing bacterial genome assembly generated by Ion PGM sequencing technology.

    PubMed

    Mariano, Diego C B; Pereira, Felipe L; Aguiar, Edgar L; Oliveira, Letícia C; Benevides, Leandro; Guimarães, Luís C; Folador, Edson L; Sousa, Thiago J; Ghosh, Preetam; Barh, Debmalya; Figueiredo, Henrique C P; Silva, Artur; Ramos, Rommel T J; Azevedo, Vasco A C

    2016-12-15

    The evolution of Next-Generation Sequencing (NGS) has considerably reduced the cost per sequenced base, allowing a significant rise in sequencing projects, mainly in prokaryotes. However, the range of available NGS platforms requires different strategies and software to correctly assemble genomes. Different strategies are necessary to properly complete an assembly project, in addition to the installation or modification of various software. This requires users to have significant expertise in this software and command-line scripting experience on Unix platforms, besides basic expertise in the methodologies and techniques of genome assembly. These difficulties often delay complete genome assembly projects. To overcome this, we developed SIMBA (SImple Manager for Bacterial Assemblies), a freely available web tool that integrates several component tools for assembling and finishing bacterial genomes. SIMBA provides a friendly and intuitive user interface so that bioinformaticians, even those with low computational expertise, can work under a centralized administrative control system of assemblies managed by the assembly center head. SIMBA guides users through the assembly process with simple and interactive pages. The SIMBA workflow is divided into three modules: (i) projects: allows a general view of genome sequencing projects, in addition to data quality analysis and data format conversions; (ii) assemblies: allows de novo assemblies with the software Mira, Minia, Newbler and SPAdes, as well as assembly quality validation using QUAST software; and (iii) curation: presents methods for finishing assemblies through tools for scaffolding contigs and closing gaps. We also present a case study that validated the efficacy of SIMBA in managing bacterial assembly projects sequenced using Ion Torrent PGM. Besides being a web tool for genome assembly, SIMBA is a complete genome assemblies project management system, which can be useful for managing of several

  16. Evaluation of a Phylogenetic Marker Based on Genomic Segment B of Infectious Bursal Disease Virus: Facilitating a Feasible Incorporation of this Segment to the Molecular Epidemiology Studies for this Viral Agent

    PubMed Central

    Martínez-Pérez, Orlando; Dolz, Roser; Valle, Rosa; Perera, Carmen L.; Bertran, Kateri; Frías, Maria T.; Ganges, Llilianne; Díaz de Arce, Heidy; Majó, Natàlia; Núñez, José I.; Pérez, Lester J.

    2015-01-01

    Background: Infectious bursal disease (IBD) is a highly contagious and acute viral disease, which has caused high mortality rates in birds and considerable economic losses in different parts of the world for more than two decades, and it still represents a considerable threat to poultry. The current study was designed to rigorously measure the reliability of a phylogenetic marker included in segment B. This marker can facilitate molecular epidemiology studies, incorporating this segment of the viral genome, to better explain the links between emergence, spreading and maintenance of the very virulent IBD virus (vvIBDV) strains worldwide. Methodology/Principal Findings: Sequences of the segment B gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank Database; Cuban sequences were obtained in the current work. A phylogenetic marker named B-marker was assessed by different phylogenetic principles such as saturation of substitution, phylogenetic noise and high consistency. This last parameter is based on the ability of B-marker to reconstruct the same topology as the complete segment B of the viral genome. From the results obtained from B-marker, demographic history for both main lineages of IBDV regarding segment B was performed by Bayesian skyline plot analysis. Phylogenetic analysis for both segments of the IBDV genome was also performed, revealing the presence of a natural reassortant strain with segment A from vvIBDV strains and segment B from non-vvIBDV strains within the Cuban IBDV population. Conclusions/Significance: This study contributes to a better understanding of the emergence of vvIBDV strains, describing molecular epidemiology of IBDV using the state-of-the-art methodology concerning phylogenetic reconstruction. This study also revealed the presence of a novel natural reassorted strain as a possible manifestation of change in the genetic structure and stability of the vvIBDV strains. Therefore, it highlights the need to obtain

  17. Prokaryote genome fluidity: toward a system approach of the mobilome.

    PubMed

    Toussaint, Ariane; Chandler, Mick

    2012-01-01

    The importance of horizontal/lateral gene transfer (LGT) in shaping the genomes of prokaryotic organisms has been recognized in recent years as a result of analysis of the increasing number of available genome sequences. LGT is largely due to the transfer and recombination activities of mobile genetic elements (MGEs). Bacterial and archaeal genomes are mosaics of vertically and horizontally transmitted DNA segments. This generates reticulate relationships between members of the prokaryotic world that are better represented by networks than by "classical" phylogenetic trees. In this review we summarize the nature and activities of MGEs, and the problems that presently limit their analysis on a large scale. We propose routes to improve their annotation in the flow of genomic and metagenomic sequences that currently exist and those that become available. We describe network analysis of evolutionary relationships among some MGE categories and sketch out possible developments of this type of approach to get more insight into the role of the mobilome in bacterial adaptation and evolution.

  18. Drivers of bacterial genomes plasticity and roles they play in pathogen virulence, persistence and drug resistance.

    PubMed

    Patel, Seema

    2016-11-01

    Despite the advent of next-generation sequencing (NGS) technologies, sophisticated data analysis and drug development efforts, bacterial drug resistance persists and is escalating in magnitude. To better control the pathogens, a thorough understanding of their genomic architecture and dynamics is vital. The bacterial genome is extremely complex, a mosaic of numerous co-operating and antagonizing components, altruistic and self-interested entities, the behavior of which is predictable and conserved to some extent, yet largely dictated by an array of variables. In this regard, mobile genetic elements (MGE), DNA repair systems, post-segregation killing systems, toxin-antitoxin (TA) systems, restriction-modification (RM) systems and others are dominant agents, and processes such as horizontal gene transfer (HGT), gene redundancy, epigenetics, and phase and antigenic variation shape the genome. By illegitimate recombinations, deletions, insertions, duplications, amplifications, inversions, conversions, translocations, modification of intergenic regions and other alterations, the bacterial genome is modified to tackle stressors such as drugs and host immune effectors. Over the years, thousands of studies have investigated this aspect and a mammoth amount of insight has been accumulated. This review strives to distill the existing information, formulate hypotheses, and suggest directions that might contribute towards improved mitigation of the vicious pathogens. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Discovery of novel bacterial toxins by genomics and computational biology.

    PubMed

    Doxey, Andrew C; Mansfield, Michael J; Montecucco, Cesare

    2018-06-01

    Hundreds and hundreds of bacterial protein toxins are presently known. Traditionally, toxin identification begins with pathological studies of bacterial infectious disease. Following identification and cultivation of a bacterial pathogen, the protein toxin is purified from the culture medium and its pathogenic activity is studied using the methods of biochemistry and structural biology, cell biology, tissue and organ biology, and appropriate animal models, supplemented by bioimaging techniques. The ongoing and explosive development of high-throughput DNA sequencing and bioinformatic approaches has set in motion a revolution in many fields of biology, including microbiology. One consequence is that genes encoding novel bacterial toxins can be identified by bioinformatic and computational methods based on previous knowledge accumulated from studies of the biology and pathology of thousands of known bacterial protein toxins. Starting from the paradigmatic cases of diphtheria toxin, tetanus and botulinum neurotoxins, this review discusses traditional experimental approaches as well as bioinformatics and genomics-driven approaches that facilitate the discovery of novel bacterial toxins. We discuss recent work on the identification of novel botulinum-like toxins from genera such as Weissella, Chryseobacterium, and Enterococcus, and the implications of these computationally identified toxins in the field. Finally, we discuss the promise of metagenomics in the discovery of novel toxins and their ecological niches, and present data suggesting the existence of uncharacterized, botulinum-like toxin genes in insect gut metagenomes. Copyright © 2018. Published by Elsevier Ltd.

  20. Transforming clinical microbiology with bacterial genome sequencing.

    PubMed

    Didelot, Xavier; Bowden, Rory; Wilson, Daniel J; Peto, Tim E A; Crook, Derrick W

    2012-09-01

    Whole-genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here, we review the current status of clinical microbiology and how it has already begun to be transformed by using next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties, such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. We predict that the application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow.

  1. Transforming clinical microbiology with bacterial genome sequencing

    PubMed Central

    2016-01-01

    Whole genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here we review the current status of clinical microbiology and how it has already begun to be transformed by the use of next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. The application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow. PMID:22868263

  2. Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations

    DOE PAGES

    Bendall, Matthew L.; Stevens, Sarah L.R.; Chan, Leong-Keat; ...

    2016-01-08

    Multiple models describe the formation and evolution of distinct microbial phylogenetic groups. These evolutionary models make different predictions regarding how adaptive alleles spread through populations and how genetic diversity is maintained. Processes predicted by competing evolutionary models, for example, genome-wide selective sweeps vs gene-specific sweeps, could be captured in natural populations using time-series metagenomics if the approach were applied over a sufficiently long time frame. Direct observations of either process would help resolve how distinct microbial groups evolve. Using a 9-year metagenomic study of a freshwater lake (2005–2013), we explore changes in single-nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in 30 bacterial populations. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied by >1000-fold among populations. SNP allele frequencies also changed dramatically over time within some populations. Interestingly, nearly all SNP variants were slowly purged over several years from one population of green sulfur bacteria, while at the same time multiple genes either swept through or were lost from this population. Furthermore, these patterns were consistent with a genome-wide selective sweep in progress, a process predicted by the ‘ecotype model’ of speciation but not previously observed in nature. In contrast, other populations contained large, SNP-free genomic regions that appear to have swept independently through the populations prior to the study without purging diversity elsewhere in the genome. Finally, evidence for both genome-wide and gene-specific sweeps suggests that different models of bacterial speciation may apply to different populations coexisting in the same environment.

  3. MAGNAMWAR: an R package for genome-wide association studies of bacterial orthologs.

    PubMed

    Sexton, Corinne E; Smith, Hayden Z; Newell, Peter D; Douglas, Angela E; Chaston, John M

    2018-06-01

    Here we report on an R package for genome-wide association studies of orthologous genes in bacteria. Before using the software, orthologs from bacterial genomes or metagenomes are defined using local or online implementations of OrthoMCL. These presence-absence patterns are statistically associated with variation in user-collected phenotypes using the Mono-Associated GNotobiotic Animals Metagenome-Wide Association R package (MAGNAMWAR). Genotype-phenotype associations can be performed with several different statistical tests based on the type and distribution of the data. MAGNAMWAR is available on CRAN.

  4. The CRISPR-Cas system - from bacterial immunity to genome engineering.

    PubMed

    Czarnek, Maria; Bereta, Joanna

    2016-09-01

    Precise and efficient genome modifications are of great value in attempts to comprehend the roles of particular genes and other genetic elements in biological processes as well as in various pathologies. In recent years, novel methods of genome modification known as genome editing, which utilize so-called "programmable" nucleases, came into use. A true revolution in genome editing has been brought about by the introduction of the CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) system, in which one such nuclease, Cas9, plays a major role. This system is based on the elements of the bacterial and archaeal mechanism responsible for acquired immunity against phage infections and transfer of foreign genetic material. Microorganisms incorporate fragments of foreign DNA into CRISPR loci present in their genomes, which enables fast recognition and elimination of future infections. There are several types of CRISPR-Cas systems among prokaryotes but only elements of CRISPR type II are employed in genome engineering. CRISPR-Cas type II utilizes small RNA molecules (crRNA and tracrRNA) to precisely direct the effector nuclease, Cas9, to a specific site in the genome, i.e. to the sequence complementary to crRNA. Cas9 may be used to: (i) introduce stable changes into genomes, e.g. in the process of generation of knock-out and knock-in animals and cell lines, (ii) activate or silence the expression of a gene of interest, and (iii) visualize specific sites in genomes of living cells. The CRISPR-Cas-based tools have been successfully employed for generation of animal and cell models of a number of diseases, e.g. specific types of cancer. In the future, genome editing by programmable nucleases may find wide application in medicine, e.g. in the therapies of certain diseases of genetic origin and in the therapy of HIV-infected patients.

  5. [Plasticity of bacterial genomes: pathogenicity islands and the locus of enterocyte effacement (LEE)].

    PubMed

    Kirsch, Petra; Jores, Jörg; Wieler, Lothar H

    2004-01-01

    Many bacterial virulence attributes, such as toxins, adhesins, invasins and iron uptake systems, are encoded within specific regions of the bacterial genome. These regions, which vary in size, are termed pathogenicity islands (PAIs) since they confer pathogenic properties to the respective micro-organism. By definition, PAIs are exclusively found in pathogenic strains and are often inserted near transfer-RNA genes. Nevertheless, non-pathogenic bacteria also possess foreign DNA elements that confer advantageous features, leading to improved fitness. These additional DNA elements as well as PAIs are termed genomic islands and were acquired during bacterial evolution. Significant G+C content deviation in pathogenicity islands with respect to the rest of the genome, the presence of direct repeat sequences at the flanking regions, the presence of integrase gene determinants and other mobility features, the particular insertion site (tRNA gene) as well as the observed genetic instability suggest that pathogenicity islands were acquired by horizontal gene transfer. PAIs are the fascinating proof of the plasticity of bacterial genomes. PAIs were originally described in human pathogenic Escherichia (E.) coli strains. In the meantime PAIs have been found in various pathogenic bacteria of humans, animals and even plants. The Locus of Enterocyte Effacement (LEE) is one particular widely distributed PAI of E. coli. In addition, it also confers pathogenicity to the related species Citrobacter (C.) rodentium and Escherichia (E.) alvei. The LEE is an important virulence feature of several animal pathogens. It is an obligate PAI of all animal and human enteropathogenic E. coli (EPEC), and most enterohaemorrhagic E. coli (EHEC) also harbor the LEE. The LEE encodes a type III secretion system, an adhesin (intimin) that mediates the intimate contact between the bacterium and the epithelial cell, as well as various proteins which are secreted via the type III secretion system. The LEE encoded

  6. Pre_GI: a global map of ontological links between horizontally transferred genomic islands in bacterial and archaeal genomes

    PubMed Central

    Pierneef, Rian; Cronje, Louis; Bezuidt, Oliver; Reva, Oleg N.

    2015-01-01

    The Predicted Genomic Islands database (Pre_GI) is a comprehensive repository of prokaryotic genomic islands (islands, GIs) freely accessible at http://pregi.bi.up.ac.za/index.php. Pre_GI, Version 2015, catalogues 26 744 islands identified in 2407 bacterial/archaeal chromosomes and plasmids. It provides an easy-to-use interface which allows users to query the database with a variety of fields, parameters and associations. Pre_GI is constructed to be a web-resource for the analysis of ontological roads between islands and cartographic analysis of the global fluxes of mobile genetic elements through bacterial and archaeal taxonomic borders. Comparison of newly identified islands against Pre_GI presents an alternative avenue to identify their ontology, origin and relative time of acquisition. Pre_GI aims to aid research on horizontal transfer events and materials through providing data and tools for holistic investigation of migration of genes through ecological niches and taxonomic boundaries. Database URL: http://pregi.bi.up.ac.za/index.php, Version 2015. PMID:26200753

  7. Essentiality, conservation, evolutionary pressure and codon bias in bacterial genomes.

    PubMed

    Dilucca, Maddalena; Cimini, Giulio; Giansanti, Andrea

    2018-07-15

    Essential genes constitute the core of genes which cannot be mutated too much nor lost along the evolutionary history of a species. Natural selection is expected to be stricter on essential genes and on conserved (highly shared) genes, than on genes that are either nonessential or peculiar to a single or a few species. In order to further assess this expectation, we study here how essentiality of a gene is connected with its degree of conservation among several unrelated bacterial species, each one characterised by its own codon usage bias. Confirming previous results on E. coli, we show the existence of a universal exponential relation between gene essentiality and conservation in bacteria. Moreover, we show that, within each bacterial genome, there are at least two groups of functionally distinct genes, characterised by different levels of conservation and codon bias: i) a core of essential genes, mainly related to cellular information processing; ii) a set of less conserved nonessential genes with prevalent functions related to metabolism. In particular, the genes in the first group are more retained among species, are subject to a stronger purifying conservative selection and display a more limited repertoire of synonymous codons. The core of essential genes is close to the minimal bacterial genome, which is in the focus of recent studies in synthetic biology, though we confirm that orthologs of genes that are essential in one species are not necessarily essential in other species. We also list a set of highly shared genes which, reasonably, could constitute a reservoir of targets for new anti-microbial drugs. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. The Mitochondrial Genome and a 60-kb Nuclear DNA Segment from Naegleria fowleri, the Causative Agent of Primary Amoebic Meningoencephalitis

    PubMed Central

    Herman, Emily K.; Greninger, Alexander L.; Visvesvara, Govinda S.; Marciano-Cabral, Francine; Dacks, Joel B.; Chiu, Charles Y.

    2013-01-01

    Naegleria fowleri is a unicellular eukaryote causing primary amoebic meningoencephalitis, a neuropathic disease killing 99% of those infected, usually within 7–14 days. N. fowleri is found globally in regions including the US and Australia. The genome of the related non-pathogenic species Naegleria gruberi has been sequenced, but the genetic basis for N. fowleri pathogenicity is unclear. To generate such insight, we sequenced and assembled the mitochondrial genome and a 60-kb segment of nuclear genome from N. fowleri. The mitochondrial genome is highly similar to its counterpart in N. gruberi in gene complement and organization, while a distinct lack of synteny is observed for the nuclear segments. Even in this short (60-kb) segment, we identified examples of potential factors for pathogenesis, including ten novel N. fowleri-specific genes. We also identified a homologue of cathepsin B, a protease of a kind proposed to be involved in the pathogenesis of diverse eukaryotic pathogens, including N. fowleri. Finally, we demonstrate a likely case of horizontal gene transfer between N. fowleri and two unrelated amoebae, one of which causes granulomatous amoebic encephalitis. This initial look into the N. fowleri nuclear genome has revealed several examples of potential pathogenesis factors, improving our understanding of a neglected pathogen of increasing global importance. PMID:23360210

  9. Programmable Removal of Bacterial Strains by Use of Genome-Targeting CRISPR-Cas Systems

    PubMed Central

    Gomaa, Ahmed A.; Klumpe, Heidi E.; Luo, Michelle L.; Selle, Kurt; Barrangou, Rodolphe; Beisel, Chase L.

    2014-01-01

    CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) systems in bacteria and archaea employ CRISPR RNAs to specifically recognize the complementary DNA of foreign invaders, leading to sequence-specific cleavage or degradation of the target DNA. Recent work has shown that the accidental or intentional targeting of the bacterial genome is cytotoxic and can lead to cell death. Here, we have demonstrated that genome targeting with CRISPR-Cas systems can be employed for the sequence-specific and titratable removal of individual bacterial strains and species. Using the type I-E CRISPR-Cas system in Escherichia coli as a model, we found that this effect could be elicited using native or imported systems and was similarly potent regardless of the genomic location, strand, or transcriptional activity of the target sequence. Furthermore, the specificity of targeting with CRISPR RNAs could readily distinguish between even highly similar strains in pure or mixed cultures. Finally, varying the collection of delivered CRISPR RNAs could quantitatively control the relative number of individual strains within a mixed culture. Critically, the observed selectivity and programmability of bacterial removal would be virtually impossible with traditional antibiotics, bacteriophages, selectable markers, or tailored growth conditions. Once delivery challenges are addressed, we envision that this approach could offer a novel means to quantitatively control the composition of environmental and industrial microbial consortia and may open new avenues for the development of “smart” antibiotics that circumvent multidrug resistance and differentiate between pathogenic and beneficial microorganisms. PMID:24473129

  10. Defense Islands in Bacterial and Archaeal Genomes and Prediction of Novel Defense Systems

    PubMed Central

    Makarova, Kira S.; Wolf, Yuri I.; Snir, Sagi; Koonin, Eugene V.

    2011-01-01

    The arms race between cellular life forms and viruses is a major driving force of evolution. A substantial fraction of bacterial and archaeal genomes is dedicated to antivirus defense. We analyzed the distribution of defense genes and typical mobilome components (such as viral and transposon genes) in bacterial and archaeal genomes and demonstrated statistically significant clustering of antivirus defense systems and mobile genes and elements in genomic islands. The defense islands are enriched in putative operons and contain numerous overrepresented gene families. A detailed sequence analysis of the proteins encoded by genes in these families shows that many of them are diverged variants of known defense system components, whereas others show features, such as characteristic operonic organization, that are suggestive of novel defense systems. Thus, genomic islands provide abundant material for the experimental study of bacterial and archaeal antivirus defense. Except for the CRISPR-Cas systems, different classes of defense systems, in particular toxin-antitoxin and restriction-modification systems, show nonrandom clustering in defense islands. It remains unclear to what extent these associations reflect functional cooperation between different defense systems and to what extent the islands are genomic “sinks” that accumulate diverse nonessential genes, particularly those acquired via horizontal gene transfer. The characteristics of defense islands resemble those of mobilome islands. Defense and mobilome genes are nonrandomly associated in islands, suggesting nonadaptive evolution of the islands via a preferential attachment-like mechanism underpinned by the addictive properties of defense systems such as toxins-antitoxins and an important role of horizontal mobility in the evolution of these islands. PMID:21908672

  11. Evaluation of genome-enabled selection for bacterial cold water disease resistance using progeny performance data in Rainbow Trout: Insights on genotyping methods and genomic prediction models

    USDA-ARS's Scientific Manuscript database

    Bacterial cold water disease (BCWD) causes significant economic losses in salmonid aquaculture, and traditional family-based breeding programs aimed at improving BCWD resistance have been limited to exploiting only between-family variation. We used genomic selection (GS) models to predict genomic br...

  12. The Importance of Bacterial Culture to Food Microbiology in the Age of Genomics.

    PubMed

    Gill, Alexander

    2017-01-01

    Culture-based and genomics methods provide different insights into the nature and behavior of bacteria. Maximizing the usefulness of both approaches requires recognizing their limitations and employing them appropriately. Genomic analysis excels at identifying bacteria and establishing the relatedness of isolates. Culture-based methods remain necessary for detection and enumeration, to determine viability, and to validate phenotype predictions made on the basis of genomic analysis. The purpose of this short paper is to discuss the application of culture-based analysis and genomics to the questions food microbiologists routinely need to ask regarding bacteria to ensure the safety of food and its economic production and distribution. To address these issues, appropriate tools are required for the detection and enumeration of specific bacterial populations and the characterization of isolates for identification, phylogenetics, and phenotype prediction.

  13. Both Genome Segments Contribute to the Pathogenicity of Very Virulent Infectious Bursal Disease Virus

    PubMed Central

    Escaffre, Olivier; Le Nouën, Cyril; Amelot, Michel; Ambroggio, Xavier; Ogden, Kristen M.; Guionie, Olivier; Toquin, Didier; Müller, Hermann; Islam, Mohammed R.

    2013-01-01

    Infectious bursal disease virus (IBDV) causes an economically significant disease of chickens worldwide. Very virulent IBDV (vvIBDV) strains have emerged and induce as much as 60% mortality. The molecular basis for vvIBDV pathogenicity is not understood, and the relative contributions of the two genome segments, A and B, to this phenomenon are not known. Isolate 94432 has been shown previously to be genetically related to vvIBDVs but exhibits atypical antigenicity and does not cause mortality. Here the full-length genome of 94432 was determined, and a reverse genetics system was established. The molecular clone was rescued and exhibited the same antigenicity and reduced pathogenicity as isolate 94432. Genetically modified viruses derived from 94432, whose vvIBDV consensus nucleotide sequence was restored in segment A and/or B, were produced, and their pathogenicity was assessed in specific-pathogen-free chickens. We found that a valine (position 321) that modifies the most exposed part of the capsid protein VP2 critically modified the antigenicity and partially reduced the pathogenicity of 94432. However, a threonine (position 276) located in the finger domain of the virus polymerase (VP1) contributed even more significantly to attenuation. This threonine is partially exposed in a hydrophobic groove on the VP1 surface, suggesting possible interactions between VP1 and another, as yet unidentified molecule at this amino acid position. The restored vvIBDV-like pathogenicity was associated with increased replication and lesions in the thymus and spleen. These results demonstrate that both genome segments influence vvIBDV pathogenicity and may provide new targets for the attenuation of vvIBDVs. PMID:23269788

  14. Genomes of the T4-related bacteriophages as windows on microbial genome evolution.

    PubMed

    Petrov, Vasiliy M; Ratnayaka, Swarnamala; Nolan, James M; Miller, Eric S; Karam, Jim D

    2010-10-28

    The T4-related bacteriophages are a group of bacterial viruses that share morphological similarities and genetic homologies with the well-studied Escherichia coli phage T4, but that diverge from T4 and each other by a number of genetically determined characteristics including the bacterial hosts they infect, the sizes of their linear double-stranded (ds) DNA genomes and the predicted compositions of their proteomes. The genomes of about 40 of these phages have been sequenced and annotated over the last several years and are compared here in the context of the factors that have determined their diversity and the diversity of other microbial genomes in evolution. The genomes of the T4 relatives analyzed so far range in size between ~160,000 and ~250,000 base pairs (bp) and are mosaics of one another, consisting of clusters of homology between them that are interspersed with segments that vary considerably in genetic composition between the different phage lineages. Based on the known biological and biochemical properties of phage T4 and the proteins encoded by the T4 genome, the T4 relatives reviewed here are predicted to share a genetic core, or "Core Genome" that determines the structural design of their dsDNA chromosomes, their distinctive morphology and the process of their assembly into infectious agents (phage morphogenesis). The Core Genome appears to be the most ancient genetic component of this phage group and constitutes a mere 12-15% of the total protein encoding potential of the typical T4-related phage genome. The high degree of genetic heterogeneity that exists outside of this shared core suggests that horizontal DNA transfer involving many genetic sources has played a major role in diversification of the T4-related phages and their spread to a wide spectrum of bacterial species domains in evolution. We discuss some of the factors and pathways that might have shaped the evolution of these phages and point out several parallels between their diversity

  15. Genomes of the T4-related bacteriophages as windows on microbial genome evolution

    PubMed Central

    2010-01-01

    The T4-related bacteriophages are a group of bacterial viruses that share morphological similarities and genetic homologies with the well-studied Escherichia coli phage T4, but that diverge from T4 and each other by a number of genetically determined characteristics including the bacterial hosts they infect, the sizes of their linear double-stranded (ds) DNA genomes and the predicted compositions of their proteomes. The genomes of about 40 of these phages have been sequenced and annotated over the last several years and are compared here in the context of the factors that have determined their diversity and the diversity of other microbial genomes in evolution. The genomes of the T4 relatives analyzed so far range in size between ~160,000 and ~250,000 base pairs (bp) and are mosaics of one another, consisting of clusters of homology between them that are interspersed with segments that vary considerably in genetic composition between the different phage lineages. Based on the known biological and biochemical properties of phage T4 and the proteins encoded by the T4 genome, the T4 relatives reviewed here are predicted to share a genetic core, or "Core Genome" that determines the structural design of their dsDNA chromosomes, their distinctive morphology and the process of their assembly into infectious agents (phage morphogenesis). The Core Genome appears to be the most ancient genetic component of this phage group and constitutes a mere 12-15% of the total protein encoding potential of the typical T4-related phage genome. The high degree of genetic heterogeneity that exists outside of this shared core suggests that horizontal DNA transfer involving many genetic sources has played a major role in diversification of the T4-related phages and their spread to a wide spectrum of bacterial species domains in evolution. We discuss some of the factors and pathways that might have shaped the evolution of these phages and point out several parallels between their diversity

  16. Segment-Wise Genome-Wide Association Analysis Identifies a Candidate Region Associated with Schizophrenia in Three Independent Samples

    PubMed Central

    Rietschel, Marcella; Mattheisen, Manuel; Breuer, René; Schulze, Thomas G.; Nöthen, Markus M.; Levinson, Douglas; Shi, Jianxin; Gejman, Pablo V.; Cichon, Sven; Ophoff, Roel A.

    2012-01-01

    Recent studies suggest that variation in complex disorders (e.g., schizophrenia) is explained by a large number of genetic variants with small effect size (Odds Ratio∼1.05–1.1). The statistical power to detect these genetic variants in Genome Wide Association (GWA) studies with large numbers of cases and controls (∼15,000) is still low. As it will be difficult to further increase sample size, we decided to explore an alternative method for analyzing GWA data in a study of schizophrenia, dramatically reducing the number of statistical tests. The underlying hypothesis was that at least some of the genetic variants related to a common outcome are collocated in segments of chromosomes at a wider scale than single genes. Our approach was therefore to study the association between relatively large segments of DNA and disease status. An association test was performed for each SNP and the number of nominally significant tests in a segment was counted. We then performed a permutation-based binomial test to determine whether this region contained significantly more nominally significant SNPs than expected under the null hypothesis of no association, taking linkage into account. Genome Wide Association data of three independent schizophrenia case/control cohorts with European ancestry (Dutch, German, and US) using segments of DNA with variable length (2 to 32 Mbp) was analyzed. Using this approach we identified a region at chromosome 5q23.3-q31.3 (128–160 Mbp) that was significantly enriched with nominally associated SNPs in three independent case-control samples. We conclude that considering relatively wide segments of chromosomes may reveal reliable relationships between the genome and schizophrenia, suggesting novel methodological possibilities as well as raising theoretical questions. PMID:22723893
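
    The segment-wise counting step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the 0.05 threshold are assumptions made for illustration, and the plain binomial tail used here treats SNPs as independent, whereas the study used a permutation-based test that takes linkage into account.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def segment_enrichment(p_values, alpha=0.05):
    """Count nominally significant SNPs in one segment.

    p_values: per-SNP association p-values for the segment (hypothetical input).
    alpha:    nominal significance threshold (assumed value, not from the abstract).
    Returns (count, tail probability under an independence null).
    """
    n = len(p_values)
    k = sum(1 for p in p_values if p < alpha)
    # Simplification: assumes independent SNPs; a permutation scheme would be
    # needed to account for linkage as in the study.
    return k, binomial_tail(k, n, alpha)


if __name__ == "__main__":
    count, tail = segment_enrichment([0.01, 0.20, 0.03, 0.50])
    print(count, tail)
```

    A small tail probability flags a segment holding more nominally significant SNPs than chance alone would suggest under this simplified null.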

  17. Merging chemical ecology with bacterial genome mining for secondary metabolite discovery.

    PubMed

    Vizcaino, Maria I; Guo, Xun; Crawford, Jason M

    2014-02-01

    The integration of chemical ecology and bacterial genome mining can enhance the discovery of structurally diverse natural products in functional contexts. By examining bacterial secondary metabolism in the framework of its ecological niche, insights into the upregulation of orphan biosynthetic pathways and the enhancement of the enzyme substrate supply can be obtained, leading to the discovery of new secondary metabolic pathways that would otherwise be silent or undetected under typical laboratory cultivation conditions. Access to these new natural products (i.e., the chemotypes) facilitates experimental genotype-to-phenotype linkages. Here, we describe certain functional natural products produced by Xenorhabdus and Photorhabdus bacteria with experimentally linked biosynthetic gene clusters as illustrative examples of the synergy between chemical ecology and bacterial genome mining in connecting genotypes to phenotypes through chemotype characterization. These Gammaproteobacteria share a mutualistic relationship with nematodes and a pathogenic relationship with insects and, in select cases, humans. The natural products encoded by these bacteria distinguish their interactions with their animal hosts and other microorganisms in their multipartite symbiotic lifestyles. Though both genera have similar lifestyles, their genetic, chemical, and physiological attributes are distinct. Both undergo phenotypic variation and produce a profuse number of bioactive secondary metabolites. We provide further detail in the context of regulation, production, processing, and function for these genetically encoded small molecules with respect to their roles in mutualism and pathogenicity. These collective insights more widely promote the discovery of atypical orphan biosynthetic pathways encoding novel small molecules in symbiotic systems, which could open up new avenues for investigating and exploiting microbial chemical signaling in host-bacteria interactions.

  18. Complete Genomic Sequence and Comparative Analysis of the Genome Segments of Sweet Potato Chlorotic Stunt Virus in China

    PubMed Central

    Qin, Yanhong; Wang, Li; Zhang, Zhenchen; Qiao, Qi; Zhang, Desheng; Tian, Yuting; Wang, Shuang; Wang, Yongjiang; Yan, Zhaoling

    2014-01-01

    Background: Sweet potato chlorotic stunt virus (family Closteroviridae, genus Crinivirus) features a large bipartite, single-stranded, positive-sense RNA genome. To date, only three complete genomic sequences of SPCSV can be accessed through GenBank. SPCSV was first detected in China in 2011, and only partial genomic sequences have been determined in the country. No report on the complete genomic sequence and genome structure of Chinese SPCSV isolates or the genetic relation between isolates from China and other countries is available. Methodology/Principal Findings: The complete genomic sequences of five isolates from different areas in China were characterized. This study is the first to report the complete genome sequences of SPCSV from whitefly vectors. Genome structure analysis showed that isolates of WA and EA strains from China have the same coding protein as isolates Can181-9 and m2-47, respectively. Twenty cp genes and four RNA1 partial segments were sequenced and analyzed, and the nucleotide identities of complete genomic, cp, and RNA1 partial sequences were determined. Results indicated high conservation among strains and significant differences between WA and EA strains. Genetic analysis demonstrated that, except for isolates from Guangdong Province, SPCSVs from other areas belong to the WA strain. Genome organization analysis showed that the isolates in this study lack the p22 gene. Conclusions/Significance: We presented the complete genome sequences of SPCSV in China. Comparison of nucleotide identities and genome structures between these isolates and previously reported isolates showed slight differences. The nucleotide identities of different SPCSV isolates showed high conservation among strains and significant differences between strains. All nine isolates in this study lacked the p22 gene. WA strains were more extensively distributed than EA strains in China. These data provide important insights into the molecular variation and genomic structure of SPCSV

  19. Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies.

    PubMed

    Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Bertozzi, Terry; Adelson, David L

    2018-01-01

    Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package.
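
    The single linkage clustering step described above can be illustrated with a minimal sketch. It assumes the pairwise alignments have already been reduced to pairs of sequence indices; the function name and the input format are hypothetical and are not taken from CARP or krishna. Under single linkage, two sequences fall in the same family whenever a chain of pairwise alignments connects them, which is the same as finding connected components with a union-find structure.

```python
def single_linkage_families(n_sequences, aligned_pairs):
    """Group sequence indices into families.

    Two sequences share a family if a chain of pairwise alignments links
    them (single linkage = connected components of the alignment graph).
    `aligned_pairs` is a hypothetical input: an iterable of (i, j) index pairs.
    """
    parent = list(range(n_sequences))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in aligned_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = {}
    for i in range(n_sequences):
        families.setdefault(find(i), []).append(i)
    return sorted(families.values())


# Sequences 0-1-2 are chained by alignments, 3-4 align, 5 aligns to nothing.
print(single_linkage_families(6, [(0, 1), (1, 2), (3, 4)]))
```

    A sequence with no alignments ends up as a family of its own; a real pipeline would presumably discard such singletons before building consensus sequences.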

  20. The mitochondrial genome and a 60-kb nuclear DNA segment from Naegleria fowleri, the causative agent of primary amoebic meningoencephalitis.

    PubMed

    Herman, Emily K; Greninger, Alexander L; Visvesvara, Govinda S; Marciano-Cabral, Francine; Dacks, Joel B; Chiu, Charles Y

    2013-01-01

    Naegleria fowleri is a unicellular eukaryote causing primary amoebic meningoencephalitis, a neuropathic disease killing 99% of those infected, usually within 7-14 days. Naegleria fowleri is found globally in regions including the US and Australia. The genome of the related nonpathogenic species Naegleria gruberi has been sequenced, but the genetic basis for N. fowleri pathogenicity is unclear. To generate such insight, we sequenced and assembled the mitochondrial genome and a 60-kb segment of nuclear genome from N. fowleri. The mitochondrial genome is highly similar to its counterpart in N. gruberi in gene complement and organization, while a distinct lack of synteny is observed for the nuclear segments. Even in this short (60-kb) segment, we identified examples of potential factors for pathogenesis, including ten novel N. fowleri-specific genes. We also identified a homolog of cathepsin B, a protease proposed to be involved in the pathogenesis of diverse eukaryotic pathogens, including N. fowleri. Finally, we demonstrate a likely case of horizontal gene transfer between N. fowleri and two unrelated amoebae, one of which causes granulomatous amoebic encephalitis. This initial look into the N. fowleri nuclear genome has revealed several examples of potential pathogenesis factors, improving our understanding of a neglected pathogen of increasing global importance. © 2013 The Author(s) Journal of Eukaryotic Microbiology © 2013 International Society of Protistologists.

  1. Non-canonical ribosomal DNA segments in the human genome, and nucleoli functioning.

    PubMed

    Kupriyanova, Natalia S; Netchvolodov, Kirill K; Sadova, Anastasia A; Cherepanova, Marina D; Ryskov, Alexei P

    2015-11-10

    Ribosomal DNA (rDNA) in the human genome is represented by tandem repeats of 43 kb nucleotide sequences that form nucleoli organizers (NORs) on each of five pairs of acrocentric chromosomes. rDNA-similar segments of different lengths are also present on NOR⁻ chromosomes. Many of these segments contain nucleotide substitutions, supplementary microsatellite clusters, and extended deletions. Recently, it was shown that, in addition to ribosome biogenesis, nucleoli exhibit additional functions, such as cell-cycle regulation and response to stresses. In particular, several stress-inducible loci located in the ribosomal intergenic spacer (rIGS) produce stimuli-specific noncoding nucleolus RNAs. By mapping the 5'/3' ends of the rIGS segments scattered throughout NOR⁻ chromosomes, we discovered that the bonds in the rIGS that were most often susceptible to disruption were adjacent to, or overlapped with, stimuli-specific inducible loci. This suggests an interconnection between the two phenomena: nucleoli functioning and the scattering of rDNA-like sequences on NOR⁻ chromosomes. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Whole Genome Sequence Analysis of Pig Respiratory Bacterial Pathogens with Elevated Minimum Inhibitory Concentrations for Macrolides.

    PubMed

    Dayao, Denise Ann Estarez; Seddon, Jennifer M; Gibson, Justine S; Blackall, Patrick J; Turni, Conny

    2016-10-01

    Macrolides are often used to treat and control bacterial pathogens causing respiratory disease in pigs. This study analyzed the whole genome sequences of one clinical isolate of Actinobacillus pleuropneumoniae, Haemophilus parasuis, Pasteurella multocida, and Bordetella bronchiseptica, all isolated from Australian pigs to identify the mechanism underlying the elevated minimum inhibitory concentrations (MICs) for erythromycin, tilmicosin, or tulathromycin. The H. parasuis assembled genome had a nucleotide transition at position 2059 (A to G) in the six copies of the 23S rRNA gene. This mutation has previously been associated with macrolide resistance but this is the first reported mechanism associated with elevated macrolide MICs in H. parasuis. There was no known macrolide resistance mechanism identified in the other three bacterial genomes. However, strA and sul2, aminoglycoside and sulfonamide resistance genes, respectively, were detected in one contiguous sequence (contig 1) of A. pleuropneumoniae assembled genome. This contig was identical to plasmids previously identified in Pasteurellaceae. This study has provided one possible explanation of elevated MICs to macrolides in H. parasuis. Further studies are necessary to clarify the mechanism causing the unexplained macrolide resistance in other Australian pig respiratory pathogens including the role of efflux systems, which were detected in all analyzed genomes.

  3. Operon-mapper: A Web Server for Precise Operon Identification in Bacterial and Archaeal Genomes.

    PubMed

    Taboada, Blanca; Estrada, Karel; Ciria, Ricardo; Merino, Enrique

    2018-06-19

    Operon-mapper is a web server that accurately, easily, and directly predicts the operons of any bacterial or archaeal genome sequence. The operon predictions are based on the intergenic distance of neighboring genes as well as the functional relationships of their protein-coding products. To this end, Operon-mapper finds all the ORFs within a given nucleotide sequence, along with their genomic coordinates, orthology groups, and functional relationships. We believe that Operon-mapper, due to its accuracy, simplicity and speed, as well as the relevant information that it generates, will be a useful tool for annotating and characterizing genomic sequences. http://biocomputo.ibt.unam.mx/operon_mapper/.
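
    The intergenic-distance criterion mentioned above can be sketched as follows. This is only an illustration of the general idea, not Operon-mapper's actual algorithm: the 50 bp threshold, the same-strand requirement, the tuple layout, and the function name are all assumptions, and the functional-relationship scoring described in the abstract is omitted entirely.

```python
def group_by_intergenic_distance(genes, max_gap=50):
    """Group neighboring genes into candidate operons.

    `genes` is a list of (name, start, end, strand) tuples with 1-based
    inclusive coordinates. Adjacent genes are placed in the same group when
    they lie on the same strand and the gap between them is at most `max_gap`
    bases (a hypothetical threshold). Overlapping genes have a negative gap
    and are therefore grouped as well.
    """
    ordered = sorted(genes, key=lambda g: g[1])  # sort by start coordinate
    groups = []
    for gene in ordered:
        if groups:
            previous = groups[-1][-1]
            gap = gene[1] - previous[2] - 1  # bases between the two genes
            if gene[3] == previous[3] and gap <= max_gap:
                groups[-1].append(gene)
                continue
        groups.append([gene])  # start a new candidate operon
    return [[g[0] for g in group] for group in groups]


example = [
    ("a", 1, 300, "+"),
    ("b", 310, 900, "+"),    # 9 bp after a, same strand
    ("c", 1200, 1500, "+"),  # 299 bp after b
    ("d", 1510, 1800, "-"),  # close to c but on the opposite strand
]
print(group_by_intergenic_distance(example))
```

    Distance alone is a coarse signal, which is presumably why the server combines it with functional relationships between the encoded proteins.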

  4. Bacterial genospecies that are not ecologically coherent: population genomics of Rhizobium leguminosarum

    PubMed Central

    Kumar, Nitin; Lad, Ganesh; Giuntini, Elisa; Kaye, Maria E.; Udomwong, Piyachat; Shamsani, N. Jannah; Young, J. Peter W.; Bailly, Xavier

    2015-01-01

    Biological species may remain distinct because of genetic isolation or ecological adaptation, but these two aspects do not always coincide. To establish the nature of the species boundary within a local bacterial population, we characterized a sympatric population of the bacterium Rhizobium leguminosarum by genomic sequencing of 72 isolates. Although all strains have 16S rRNA typical of R. leguminosarum, they fall into five genospecies by the criterion of average nucleotide identity (ANI). Many genes, on plasmids as well as the chromosome, support this division: recombination of core genes has been largely within genospecies. Nevertheless, variation in ecological properties, including symbiotic host range and carbon-source utilization, cuts across these genospecies, so that none of these phenotypes is diagnostic of genospecies. This phenotypic variation is conferred by mobile genes. The genospecies meet the Mayr criteria for biological species in respect of their core genes, but do not correspond to coherent ecological groups, so periodic selection may not be effective in purging variation within them. The population structure is incompatible with traditional ‘polyphasic taxonomy’ that requires bacterial species to have both phylogenetic coherence and distinctive phenotypes. More generally, genomics has revealed that many bacterial species share adaptive modules by horizontal gene transfer, and we envisage a more consistent taxonomic framework that explicitly recognizes this. Significant phenotypes should be recognized as ‘biovars’ within species that are defined by core gene phylogeny. PMID:25589577

  5. Ancient bacterial endosymbionts of insects: Genomes as sources of insight and springboards for inquiry.

    PubMed

    Wernegreen, Jennifer J

    2017-09-15

    Ancient associations between insects and bacteria provide models to study intimate host-microbe interactions. Currently, a wealth of genome sequence data for long-term, obligately intracellular (primary) endosymbionts of insects reveals profound genomic consequences of this specialized bacterial lifestyle. Those consequences include severe genome reduction and extreme base compositions. This minireview highlights the utility of genome sequence data to understand how, and why, endosymbionts have been pushed to such extremes, and to illuminate the functional consequences of such extensive genome change. While the static snapshots provided by individual endosymbiont genomes are valuable, comparative analyses of multiple genomes have shed light on evolutionary mechanisms. Namely, genome comparisons have told us that selection is important in fine-tuning gene content, but at the same time, mutational pressure and genetic drift contribute to genome degradation. Examples from Blochmannia, the primary endosymbiont of the ant tribe Camponotini, illustrate the value and constraints of genome sequence data, and exemplify how genomes can serve as a springboard for further comparative and experimental inquiry. Copyright © 2017. Published by Elsevier Inc.

  6. Genomics-enabled analysis of the emergent disease cotton bacterial blight

    PubMed Central

    Phillips, Anne Z.; Burke, Jillian; Bunn, J. Imani; Allen, Tom W.; Wheeler, Terry

    2017-01-01

    Cotton bacterial blight (CBB), an important disease of cotton (Gossypium hirsutum) in the early 20th century, had been controlled by resistant germplasm for over half a century. Recently, CBB re-emerged as an agronomic problem in the United States. Here, we report analysis of cotton variety planting statistics that indicate a steady increase in the percentage of susceptible cotton varieties grown each year since 2009. Phylogenetic analysis revealed that strains from the current outbreak cluster with race 18 Xanthomonas citri pv. malvacearum (Xcm) strains. Illumina-based draft genomes were generated for thirteen Xcm isolates and analyzed along with 4 previously published Xcm genomes. These genomes encode 24 conserved and nine variable type three effectors. Strains in the race 18 clade contain 3 to 5 more effectors than other Xcm strains. SMRT sequencing of two geographically and temporally diverse strains of Xcm yielded circular chromosomes and accompanying plasmids. These genomes encode eight and thirteen distinct transcription activator-like effector genes. RNA-sequencing revealed 52 genes induced within two cotton cultivars by both tested Xcm strains. This gene list includes a homeologous pair of genes with homology to the known susceptibility gene, MLO. In contrast, the two strains of Xcm induce different clade III SWEET sugar transporters. Subsequent genome-wide analysis revealed patterns in the overall expression of homeologous gene pairs in cotton after inoculation by Xcm. These data reveal important insights into the Xcm-G. hirsutum disease complex and strategies for future development of resistant cultivars. PMID:28910288

  7. BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data

    PubMed Central

    Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Pareja, Eduardo; Tobes, Raquel

    2012-01-01

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even in the step of preliminary assembly of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating ORF prediction and functional annotation phases in just one step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation which is frequently found in not fully assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak, in which BG7 was used to produce the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been proved for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Besides, thanks to its plasticity, our system could be very easily adapted to work with new technologies in the future. PMID:23185310

  8. Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies

    PubMed Central

    Zeng, Lu; Kortschak, R. Daniel; Raison, Joy M.

    2018-01-01

    Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package. PMID:29538441

  9. What Makes a Bacterial Species Pathogenic?:Comparative Genomic Analysis of the Genus Leptospira

    PubMed Central

    Fouts, Derrick E.; Matthias, Michael A.; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E.; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L.; Haake, David A.; Haft, Daniel H.; Hartskeerl, Rudy; Ko, Albert I.; Levett, Paul N.; Matsunaga, James; Mechaly, Ariel E.; Monk, Jonathan M.; Nascimento, Ana L. T.; Nelson, Karen E.; Palsson, Bernhard; Peacock, Sharon J.; Picardeau, Mathieu; Ricaldi, Jessica N.; Thaipandungpanit, Janjira; Wunder, Elsio A.; Yang, X. Frank; Zhang, Jun-Jie; Vinetz, Joseph M.

    2016-01-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade’s refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. 
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  10. What Makes a Bacterial Species Pathogenic?:Comparative Genomic Analysis of the Genus Leptospira.

    PubMed

    Fouts, Derrick E; Matthias, Michael A; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L; Haake, David A; Haft, Daniel H; Hartskeerl, Rudy; Ko, Albert I; Levett, Paul N; Matsunaga, James; Mechaly, Ariel E; Monk, Jonathan M; Nascimento, Ana L T; Nelson, Karen E; Palsson, Bernhard; Peacock, Sharon J; Picardeau, Mathieu; Ricaldi, Jessica N; Thaipandungpanit, Janjira; Wunder, Elsio A; Yang, X Frank; Zhang, Jun-Jie; Vinetz, Joseph M

    2016-02-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. 
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  11. A sensitive, support-vector-machine method for the detection of horizontal gene transfers in viral, archaeal and bacterial genomes.

    PubMed

    Tsirigos, Aristotelis; Rigoutsos, Isidore

    2005-01-01

    In earlier work, we introduced and discussed a generalized computational framework for identifying horizontal transfers. This framework relied on a gene's nucleotide composition, obviated the need for knowledge of codon boundaries and database searches, and was shown to perform very well across a wide range of archaeal and bacterial genomes when compared with previously published approaches, such as Codon Adaptation Index and C + G content. Nonetheless, two considerations remained outstanding: we wanted to further increase the sensitivity of detecting horizontal transfers and also to be able to apply the method to increasingly smaller genomes. In the discussion that follows, we present such a method, Wn-SVM, and show that it exhibits a very significant improvement in sensitivity compared with earlier approaches. Wn-SVM uses a one-class support-vector machine and can learn using rather small training sets. This property makes Wn-SVM particularly suitable for studying small-size genomes, similar to those of viruses, as well as the typically larger archaeal and bacterial genomes. We show experimentally that the new method results in a superior performance across a wide range of organisms and that it improves even upon our own earlier method by an average of 10% across all examined genomes. As a small-genome case study, we analyze the genome of the human cytomegalovirus and demonstrate that Wn-SVM correctly identifies regions that are known to be conserved and prototypical of all beta-herpesvirinae, regions that are known to have been acquired horizontally from the human host and, finally, regions that had not up to now been suspected to be horizontally transferred. Atypical region predictions for many eukaryotic viruses, including the alpha-, beta- and gamma-herpesvirinae, and 123 archaeal and bacterial genomes, have been made available online at http://cbcsrv.watson.ibm.com/HGT_SVM/.

  12. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals

    PubMed Central

    Damienikan, Aliaksandr U.

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541

  13. Programmable removal of bacterial strains by use of genome-targeting CRISPR-Cas systems.

    PubMed

    Gomaa, Ahmed A; Klumpe, Heidi E; Luo, Michelle L; Selle, Kurt; Barrangou, Rodolphe; Beisel, Chase L

    2014-01-28

    CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) systems in bacteria and archaea employ CRISPR RNAs to specifically recognize the complementary DNA of foreign invaders, leading to sequence-specific cleavage or degradation of the target DNA. Recent work has shown that the accidental or intentional targeting of the bacterial genome is cytotoxic and can lead to cell death. Here, we have demonstrated that genome targeting with CRISPR-Cas systems can be employed for the sequence-specific and titratable removal of individual bacterial strains and species. Using the type I-E CRISPR-Cas system in Escherichia coli as a model, we found that this effect could be elicited using native or imported systems and was similarly potent regardless of the genomic location, strand, or transcriptional activity of the target sequence. Furthermore, the specificity of targeting with CRISPR RNAs could readily distinguish between even highly similar strains in pure or mixed cultures. Finally, varying the collection of delivered CRISPR RNAs could quantitatively control the relative number of individual strains within a mixed culture. Critically, the observed selectivity and programmability of bacterial removal would be virtually impossible with traditional antibiotics, bacteriophages, selectable markers, or tailored growth conditions. Once delivery challenges are addressed, we envision that this approach could offer a novel means to quantitatively control the composition of environmental and industrial microbial consortia and may open new avenues for the development of "smart" antibiotics that circumvent multidrug resistance and differentiate between pathogenic and beneficial microorganisms. Controlling the composition of microbial populations is a critical aspect in medicine, biotechnology, and environmental cycles. 
While different antimicrobial strategies, such as antibiotics, antimicrobial peptides, and lytic bacteriophages, offer partial solutions

  14. Analysis of bacterial populations in the environment using two-dimensional gel electrophoresis of genomic DNA and complementary DNA.

    PubMed

    Liu, Guo-Hua; Nakamura, Tatsuo; Amemiya, Takashi; Rajendran, Narasimmalu; Itoh, Kiminori

    2011-01-01

    Two-dimensional gel electrophoresis (2-DGE) mapping of genomic DNA and complementary DNA (cDNA) amplicons was attempted to analyze total and active bacterial populations within soil and activated sludge samples. Distinct differences in the number and species of bacterial populations and those that were metabolically active at the time of sampling were visually observed especially for the soil community. Statistical analyses and sequencing based on the 2-DGE data further revealed the relationships between total and active bacterial populations within each community. This high-resolution technique would be useful for obtaining a better understanding of bacterial population structures in the environment.

  15. Construction of a Llama Bacterial Artificial Chromosome Library with Approximately 9-Fold Genome Equivalent Coverage

    PubMed Central

    Airmet, K. W.; Hinckley, J. D.; Tree, L. T.; Moss, M.; Blumell, S.; Ulicny, K.; Gustafson, A. K.; Weed, M.; Theodosis, R.; Lehnardt, M.; Genho, J.; Stevens, M. R.; Kooyman, D. L.

    2012-01-01

    The llama is an important agricultural livestock in much of South America. The llama is increasing in popularity in the United States as a companion animal. Little work has been done to improve llama production using modern technology. A paucity of information is available regarding the llama genome. We report the construction of a llama bacterial artificial chromosome (BAC) library of about 196,224 clones in the vector pECBAC1. Using flow cytometry and bovine, human, mouse, and chicken as controls, we determined the llama genome size to be 2.4 × 10^9 bp. The average insert size of the library is 137.8 kb corresponding to approximately 9-fold genome coverage. Further studies are needed to further characterize the library and llama genome. We anticipate that this new library will help facilitate future genomic studies in the llama. PMID:22811594

  16. Segtor: Rapid Annotation of Genomic Coordinates and Single Nucleotide Variations Using Segment Trees

    PubMed Central

    Renaud, Gabriel; Neves, Pedro; Folador, Edson Luiz; Ferreira, Carlos Gil; Passetti, Fabio

    2011-01-01

    Various research projects often involve determining the relative position of genomic coordinates, intervals, single nucleotide variations (SNVs), insertions, deletions and translocations with respect to genes and their potential impact on protein translation. Due to the tremendous increase in throughput brought by the use of next-generation sequencing, investigators are routinely faced with the need to annotate very large datasets. We present Segtor, a tool to annotate large sets of genomic coordinates, intervals, SNVs, indels and translocations. Our tool uses segment trees built using the start and end coordinates of the genomic features the user wishes to use instead of storing them in a database management system. The software also produces annotation statistics to allow users to visualize how many coordinates were found within various portions of genes. Our system currently can be made to work with any species available on the UCSC Genome Browser. Segtor is a suitable tool for groups, especially those with limited access to programmers or with interest to analyze large amounts of individual genomes, who wish to determine the relative position of very large sets of mapped reads and subsequently annotate observed mutations between the reads and the reference. Segtor (http://lbbc.inca.gov.br/segtor/) is an open-source tool that can be freely downloaded for non-profit use. We also provide a web interface for testing purposes. PMID:22069465
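
    The kind of coordinate lookup described above can be illustrated with a small in-memory index. Note that this sketch deliberately uses a simpler technique than Segtor's segment trees: features are sorted by start coordinate and a binary search (Python's bisect module) narrows the candidates before an end-coordinate filter. The class name, method name, and feature layout are hypothetical, and the query is linear in the worst case, whereas a segment tree answers such queries in logarithmic time plus the number of hits.

```python
import bisect


class IntervalIndex:
    """Toy in-memory index of genomic features (not a true segment tree)."""

    def __init__(self, features):
        # `features`: iterable of (name, start, end), 1-based inclusive.
        self._features = sorted(features, key=lambda f: f[1])
        self._starts = [f[1] for f in self._features]

    def overlapping(self, position):
        """Return names of all features that contain `position`."""
        # Only features starting at or before `position` can contain it.
        upper = bisect.bisect_right(self._starts, position)
        return [
            name
            for name, start, end in self._features[:upper]
            if end >= position
        ]


index = IntervalIndex([
    ("geneA", 100, 500),
    ("geneB", 400, 900),
    ("geneC", 1000, 1200),
])
print(index.overlapping(450))   # position inside both geneA and geneB
print(index.overlapping(950))   # intergenic position
```

    Keeping the index in memory, rather than in a database management system, is the design choice the abstract highlights; the data structure here is only a stand-in for it.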

  17. Statistical Analysis of Hurst Exponents of Essential/Nonessential Genes in 33 Bacterial Genomes

    PubMed Central

    Liu, Xiao; Wang, Baojin; Xu, Luo

    2015-01-01

    Methods for identifying essential genes currently depend predominantly on biochemical experiments. However, there is demand for improved computational methods for determining gene essentiality. In this study, we used the Hurst exponent, a characteristic parameter to describe long-range correlation in DNA, and analyzed its distribution in 33 bacterial genomes. In most genomes (31 out of 33) the significance levels of the Hurst exponents of the essential genes were significantly higher than for the corresponding full-gene-set, whereas the significance levels of the Hurst exponents of the nonessential genes remained unchanged or increased only slightly. All of the Hurst exponents of essential genes followed a normal distribution, with one exception. We therefore propose that the distribution feature of Hurst exponents of essential genes can be used as a classification index for essential gene prediction in bacteria. For computer-aided design in the field of synthetic biology, this feature can build a restraint for pre- or post-design checking of bacterial essential genes. Moreover, considering the relationship between gene essentiality and evolution, the Hurst exponents could be used as a descriptive parameter related to evolutionary level, or be added to the annotation of each gene. PMID:26067107
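
    For readers unfamiliar with the Hurst exponent, one common way to estimate it is rescaled-range (R/S) analysis. The sketch below is an assumption-laden illustration and not the procedure used in the study: the purine/pyrimidine mapping to +1/-1, the window sizes, and all function names are hypothetical choices.

```python
import math
import random


def dna_to_walk(seq):
    """Map a DNA string to +1 (purine) / -1 (pyrimidine); an assumed encoding."""
    mapping = {"A": 1, "G": 1, "C": -1, "T": -1}
    return [mapping[base] for base in seq.upper() if base in mapping]


def rescaled_range(values):
    """Return R/S for one window, or None if the window has zero variance."""
    n = len(values)
    mean = sum(values) / n
    cum = lo = hi = sq = 0.0
    for v in values:
        d = v - mean
        cum += d              # cumulative deviation from the window mean
        lo = min(lo, cum)
        hi = max(hi, cum)
        sq += d * d
    sd = math.sqrt(sq / n)
    return (hi - lo) / sd if sd > 0 else None


def hurst_rs(series, min_window=16):
    """Estimate the Hurst exponent as the slope of log(R/S) versus log(window)."""
    xs, ys = [], []
    w = min_window
    while w <= len(series) // 2:
        rs_vals = []
        for start in range(0, len(series) - w + 1, w):
            rs = rescaled_range(series[start:start + w])
            if rs:
                rs_vals.append(rs)
        if rs_vals:
            xs.append(math.log(w))
            ys.append(math.log(sum(rs_vals) / len(rs_vals)))
        w *= 2
    if len(xs) < 2:
        raise ValueError("series too short for R/S analysis")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Ordinary least-squares slope of the log-log points.
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Synthetic, uncorrelated sequence used purely as a demonstration input.
random.seed(1)
demo_seq = "".join(random.choice("ACGT") for _ in range(4096))
h = hurst_rs(dna_to_walk(demo_seq))
```

    Values near 0.5 indicate no long-range correlation, while values above 0.5 indicate persistence; finite-length sequences tend to bias the R/S estimate somewhat upward, so comparisons are more meaningful than absolute values.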

  18. Large-Scale Bioinformatics Analysis of Bacillus Genomes Uncovers Conserved Roles of Natural Products in Bacterial Physiology.

    PubMed

    Grubbs, Kirk J; Bleich, Rachel M; Santa Maria, Kevin C; Allen, Scott E; Farag, Sherif; Shank, Elizabeth A; Bowers, Albert A

    2017-01-01

    Bacteria possess an amazing capacity to synthesize a diverse range of structurally complex, bioactive natural products known as specialized (or secondary) metabolites. Many of these specialized metabolites are used as clinical therapeutics, while others have important ecological roles in microbial communities. The biosynthetic gene clusters (BGCs) that generate these metabolites can be identified in bacterial genome sequences using their highly conserved genetic features. We analyzed an unprecedented 1,566 bacterial genomes from Bacillus species and identified nearly 20,000 BGCs. By comparing these BGCs to one another as well as a curated set of known specialized metabolite BGCs, we discovered that the majority of Bacillus natural products are comprised of a small set of highly conserved, well-distributed, known natural product compounds. Most of these metabolites have important roles influencing the physiology and development of Bacillus species. We identified, in addition to these characterized compounds, many unique, weakly conserved BGCs scattered across the genus that are predicted to encode unknown natural products. Many of these "singleton" BGCs appear to have been acquired via horizontal gene transfer. Based on this large-scale characterization of metabolite production in the Bacilli, we go on to connect the alkylpyrones, natural products that are highly conserved but previously biologically uncharacterized, to a role in Bacillus physiology: inhibiting spore development. IMPORTANCE Bacilli are capable of producing a diverse array of specialized metabolites, many of which have gained attention for their roles as signals that affect bacterial physiology and development. Up to this point, however, the Bacillus genus's metabolic capacity has been underexplored. We undertook a deep genomic analysis of 1,566 Bacillus genomes to understand the full spectrum of metabolites that this bacterial group can make. We discovered that the majority of the specialized

  19. Biological and immunological characterization of a simian rotavirus SA11 variant with an altered genome segment 4.

    PubMed

    Burns, J W; Chen, D; Estes, M K; Ramig, R F

    1989-04-01

    We have studied a variant virus isolated from a stock of SA11 virus (H. G. Pereira, R. S. Azeredo, A. M. Fialho, and M. N. P. Vidal, 1984, J. Gen. Virol. 65, 815-818). This virus, designated 4F, was initially identified by its faster electrophoretic mobility for genome segment 4. The variant was analyzed to determine if the altered electrophoretic mobility of genome segment 4 could be correlated with phenotypic changes. Comparison of our standard laboratory SA11 virus (clone 3) with the 4F variant showed the following: (i) The 4F variant possesses a viral hemagglutinin (VP4) with a higher apparent molecular weight than clone 3. (ii) The 4F variant produces large plaques when assayed in vitro, as compared to clone 3. (iii) The 4F variant produces plaques in the absence of proteolytic enzymes, whereas clone 3 does not. (iv) The 4F variant reacts with serotype-specific neutralizing monoclonal antibodies to VP7, but fails to react with several neutralizing anti-VP4 monoclonal antibodies generated to SA11 clone 3. (v) The 4F variant grows to a higher titer and is more stable than clone 3. (vi) The 4F variant produces a VP4 that appears to be more susceptible to cleavage by trypsin than is the VP4 of clone 3. Further analyses with the 4F variant may lead to an understanding of the molecular basis for these altered phenotypes that appear to be related, at least in part, to the product of genome segment 4.

  20. CRISPR-Cas: From the Bacterial Adaptive Immune System to a Versatile Tool for Genome Engineering.

    PubMed

    Kirchner, Marion; Schneider, Sabine

    2015-11-09

    The field of biology has been revolutionized by the recent advancement of an adaptive bacterial immune system as a universal genome engineering tool. Bacteria and archaea use repetitive genomic elements termed clustered regularly interspaced short palindromic repeats (CRISPR) in combination with an RNA-guided nuclease (CRISPR-associated nuclease: Cas) to target and destroy invading DNA. By choosing the appropriate sequence of the guide RNA, this two-component system can be used to efficiently modify, target, and edit genomic loci of interest in plants, insects, fungi, mammalian cells, and whole organisms. This has opened up new frontiers in genome engineering, including the potential to treat or cure human genetic disorders. Now the potential risks as well as the ethical, social, and legal implications of this powerful new technique move into the limelight. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. In situ structures of the segmented genome and RNA polymerase complex inside a dsRNA virus

    NASA Astrophysics Data System (ADS)

    Zhang, Xing; Ding, Ke; Yu, Xuekui; Chang, Winston; Sun, Jingchen; Hong Zhou, Z.

    2015-11-01

    Viruses in the Reoviridae, like the triple-shelled human rotavirus and the single-shelled insect cytoplasmic polyhedrosis virus (CPV), all package a genome of segmented double-stranded RNAs (dsRNAs) inside the viral capsid and carry out endogenous messenger RNA synthesis through a transcriptional enzyme complex (TEC). By direct electron-counting cryoelectron microscopy and asymmetric reconstruction, we have determined the organization of the dsRNA genome inside quiescent CPV (q-CPV) and the in situ atomic structures of TEC within CPV in both quiescent and transcribing (t-CPV) states. We show that the ten segmented dsRNAs in CPV are organized with ten TECs in a specific, non-symmetric manner, with each dsRNA segment attached directly to a TEC. The TEC consists of two extensively interacting subunits: an RNA-dependent RNA polymerase (RdRP) and an NTPase VP4. We find that the bracelet domain of RdRP undergoes marked conformational change when q-CPV is converted to t-CPV, leading to formation of the RNA template entry channel and access to the polymerase active site. An amino-terminal helix from each of two subunits of the capsid shell protein (CSP) interacts with VP4 and RdRP. These findings establish the link between sensing of environmental cues by the external proteins and activation of endogenous RNA transcription by the TEC inside the virus.

  2. An efficient and high fidelity method for amplification, cloning and sequencing of complete tospovirus genomic RNA segments

    USDA-ARS's Scientific Manuscript database

    Amplification and sequencing of the complete M- and S-RNA segments of Tomato spotted wilt virus and Impatiens necrotic spot virus as a single fragment is useful for whole genome sequencing of tospoviruses co-infecting a single host plant. It avoids issues associated with overlapping amplicon-based ...

  3. Metabolic Complementarity and Genomics of the Dual Bacterial Symbiosis of Sharpshooters

    PubMed Central

    Wu, Dongying; Daugherty, Sean C; Van Aken, Susan E; Pai, Grace H; Watkins, Kisha L; Khouri, Hoda; Tallon, Luke J; Zaborsky, Jennifer M; Dunbar, Helen E; Tran, Phat L; Moran, Nancy A

    2006-01-01

    Mutualistic intracellular symbiosis between bacteria and insects is a widespread phenomenon that has contributed to the global success of insects. The symbionts, by provisioning nutrients lacking from diets, allow various insects to occupy or dominate ecological niches that might otherwise be unavailable. One such insect is the glassy-winged sharpshooter (Homalodisca coagulata), which feeds on xylem fluid, a diet exceptionally poor in organic nutrients. Phylogenetic studies based on rRNA have shown two types of bacterial symbionts to be coevolving with sharpshooters: the gamma-proteobacterium Baumannia cicadellinicola and the Bacteroidetes species Sulcia muelleri. We report here the sequencing and analysis of the 686,192–base pair genome of B. cicadellinicola and approximately 150 kilobase pairs of the small genome of S. muelleri, both isolated from H. coagulata. Our study, which to our knowledge is the first genomic analysis of an obligate symbiosis involving multiple partners, suggests striking complementarity in the biosynthetic capabilities of the two symbionts: B. cicadellinicola devotes a substantial portion of its genome to the biosynthesis of vitamins and cofactors required by animals and lacks most amino acid biosynthetic pathways, whereas S. muelleri apparently produces most or all of the essential amino acids needed by its host. This finding, along with other results of our genome analysis, suggests the existence of metabolic codependency among the two unrelated endosymbionts and their insect host. This dual symbiosis provides a model case for studying correlated genome evolution and genome reduction involving multiple organisms in an intimate, obligate mutualistic relationship. In addition, our analysis provides insight for the first time into the differences in symbionts between insects (e.g., aphids) that feed on phloem versus those like H. coagulata that feed on xylem. Finally, the genomes of these two symbionts provide potential targets for

  4. 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

    DOE PAGES

    Mukherjee, Supratim; Seshadri, Rekha; Varghese, Neha J.; ...

    2017-06-12

    We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.

  5. 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mukherjee, Supratim; Seshadri, Rekha; Varghese, Neha J.

    We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.

  6. Draft Genome Sequence of Xanthomonas arboricola pv. pruni Strain Xap33, Causal Agent of Bacterial Spot Disease on Almond

    PubMed Central

    Garita-Cambronero, J.; Sena-Vélez, M.; Palacio-Bielsa, A.

    2014-01-01

    We report the annotated genome sequence of Xanthomonas arboricola pv. pruni strain Xap33, isolated from almond leaves showing bacterial spot disease symptoms in Spain. The availability of this genome sequence will aid our understanding of the infection mechanism of this bacterium as well as its relationship to other species of the same genus. PMID:24903863

  7. Construction of an infectious clone of canine herpesvirus genome as a bacterial artificial chromosome.

    PubMed

    Arii, Jun; Hushur, Orkash; Kato, Kentaro; Kawaguchi, Yasushi; Tohya, Yukinobu; Akashi, Hiroomi

    2006-04-01

    Canine herpesvirus (CHV) is an attractive candidate not only for use as a recombinant vaccine to protect dogs from a variety of canine pathogens but also as a viral vector for gene therapy in domestic animals. However, developments in this area have been impeded by the complicated techniques used for eukaryotic homologous recombination. To overcome these problems, we used bacterial artificial chromosomes (BACs) to generate infectious BACs. Our findings may be summarized as follows: (i) the CHV genome (pCHV/BAC), in which a BAC flanked by loxP sites was inserted into the thymidine kinase gene, was maintained in Escherichia coli; (ii) transfection of pCHV/BAC into A-72 cells resulted in the production of infectious virus; (iii) the BAC vector sequence was almost perfectly excisable from the genome of the reconstituted virus CHV/BAC by co-infection with CHV/BAC and a recombinant adenovirus that expressed the Cre recombinase; and (iv) a recombinant virus in which the glycoprotein C gene was deleted was generated by lambda recombination followed by Flp recombination, which resulted in a reduction in viral titer compared with that of the wild-type virus. The infectious clone pCHV/BAC is useful for the modification of the CHV genome using bacterial genetics, and CHV/BAC should have multiple applications in the rapid generation of genetically engineered CHV recombinants and the development of CHV vectors for vaccination and gene therapy in domestic animals.

  8. Genus-wide comparison of Pseudovibrio bacterial genomes reveal diverse adaptations to different marine invertebrate hosts.

    PubMed

    Alex, Anoop; Antunes, Agostinho

    2018-01-01

    Bacteria belonging to the genus Pseudovibrio have been frequently found in association with a wide variety of marine eukaryotic invertebrate hosts, indicative of their versatile and symbiotic lifestyle. A recent comparison of the sponge-associated Pseudovibrio genomes has shed light on the mechanisms influencing a successful symbiotic association with sponges. In contrast, the genomic architecture of Pseudovibrio bacteria associated with other marine hosts has received less attention. Here, we performed genus-wide comparative analyses of 18 Pseudovibrio isolated from sponges, coral, tunicates, flatworm, and seawater. The analyses revealed a certain degree of commonality among the majority of sponge- and coral-associated bacteria. Isolates from another group of marine invertebrate hosts, the tunicates, exhibited a genetic repertoire for cold adaptation and specific metabolic abilities, including mucin degradation in the Antarctic tunicate-associated bacterium Pseudovibrio sp. Tun.PHSC04_5.I4. Reductive genome evolution was simultaneously detected in the flatworm-associated bacteria and the sponge-associated bacterium P. axinellae AD2, through the loss of major secretion systems (type III/VI) and virulence/symbioses factors such as proteins involved in adhesion and attachment to the host. Our study also unraveled the presence of a CRISPR-Cas system in P. stylochi UST20140214-052, a flatworm-associated bacterium, possibly suggesting a role for the CRISPR-based adaptive immune system against invading virus particles. Detection of mobile elements and genomic islands (GIs) in all bacterial members highlighted the role of horizontal gene transfer for the acquisition of novel genetic features, likely enhancing the bacterial ecological fitness. These findings help clarify the role of genome diversity in Pseudovibrio as an evolutionary strategy to increase their colonizing success across a wide range of marine eukaryotic hosts.

  9. Genus-wide comparison of Pseudovibrio bacterial genomes reveal diverse adaptations to different marine invertebrate hosts

    PubMed Central

    Alex, Anoop

    2018-01-01

    Bacteria belonging to the genus Pseudovibrio have been frequently found in association with a wide variety of marine eukaryotic invertebrate hosts, indicative of their versatile and symbiotic lifestyle. A recent comparison of the sponge-associated Pseudovibrio genomes has shed light on the mechanisms influencing a successful symbiotic association with sponges. In contrast, the genomic architecture of Pseudovibrio bacteria associated with other marine hosts has received less attention. Here, we performed genus-wide comparative analyses of 18 Pseudovibrio isolated from sponges, coral, tunicates, flatworm, and seawater. The analyses revealed a certain degree of commonality among the majority of sponge- and coral-associated bacteria. Isolates from another group of marine invertebrate hosts, the tunicates, exhibited a genetic repertoire for cold adaptation and specific metabolic abilities, including mucin degradation in the Antarctic tunicate-associated bacterium Pseudovibrio sp. Tun.PHSC04_5.I4. Reductive genome evolution was simultaneously detected in the flatworm-associated bacteria and the sponge-associated bacterium P. axinellae AD2, through the loss of major secretion systems (type III/VI) and virulence/symbioses factors such as proteins involved in adhesion and attachment to the host. Our study also unraveled the presence of a CRISPR-Cas system in P. stylochi UST20140214-052, a flatworm-associated bacterium, possibly suggesting a role for the CRISPR-based adaptive immune system against invading virus particles. Detection of mobile elements and genomic islands (GIs) in all bacterial members highlighted the role of horizontal gene transfer for the acquisition of novel genetic features, likely enhancing the bacterial ecological fitness. These findings help clarify the role of genome diversity in Pseudovibrio as an evolutionary strategy to increase their colonizing success across a wide range of marine eukaryotic hosts. PMID:29775460

  10. A Year of Infection in the Intensive Care Unit: Prospective Whole Genome Sequencing of Bacterial Clinical Isolates Reveals Cryptic Transmissions and Novel Microbiota

    PubMed Central

    Roach, David J.; Burton, Joshua N.; Lee, Choli; Stackhouse, Bethany; Butler-Wu, Susan M.; Cookson, Brad T.

    2015-01-01

    Bacterial whole genome sequencing holds promise as a disruptive technology in clinical microbiology, but it has not yet been applied systematically or comprehensively within a clinical context. Here, over the course of one year, we performed prospective collection and whole genome sequencing of nearly all bacterial isolates obtained from a tertiary care hospital’s intensive care units (ICUs). This unbiased collection of 1,229 bacterial genomes from 391 patients enables detailed exploration of several features of clinical pathogens. A sizable fraction of isolates identified as clinically relevant corresponded to previously undescribed species: 12% of isolates assigned a species-level classification by conventional methods actually qualified as distinct, novel genomospecies on the basis of genomic similarity. Pan-genome analysis of the most frequently encountered pathogens in the collection revealed substantial variation in pan-genome size (1,420 to 20,432 genes) and the rate of gene discovery (1 to 152 genes per isolate sequenced). Surprisingly, although potential nosocomial transmission of actively surveilled pathogens was rare, 8.7% of isolates belonged to genomically related clonal lineages that were present among multiple patients, usually with overlapping hospital admissions, and were associated with clinically significant infection in 62% of patients from which they were recovered. Multi-patient clonal lineages were particularly evident in the neonatal care unit, where seven separate Staphylococcus epidermidis clonal lineages were identified, including one lineage associated with bacteremia in 5/9 neonates. Our study highlights key differences in the information made available by conventional microbiological practices versus whole genome sequencing, and motivates the further integration of microbial genome sequencing into routine clinical care. PMID:26230489

  11. Conserved gene clusters in bacterial genomes provide further support for the primacy of RNA

    NASA Technical Reports Server (NTRS)

    Siefert, J. L.; Martin, K. A.; Abdi, F.; Widger, W. R.; Fox, G. E.

    1997-01-01

    Five complete bacterial genome sequences have been released to the scientific community. These include four (eu)Bacteria, Haemophilus influenzae, Mycoplasma genitalium, M. pneumoniae, and Synechocystis PCC 6803, as well as one Archaeon, Methanococcus jannaschii. Features of organization shared by these genomes are likely to have arisen very early in the history of the bacteria and thus can be expected to provide further insight into the nature of early ancestors. Results of a genome comparison of these five organisms confirm earlier observations that gene order is remarkably unpreserved. There are, nevertheless, at least 16 clusters of two or more genes whose order remains the same among the four (eu)Bacteria and these are presumed to reflect conserved elements of coordinated gene expression that require gene proximity. Eight of these gene orders are essentially conserved in the Archaea as well. Many of these clusters are known to be regulated by RNA-level mechanisms in Escherichia coli, which supports the earlier suggestion that this type of regulation of gene expression may have arisen very early. We conclude that although the last common ancestor may have had a DNA genome, it likely was preceded by progenotes with an RNA genome.

  12. Large scale genomic analysis shows no evidence for pathogen adaptation between the blood and cerebrospinal fluid niches during bacterial meningitis

    PubMed Central

    Lees, John A.; Kremer, Philip H. C.; Manso, Ana S.; Croucher, Nicholas J.; Ferwerda, Bart; Serón, Mercedes Valls; Oggioni, Marco R.; Parkhill, Julian; Brouwer, Matthijs C.; van der Ende, Arie; van de Beek, Diederik

    2017-01-01

    Recent studies have provided evidence for rapid pathogen genome diversification, some of which could potentially affect the course of disease. We have previously described such variation seen between isolates infecting the blood and cerebrospinal fluid (CSF) of a single patient during a case of bacterial meningitis. Here, we performed whole-genome sequencing of paired isolates from the blood and CSF of 869 meningitis patients to determine whether such variation frequently occurs between these two niches in cases of bacterial meningitis. Using a combination of reference-free variant calling approaches, we show that no genetic adaptation occurs in either invaded niche during bacterial meningitis for two major pathogen species, Streptococcus pneumoniae and Neisseria meningitidis. This study therefore shows that the bacteria capable of causing meningitis are already able to do this upon entering the blood, and no further sequence change is necessary to cross the blood–brain barrier. Our findings place the focus back on bacterial evolution between nasopharyngeal carriage and invasion, or diversity of the host, as likely mechanisms for determining invasiveness. PMID:28348877

  13. Nonviral Genome Editing Based on a Polymer-Derivatized CRISPR Nanocomplex for Targeting Bacterial Pathogens and Antibiotic Resistance.

    PubMed

    Kang, Yoo Kyung; Kwon, Kyu; Ryu, Jea Sung; Lee, Ha Neul; Park, Chankyu; Chung, Hyun Jung

    2017-04-19

    The overuse of antibiotics plays a major role in the emergence and spread of multidrug-resistant bacteria. A molecularly targeted, specific treatment method for bacterial pathogens can prevent this problem by reducing the selective pressure during microbial growth. Herein, we introduce a nonviral treatment strategy delivering genome editing material for targeting antibacterial resistance. We apply the CRISPR-Cas9 system, which has been recognized as an innovative tool for highly specific and efficient genome engineering in different organisms, as the delivery cargo. We utilize polymer-derivatized Cas9, by direct covalent modification of the protein with cationic polymer, for subsequent complexation with single-guide RNA targeting antibiotic resistance. We show that nanosized CRISPR complexes (= Cr-Nanocomplex) were successfully formed, while maintaining the functional activity of Cas9 endonuclease to induce double-strand DNA cleavage. We also demonstrate that the Cr-Nanocomplex designed to target mecA, the major gene involved in methicillin resistance, can be efficiently delivered into methicillin-resistant Staphylococcus aureus (MRSA) and allows editing of the bacterial genome with much higher efficiency than native Cas9 complexes or conventional lipid-based formulations. The present study shows for the first time that a covalently modified CRISPR system allows nonviral, therapeutic genome editing, and can potentially be applied as a target-specific antimicrobial.

  14. The Statistical Segment Length of DNA: Opportunities for Biomechanical Modeling in Polymer Physics and Next-Generation Genomics.

    PubMed

    Dorfman, Kevin D

    2018-02-01

    The development of bright bisintercalating dyes for deoxyribonucleic acid (DNA) in the 1990s, most notably YOYO-1, revolutionized the field of polymer physics in the ensuing years. These dyes, in conjunction with modern molecular biology techniques, permit the facile observation of polymer dynamics via fluorescence microscopy and thus direct tests of different theories of polymer dynamics. At the same time, they have played a key role in advancing an emerging next-generation method known as genome mapping in nanochannels. The effect of intercalation on the bending energy of DNA as embodied by a change in its statistical segment length (or, alternatively, its persistence length) has been the subject of significant controversy. The precise value of the statistical segment length is critical for the proper interpretation of polymer physics experiments and controls the phenomena underlying the aforementioned genomics technology. In this perspective, we briefly review the model of DNA as a wormlike chain and a trio of methods (light scattering, optical or magnetic tweezers, and atomic force microscopy (AFM)) that have been used to determine the statistical segment length of DNA. We then outline the disagreement in the literature over the role of bisintercalation on the bending energy of DNA, and how a multiscale biomechanical approach could provide an important model for this scientifically and technologically relevant problem.

  15. Characterization of the complete genome segments from BmCPV-SZ, a novel Bombyx mori cypovirus 1 isolate.

    PubMed

    Cao, Guangli; Meng, Xiangkun; Xue, Renyu; Zhu, Yuexiong; Zhang, Xiaorong; Pan, Zhonghua; Zheng, Xiaojian; Gong, Chengliang

    2012-07-01

    A novel Bombyx mori cypovirus 1 was isolated from infected silkworm larvae and tentatively designated Bombyx mori cypovirus 1 isolate Suzhou (BmCPV-SZ). The complete nucleotide sequences of genomic segments S1-S10 from BmCPV-SZ were determined. All segments possessed a single open reading frame; however, bioinformatic evidence suggested a short overlapping coding sequence in S1. Each BmCPV-SZ segment possessed the conserved terminal sequences AGUAA and GUUAGCC at the 5' and 3' ends, respectively. The conserved A/G at the -3 position in relation to the AUG codon could be found in the BmCPV-SZ genome, and it was postulated that this conserved A/G may be the most important nucleotide for efficient translation initiation in cypoviruses (CPVs). Examination of the putative amino acid sequences encoded by BmCPV-SZ revealed some characteristic motifs. Homology searches showed that viral structural proteins VP1, VP3, and VP4 had localized homologies with proteins of Rice ragged stunt virus, a member of the genus Oryzavirus within the family Reoviridae. A phylogenetic tree based on RNA-dependent RNA polymerase sequences demonstrated that CPV is more closely related to Rice ragged stunt virus and Aedes pseudoscutellaris reovirus than to other members of Reoviridae, suggesting that they may have originated from common ancestors.

  16. Identification of Novel Genomic Islands in Liverpool Epidemic Strain of Pseudomonas aeruginosa Using Segmentation and Clustering

    PubMed Central

    Jani, Mehul; Mathee, Kalai; Azad, Rajeev K.

    2016-01-01

    Pseudomonas aeruginosa is an opportunistic pathogen implicated in a myriad of infections and a leading pathogen responsible for mortality in patients with cystic fibrosis (CF). Horizontal transfers of genes among the microorganisms living within CF patients have led to highly virulent and multi-drug resistant strains such as the Liverpool epidemic strain of P. aeruginosa, namely the LESB58 strain that has the propensity to acquire virulence and antibiotic resistance genes. Often these genes are acquired in large clusters, referred to as “genomic islands (GIs).” To decipher GIs and understand their contributions to the evolution of virulence and antibiotic resistance in P. aeruginosa LESB58, we utilized a recursive segmentation and clustering procedure, presented here as a genome-mining tool, “GEMINI.” GEMINI was validated on experimentally verified islands in the LESB58 strain before examining its potential to decipher novel islands. Of the 6062 genes in P. aeruginosa LESB58, 596 genes were identified to be resident on 20 GIs of which 12 have not been previously reported. Comparative genomics provided evidence in support of our novel predictions. Furthermore, GEMINI unraveled the mosaic structure of islands that are composed of segments of likely different evolutionary origins, and demonstrated its ability to identify potential strain biomarkers. These newly found islands likely have contributed to the hyper-virulence and multidrug resistance of the Liverpool epidemic strain of P. aeruginosa. PMID:27536294

  17. Pan genome and CRISPR analyses of the bacterial fish pathogen Moritella viscosa.

    PubMed

    Karlsen, Christian; Hjerde, Erik; Klemetsen, Terje; Willassen, Nils Peder

    2017-04-20

    Winter-ulcer Moritella viscosa infections continue to be a significant burden in Atlantic salmon (Salmo salar L.) farming. M. viscosa comprises two main clusters that differ in genetic variation and phenotypes including virulence. Horizontal gene transfer through acquisition and loss of mobile genetic elements (MGEs) is a major driving force of bacterial diversification. To gain insight into genomic traits that could affect sublineage evolution within this bacterium we examined the genome sequences of twelve M. viscosa strains. Matches between M. viscosa clustered regularly interspaced short palindromic repeats and associated cas genes (CRISPR-Cas) were analysed to correlate CRISPR-Cas with adaptive immunity against MGEs. The comparative genomic analysis of M. viscosa isolates from across the North Atlantic region and from different fish species supports delineation of M. viscosa into four phylogenetic lineages. The results showed that M. viscosa carries two distinct variants of the CRISPR-Cas subtype I-F systems and that CRISPR features follow the phylogenetic lineages. A subset of the spacer content matches prophage and plasmid genes dispersed among the M. viscosa strains. Further analysis revealed that prophage and plasmid-like element distribution was reflected in the content of the CRISPR-spacer profiles. Our data suggest that CRISPR-Cas mediated interactions with MGEs impact genome properties among M. viscosa, and that patterns in spacer and MGE distributions are linked to strain relationships.
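    The idea of matching CRISPR spacers against prophage and plasmid sequences can be sketched as below. This is an illustration only, assuming exact matches on either strand; the abstract does not say which search tool or mismatch tolerance was used, and all names here are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing A/C/G/T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def spacer_hits(spacers: dict[str, str],
                targets: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (spacer_id, target_id, strand) for every exact spacer match.

    Assumption: exact substring matching; real analyses usually allow
    a few mismatches via an alignment tool.
    """
    hits = []
    for sid, spacer in spacers.items():
        fwd = spacer.upper()
        rev = reverse_complement(fwd)
        for tid, target in targets.items():
            t = target.upper()
            if fwd in t:
                hits.append((sid, tid, "+"))
            if rev in t:
                hits.append((sid, tid, "-"))
    return hits
```

    Grouping the returned hits by strain would give a per-strain spacer profile that could then be compared with the distribution of mobile elements.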

  18. Whole-genome sequencing in bacteriology: state of the art

    PubMed Central

    Dark, Michael J

    2013-01-01

    Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, as well as highlighting recent studies that demonstrate new applications for bacterial genomics. PMID:24143115

  19. MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

    PubMed Central

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Rajakumar, Kumar

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’. ArrayOme permits recognition of discordances between physical genome and MVG sizes, thereby enabling identification of strains rich in microarray-elusive novel genes. Individual tRNAcc tools facilitate automated identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites and other integration hotspots in closely related sequenced genomes. Accessory tools facilitate design of hotspot-flanking primers for in silico and/or wet-science-based interrogation of cognate loci in unsequenced strains and analysis of islands for features suggestive of foreign origins; island-specific and genome-contextual features are tabulated and represented in schematic and graphical forms. To date we have used MobilomeFINDER to analyse several Enterobacteriaceae, Pseudomonas aeruginosa and Streptococcus suis genomes. MobilomeFINDER enables high-throughput island identification and characterization through increased exploitation of emerging sequence data and PCR-based profiling of unsequenced test strains; subsequent targeted yeast recombination-based capture permits full-length sequencing and detailed functional studies of novel genomic islands. PMID:17537813

  20. The most conserved genome segments for life detection on Earth and other planets.

    PubMed

    Isenbarger, Thomas A; Carr, Christopher E; Johnson, Sarah Stewart; Finney, Michael; Church, George M; Gilbert, Walter; Zuber, Maria T; Ruvkun, Gary

    2008-12-01

    On Earth, very simple but powerful methods to detect and classify broad taxa of life by the polymerase chain reaction (PCR) are now standard practice. Using DNA primers corresponding to the 16S ribosomal RNA gene, one can survey a sample from any environment for its microbial inhabitants. Due to massive meteoritic exchange between Earth and Mars (as well as other planets), a reasonable case can be made for life on Mars or other planets to be related to life on Earth. In this case, the supremely sensitive technologies used to study life on Earth, including in extreme environments, can be applied to the search for life on other planets. Though the 16S gene has become the standard for life detection on Earth, no genome comparisons have established that the ribosomal genes are, in fact, the most conserved DNA segments across the kingdoms of life. We present here a computational comparison of full genomes from 13 diverse organisms from the Archaea, Bacteria, and Eucarya to identify genetic sequences conserved across the widest divisions of life. Our results identify the 16S and 23S ribosomal RNA genes as well as other universally conserved nucleotide sequences in genes encoding particular classes of transfer RNAs and within the nucleotide binding domains of ABC transporters as the most conserved DNA sequence segments across phylogeny. This set of sequences defines a core set of DNA regions that have changed the least over billions of years of evolution and provides a means to identify and classify divergent life, including ancestrally related life on other planets.

  1. Endozoicomonas genomes reveal functional adaptation and plasticity in bacterial strains symbiotically associated with diverse marine hosts

    PubMed Central

    Neave, Matthew J.; Michell, Craig T.; Apprill, Amy; Voolstra, Christian R.

    2017-01-01

    Endozoicomonas bacteria are globally distributed and often abundantly associated with diverse marine hosts including reef-building corals, yet their function remains unknown. In this study we generated novel Endozoicomonas genomes from single cells and metagenomes obtained directly from the corals Stylophora pistillata, Pocillopora verrucosa, and Acropora humilis. We then compared these culture-independent genomes to existing genomes of bacterial isolates acquired from a sponge, sea slug, and coral to examine the functional landscape of this enigmatic genus. Sequencing and analysis of single cells and metagenomes resulted in four novel genomes with 60–76% and 81–90% genome completeness, respectively. These data also confirmed that Endozoicomonas genomes are large and are not streamlined for an obligate endosymbiotic lifestyle, implying that they have free-living stages. All genomes show an enrichment of genes associated with carbon sugar transport and utilization and protein secretion, potentially indicating that Endozoicomonas contribute to the cycling of carbohydrates and the provision of proteins to their respective hosts. Importantly, besides these commonalities, the genomes showed evidence for differential functional specificity and diversification, including genes for the production of amino acids. Given this metabolic diversity of Endozoicomonas we propose that different genotypes play disparate roles and have diversified in concert with their hosts. PMID:28094347

  2. Bacterial genome replication at subzero temperatures in permafrost

    PubMed Central

    Tuorto, Steven J; Darias, Phillip; McGuinness, Lora R; Panikov, Nicolai; Zhang, Tingjun; Häggblom, Max M; Kerkhof, Lee J

    2014-01-01

    Microbial metabolic activity occurs at subzero temperatures in permafrost, an environment representing ∼25% of the global soil organic matter. Although much of the observed subzero microbial activity may be due to basal metabolism or macromolecular repair, there is also ample evidence for cellular growth. Unfortunately, most metabolic measurements or culture-based laboratory experiments cannot elucidate the specific microorganisms responsible for metabolic activities in native permafrost, nor can bulk approaches determine whether different members of the microbial community modulate their responses as a function of changing subzero temperatures. Here, we report on the use of stable isotope probing with 13C-acetate to demonstrate bacterial genome replication in Alaskan permafrost at temperatures of 0 to −20 °C. We found that the majority (80%) of operational taxonomic units detected in permafrost microcosms were active and could synthesize 13C-labeled DNA when supplemented with 13C-acetate at temperatures of 0 to −20 °C during a 6-month incubation. The data indicated that some members of the bacterial community were active across all of the experimental temperatures, whereas many others only synthesized DNA within a narrow subzero temperature range. Phylogenetic analysis of 13C-labeled 16S rRNA genes revealed that the subzero active bacteria were members of the Acidobacteria, Actinobacteria, Chloroflexi, Gemmatimonadetes and Proteobacteria phyla and were distantly related to currently cultivated psychrophiles. These results imply that small subzero temperature changes may lead to changes in the active microbial community, which could have consequences for biogeochemical cycling in permanently frozen systems. PMID:23985750

  3. Rewriting the blueprint of life by synthetic genomics and genome engineering.

    PubMed

    Annaluru, Narayana; Ramalingam, Sivaprakash; Chandrasegaran, Srinivasan

    2015-06-16

    Advances in DNA synthesis and assembly methods over the past decade have made it possible to construct genome-size fragments from oligonucleotides. Early work focused on synthesis of small viral genomes, followed by hierarchical synthesis of wild-type bacterial genomes and subsequently on transplantation of synthesized bacterial genomes into closely related recipient strains. More recently, a synthetic designer version of yeast Saccharomyces cerevisiae chromosome III has been generated, with numerous changes from the wild-type sequence without having an impact on cell fitness and phenotype, suggesting plasticity of the yeast genome. A project to generate the first synthetic yeast genome--the Sc2.0 Project--is currently underway.

  4. The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

    PubMed

    Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

    2011-01-01

    The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.

  5. Non-Enzymatic Detection of Bacterial Genomic DNA Using the Bio-Barcode Assay

    PubMed Central

    Hill, Haley D.; Vega, Rafael A.; Mirkin, Chad A.

    2011-01-01

    The detection of bacterial genomic DNA through a non-enzymatic nanomaterials based amplification method, the bio-barcode assay, is reported. The assay utilizes oligonucleotide functionalized magnetic microparticles to capture the target of interest from the sample. A critical step in the new assay involves the use of blocking oligonucleotides during heat denaturation of the double stranded DNA. These blockers bind to specific regions of the target DNA upon cooling, and prevent the duplex DNA from re-hybridizing, which allows the particle probes to bind. Following target isolation using the magnetic particles, oligonucleotide functionalized gold nanoparticles act as target recognition agents. The oligonucleotides on the nanoparticle (barcodes) act as amplification surrogates. The barcodes are then detected using the Scanometric method. The limit of detection for this assay was determined to be 2.5 femtomolar, and this is the first demonstration of a barcode type assay for the detection of double stranded, genomic DNA. PMID:17927207

  6. PhyloFlu, a DNA microarray for determining the phylogenetic origin of influenza A virus gene segments and the genomic fingerprint of viral strains.

    PubMed

    Paulin, Luis F; de los D Soto-Del Río, María; Sánchez, Iván; Hernández, Jesús; Gutiérrez-Ríos, Rosa M; López-Martínez, Irma; Wong-Chew, Rosa M; Parissi-Crivelli, Aurora; Isa, P; López, Susana; Arias, Carlos F

    2014-03-01

    Recent evidence suggests that most influenza A virus gene segments can contribute to the pathogenicity of the virus. In this regard, the hemagglutinin (HA) subtype of the circulating strains has been closely surveyed, but the reassortment of internal gene segments is usually not monitored as a potential source of an increased pathogenicity. In this work, an oligonucleotide DNA microarray (PhyloFlu) designed to determine the phylogenetic origins of the eight segments of the influenza virus genome was constructed and validated. Clades were defined for each segment and also for the 16 HA and 9 neuraminidase (NA) subtypes. Viral genetic material was amplified by reverse transcription-PCR (RT-PCR) with primers specific to the conserved 5' and 3' ends of the influenza A virus genes, followed by PCR amplification with random primers and Cy3 labeling. The microarray unambiguously determined the clades for all eight influenza virus genes in 74% (28/38) of the samples. The microarray was validated with reference strains from different animal origins, as well as from human, swine, and avian viruses from field or clinical samples. In most cases, the phylogenetic clade of each segment defined its animal host of origin. The genomic fingerprint deduced by the combined information of the individual clades allowed for the determination of the time and place that strains with the same genomic pattern were previously reported. PhyloFlu is useful for characterizing and surveying the genetic diversity and variation of animal viruses circulating in different environmental niches and for obtaining a more detailed surveillance and follow up of reassortant events that can potentially modify virus pathogenicity.

  7. Phylogenetic and Protein Sequence Analysis of Bacterial Chemoreceptors.

    PubMed

    Ortega, Davi R; Zhulin, Igor B

    2018-01-01

    Identifying chemoreceptors in sequenced bacterial genomes, revealing their domain architecture, inferring their evolutionary relationships, and comparing them to chemoreceptors of known function become important steps in genome annotation and chemotaxis research. Here, we describe bioinformatics procedures that enable such analyses, using two closely related bacterial genomes as examples.

  8. Relations between Shannon entropy and genome order index in segmenting DNA sequences.

    PubMed

    Zhang, Yi

    2009-04-01

    Shannon entropy H and genome order index S are used in segmenting DNA sequences. Zhang [Phys. Rev. E 72, 041917 (2005)] found that the two schemes are equivalent when a DNA sequence is converted to a binary sequence of S (strong H bond) and W (weak H bond). They left the mathematical proof to mathematicians who are interested in this issue. In this paper, a possible mathematical explanation is given. Moreover, we find that Chargaff parity rule 2 is the necessary condition of the equivalence, and the equivalence disappears when a DNA sequence is regarded as a four-symbol sequence. At last, we propose that S - 2^(-H) may be related to species evolution.
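    For readers who want to see the two quantities side by side, here is a minimal Python sketch for the binary S/W case. It assumes the order index is the sum of squared symbol frequencies, which the abstract does not spell out; the function names are hypothetical. In the binary case both quantities depend only on the single frequency p of strong bases.

```python
import math

STRONG = set("GC")  # S symbols: bases with strong hydrogen bonding (G, C)


def sw_fraction(seq: str) -> float:
    """Fraction of strong (G/C) bases among the A/C/G/T bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(b in STRONG for b in bases) / len(bases)


def shannon_entropy(p: float) -> float:
    """Binary Shannon entropy H in bits for symbol frequencies p and 1 - p."""
    return -sum(x * math.log2(x) for x in (p, 1.0 - p) if x > 0.0)


def order_index(p: float) -> float:
    """Order index S as the sum of squared frequencies (assumed definition)."""
    return p * p + (1.0 - p) ** 2
```

    With p = 0.5 this gives H = 1 bit and S = 0.5, so the difference between S and 2 to the power -H is zero for a perfectly balanced binary sequence.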

  9. Elucidating the role of transcription in shaping the 3D structure of the bacterial genome

    NASA Astrophysics Data System (ADS)

    Brandao, Hugo B.; Wang, Xindan; Rudner, David Z.; Mirny, Leonid

    Active transcription has been linked to several genome conformation changes in bacteria, including the recruitment of chromosomal DNA to the cell membrane and formation of nucleoid clusters. Using genomic and imaging data as input into mathematical models and polymer simulations, we sought to explore the extent to which bacterial 3D genome structure could be explained by 1D transcription tracks. Using B. subtilis as a model organism, we investigated via polymer simulations the role of loop extrusion and DNA super-coiling on the formation of interaction domains and other fine-scale features that are visible in chromosome conformation capture (Hi-C) data. We then explored the role of the condensin structural maintenance of chromosome complex on the alignment of chromosomal arms. A parameter-free transcription traffic model demonstrated that mean chromosomal arm alignment can be quantitatively explained, and the effects on arm alignment in genomically rearranged strains of B. subtilis were accurately predicted. H.B. acknowledges support from the Natural Sciences and Engineering Research Council of Canada for a PGS-D fellowship.

  10. Complete Genome Sequence of a Putative New Bacterial Strain, I507, Isolated from the Indian Ocean

    PubMed Central

    Wang, Shu-yan; Wei, Jia-qiang

    2018-01-01

    ABSTRACT Bacterial strain I507 was isolated from the central Indian Ocean and may be a potential novel species, according to the 16S rRNA gene sequence. Here, we present its complete genome sequence and expect that it will provide researchers with valuable information to further understand its classification and function in the future. PMID:29674539

  11. Strategies used for genetically modifying bacterial genome: site-directed mutagenesis, gene inactivation, and gene over-expression

    PubMed Central

    Xu, Jian-zhong; Zhang, Wei-guo

    2016-01-01

    With the availability of the whole genome sequence of Escherichia coli or Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineering bacteria according to requirement. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation linear DNA or recombinant suicide plasmid, allow a wide variety of DNA manipulation. However, the over-expression of the desired gene is generally executed via plasmid-mediation. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integrating into genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010

  12. Genomics of Bacterial and Archaeal Viruses: Dynamics within the Prokaryotic Virosphere

    PubMed Central

    Krupovic, Mart; Prangishvili, David; Hendrix, Roger W.; Bamford, Dennis H.

    2011-01-01

    Summary: Prokaryotes, bacteria and archaea, are the most abundant cellular organisms among those sharing the planet Earth with human beings (among others). However, numerous ecological studies have revealed that it is actually prokaryotic viruses that predominate on our planet and outnumber their hosts by at least an order of magnitude. An understanding of how this viral domain is organized and what are the mechanisms governing its evolution is therefore of great interest and importance. The vast majority of characterized prokaryotic viruses belong to the order Caudovirales, double-stranded DNA (dsDNA) bacteriophages with tails. Consequently, these viruses have been studied (and reviewed) extensively from both genomic and functional perspectives. However, albeit numerous, tailed phages represent only a minor fraction of the prokaryotic virus diversity. Therefore, the knowledge which has been generated for this viral system does not offer a comprehensive view of the prokaryotic virosphere. In this review, we discuss all families of bacterial and archaeal viruses that contain more than one characterized member and for which evolutionary conclusions can be attempted by use of comparative genomic analysis. We focus on the molecular mechanisms of their genome evolution as well as on the relationships between different viral groups and plasmids. It becomes clear that evolutionary mechanisms shaping the genomes of prokaryotic viruses vary between different families and depend on the type of the nucleic acid, characteristics of the virion structure, as well as the mode of the life cycle. We also point out that horizontal gene transfer is not equally prevalent in different virus families and is not uniformly unrestricted for diverse viral functions. PMID:22126996

  13. Revealing the Bacterial Butyrate Synthesis Pathways by Analyzing (Meta)genomic Data

    PubMed Central

    Vital, Marius; Howe, Adina Chuang

    2014-01-01

    ABSTRACT Butyrate-producing bacteria have recently gained attention, since they are important for a healthy colon and when altered contribute to emerging diseases, such as ulcerative colitis and type II diabetes. This guild is polyphyletic and cannot be accurately detected by 16S rRNA gene sequencing. Consequently, approaches targeting the terminal genes of the main butyrate-producing pathway have been developed. However, since additional pathways exist and alternative, newly recognized enzymes catalyzing the terminal reaction have been described, previous investigations are often incomplete. We undertook a broad analysis of butyrate-producing pathways and individual genes by screening 3,184 sequenced bacterial genomes from the Integrated Microbial Genome database. Genomes of 225 bacteria with a potential to produce butyrate were identified, including many previously unknown candidates. The majority of candidates belong to distinct families within the Firmicutes, but members of nine other phyla, especially from Actinobacteria, Bacteroidetes, Fusobacteria, Proteobacteria, Spirochaetes, and Thermotogae, were also identified as potential butyrate producers. The established gene catalogue (3,055 entries) was used to screen for butyrate synthesis pathways in 15 metagenomes derived from stool samples of healthy individuals provided by the HMP (Human Microbiome Project) consortium. A high percentage of total genomes exhibited a butyrate-producing pathway (mean, 19.1%; range, 3.2% to 39.4%), where the acetyl-coenzyme A (CoA) pathway was the most prevalent (mean, 79.7% of all pathways), followed by the lysine pathway (mean, 11.2%). Diversity analysis for the acetyl-CoA pathway showed that the same few firmicute groups associated with several Lachnospiraceae and Ruminococcaceae were dominating in most individuals, whereas the other pathways were associated primarily with Bacteroidetes. PMID:24757212

  14. Weighted ssGBLUP improves genomic selection accuracy for bacterial cold water disease resistance in a rainbow trout population

    USDA-ARS's Scientific Manuscript database

    The objective of this study was to compare methods for genomic evaluation in a Rainbow Trout (Oncorhynchus mykiss) population for survival when challenged by Flavobacterium psychrophilum, the causative agent of bacterial cold water disease (BCWD). The used methods were: 1)regular ssGBLUP that assume...

  15. Position-based scanning for comparative genomics and identification of genetic islands in Haemophilus influenzae type b.

    PubMed

    Bergman, Nicholas H; Akerley, Brian J

    2003-03-01

    Bacteria exhibit extensive genetic heterogeneity within species. In many cases, these differences account for virulence properties unique to specific strains. Several such loci have been discovered in the genome of the type b serotype of Haemophilus influenzae, a human pathogen able to cause meningitis, pneumonia, and septicemia. Here we report application of a PCR-based scanning procedure to compare the genome of a virulent type b (Hib) strain with that of the laboratory-passaged Rd KW20 strain for which a complete genome sequence is available. We have identified seven DNA segments or H. influenzae genetic islands (HiGIs) present in the type b genome and absent from the Rd genome. These segments vary in size and content and show signs of horizontal gene transfer in that their percent G+C content differs from that of the rest of the H. influenzae genome, they contain genes similar to those found on phages or other mobile elements, or they are flanked by DNA repeats. Several of these loci represent potential pathogenicity islands, because they contain genes likely to mediate interactions with the host. These newly identified genetic islands provide areas of investigation into both the evolution and pathogenesis of H. influenzae. In addition, the genome scanning approach developed to identify these islands provides a rapid means to compare the genomes of phenotypically diverse bacterial strains once the genome sequence of one representative strain has been determined.
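    One of the horizontal-transfer signals mentioned in this abstract, a percent G+C content that differs from the rest of the genome, can be screened for with a sliding window. The sketch below is not the PCR-based scanning procedure of the paper; the window size, step and threshold are illustrative assumptions and the function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the A/C/G/T bases of seq (0.0 if none)."""
    seq = seq.upper()
    counted = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / counted if counted else 0.0


def atypical_gc_windows(genome: str, window: int = 5000, step: int = 2500,
                        threshold: float = 0.05):
    """Yield (start, end, gc) for windows with atypical G+C content.

    A window is reported when its G+C fraction differs from the genome-wide
    fraction by more than `threshold` (all defaults are illustrative).
    """
    mean_gc = gc_fraction(genome)
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        end = min(start + window, len(genome))
        gc = gc_fraction(genome[start:end])
        if abs(gc - mean_gc) > threshold:
            yield start, end, gc
```

    Windows flagged this way are only candidates; the abstract also lists phage-like genes and flanking DNA repeats as supporting evidence.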

  16. The layout of a bacterial genome.

    PubMed

    Képès, François; Jester, Brian C; Lepage, Thibaut; Rafiei, Nafiseh; Rosu, Bianca; Junier, Ivan

    2012-07-16

    Recently the mismatch between our newly acquired capacity to synthetize DNA at genome scale, and our low capacity to design ab initio a functional genome has become conspicuous. This essay gathers a variety of constraints that globally shape natural genomes, with a focus on eubacteria. These constraints originate from chromosome replication (leading/lagging strand asymmetry; gene dosage gradient from origin to terminus; collisions with the transcription complexes), from biased codon usage, from noise control in gene expression, and from genome layout for co-functional genes. On the basis of this analysis, lessons are drawn for full genome design. Copyright © 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  17. Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

    PubMed

    Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

    2017-03-27

    Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome

  18. Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

    PubMed Central

    Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

    1985-01-01

    The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopox viruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A, T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815

  19. ABCdb: an online resource for ABC transporter repertories from sequenced archaeal and bacterial genomes.

    PubMed

    Fichant, Gwennaele; Basse, Marie-Jeanne; Quentin, Yves

    2006-03-01

    The ATP-binding cassette (ABC) transporters are one of the major classes of active transporters. They are widespread in archaea, bacteria, and eukaryota, indicating that they have arisen early in evolution. They are involved in many essential physiological processes, but the majority import or export a wide variety of compounds across cellular membranes. These systems share a common architecture composed of four (exporters) or five (importers) domains. To identify and reconstruct functional ABC transporters encoded by archaeal and bacterial genomes, we have developed a bioinformatic strategy. Cross-reference to the transport classification system is used to predict the type of compound transported. A high quality of annotation is achieved by manual verification of the predictions. However, in order to face the rapid increase in the number of published genomes, we also include analyses of genomes issuing directly from the automated strategy. Querying the database (http://www-abcdb.biotoul.fr) allows easy retrieval of ABC transporter repertories and related data. Additional query tools have been developed for the analysis of the ABC family from both functional and evolutionary perspectives.

  20. Genome-Centric Analysis of a Thermophilic and Cellulolytic Bacterial Consortium Derived from Composting

    PubMed Central

    Lemos, Leandro N.; Pereira, Roberta V.; Quaggio, Ronaldo B.; Martins, Layla F.; Moura, Livia M. S.; da Silva, Amanda R.; Antunes, Luciana P.; da Silva, Aline M.; Setubal, João C.

    2017-01-01

    , using compost metagenome and metatranscriptome datasets generated in a previous study. We obtained strong evidence that five of the six recovered genomes are indeed present and active in that composting process. We have thus discovered three (perhaps four) new thermophilic bacterial species that add to the increasing repertoire of known lignocellulose degraders, whose biotechnological potential can now be investigated in further studies. PMID:28469608

  1. Genetic analysis of a bacterial genetic exchange element: The gene transfer agent of Rhodobacter capsulatus

    PubMed Central

    Lang, Andrew S.; Beatty, J. T.

    2000-01-01

    An unusual system of genetic exchange exists in the purple nonsulfur bacterium Rhodobacter capsulatus. DNA transmission is mediated by a small bacteriophage-like particle called the gene transfer agent (GTA) that transfers random 4.5-kb segments of the producing cell's genome to recipient cells, where allelic replacement occurs. This paper presents the results of gene cloning, analysis, and mutagenesis experiments that show that GTA resembles a defective prophage related to bacteriophages from diverse genera of bacteria, which has been adopted by R. capsulatus for genetic exchange. A pair of cellular proteins, CckA and CtrA, appear to constitute part of a sensor kinase/response regulator signaling pathway that is required for expression of GTA structural genes. This signaling pathway controls growth-phase-dependent regulation of GTA gene messages, yielding maximal gene expression in the stationary phase. We suggest that GTA is an ancient prophage remnant that has evolved in concert with the bacterial genome, resulting in a genetic exchange process controlled by the bacterial cell. PMID:10639170

  2. Comparing genome versus proteome-based identification of clinical bacterial isolates.

    PubMed

    Galata, Valentina; Backes, Christina; Laczny, Cédric Christian; Hemmrich-Stanisak, Georg; Li, Howard; Smoot, Laura; Posch, Andreas Emanuel; Schmolke, Susanne; Bischoff, Markus; von Müller, Lutz; Plum, Achim; Franke, Andre; Keller, Andreas

    2018-05-01

    Whole-genome sequencing (WGS) is gaining importance in the analysis of bacterial cultures derived from patients with infectious diseases. Existing computational tools for WGS-based identification have, however, been evaluated on previously defined data relying thereby unwarily on the available taxonomic information. Here, we newly sequenced 846 clinical gram-negative bacterial isolates representing multiple distinct genera and compared the performance of five tools (CLARK, Kaiju, Kraken, DIAMOND/MEGAN and TUIT). To establish a faithful 'gold standard', the expert-driven taxonomy was compared with identifications based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) analysis. Additionally, the tools were also evaluated using a data set of 200 Staphylococcus aureus isolates. CLARK and Kraken (with k = 31) performed best with 626 (100%) and 193 (99.5%) correct species classifications for the gram-negative and S. aureus isolates, respectively. Moreover, CLARK and Kraken demonstrated highest mean F-measure values (85.5/87.9% and 94.4/94.7% for the two data sets, respectively) in comparison with DIAMOND/MEGAN (71 and 85.3%), Kaiju (41.8 and 18.9%) and TUIT (34.5 and 86.5%). Finally, CLARK, Kaiju and Kraken outperformed the other tools by a factor of 30 to 170 fold in terms of runtime. We conclude that the application of nucleotide-based tools using k-mers (e.g. CLARK or Kraken) allows for accurate and fast taxonomic characterization of bacterial isolates from WGS data. Hence, our results suggest WGS-based genotyping to be a promising alternative to the MS-based biotyping in clinical settings. Moreover, we suggest that complementary information should be used for the evaluation of taxonomic classification tools, as public databases may suffer from suboptimal annotations.
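
    The abstract above compares tools by mean F-measure but does not say how the measure was averaged. As a minimal sketch, assuming a per-species F-measure (harmonic mean of precision and recall) averaged with equal weight over species, the calculation could look like the following. The function names and the toy labels are illustrative and are not taken from the study.

```python
from collections import Counter


def f_measure_per_class(true_labels, predicted_labels):
    """Return {label: F-measure} from paired true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrongly assigned
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        # F = 2PR / (P + R), which simplifies to 2TP / (2TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def mean_f_measure(true_labels, predicted_labels):
    """Unweighted mean of the per-class F-measures (an assumed averaging scheme)."""
    scores = f_measure_per_class(true_labels, predicted_labels)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    truth = ["A", "A", "B", "B"]
    calls = ["A", "B", "B", "B"]
    print(f_measure_per_class(truth, calls))
    print(mean_f_measure(truth, calls))
```

    Other averaging schemes (for example weighting each species by its number of isolates) would give different values, so this sketch should not be read as reproducing the reported figures.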

  3. Deciphering Cyanide-Degrading Potential of Bacterial Community Associated with the Coking Wastewater Treatment Plant with a Novel Draft Genome.

    PubMed

    Wang, Zhiping; Liu, Lili; Guo, Feng; Zhang, Tong

    2015-10-01

    Biotreatment processes fed with coking wastewater often encounter insufficient removal of pollutants, such as ammonia, phenols, and polycyclic aromatic hydrocarbons (PAHs), especially for cyanides. However, only a limited number of bacterial species in pure cultures have been confirmed to metabolize cyanides, which hinders the improvement of these processes. In this study, a microbial community of activated sludge enriched in a coking wastewater treatment plant was analyzed using 454 pyrosequencing and Illumina sequencing to characterize the potential cyanide-degrading bacteria. According to the classification of these pyro-tags, targeting V3/V4 regions of 16S rRNA gene, half of them were assigned to the family Xanthomonadaceae, implying that Xanthomonadaceae bacteria are well-adapted to coking wastewater. A nearly complete draft genome of the dominant bacterium was reconstructed from the metagenome of this community to explore cyanide metabolism based on analysis of the genome. The assembled 16S rRNA gene from this draft genome showed that this bacterium was a novel species of Thermomonas within Xanthomonadaceae, which was further verified by comparative genomics. The annotation using KEGG and Pfam identified genes related to cyanide metabolism, including genes responsible for the iron-harvesting system, cyanide-insensitive terminal oxidase, cyanide hydrolase/nitrilase, and thiosulfate:cyanide transferase. Phylogenetic analysis showed that these genes had homologs in previously identified genomes of bacteria within Xanthomonadaceae and even presented similar gene cassettes, thus implying an inherent cyanide-decomposing potential. The findings of this study expand our knowledge about the bacterial degradation of cyanide compounds and will be helpful in the remediation of cyanide contamination.

  4. Bacterial genomics reveal the complex epidemiology of an emerging pathogen in arctic and boreal ungulates

    USGS Publications Warehouse

    Forde, Taya L.; Orsel, Karin; Zadoks, Ruth N.; Biek, Roman; Adams, Layne G.; Checkley, Sylvia L.; Davison, Tracy; De Buck, Jeroen; Dumond, Mathieu; Elkin, Brett T.; Finnegan, Laura; Macbeth, Bryan J.; Nelson, Cait; Niptanatiak, Amanda; Sather, Shane; Schwantje, Helen M.; van der Meer, Frank; Kutz, Susan J.

    2016-01-01

    Northern ecosystems are currently experiencing unprecedented ecological change, largely driven by a rapidly changing climate. Pathogen range expansion, and emergence and altered patterns of infectious disease, are increasingly reported in wildlife at high latitudes. Understanding the causes and consequences of shifting pathogen diversity and host-pathogen interactions in these ecosystems is important for wildlife conservation, and for indigenous populations that depend on wildlife. Among the key questions are whether disease events are associated with endemic or recently introduced pathogens, and whether emerging strains are spreading throughout the region. In this study, we used a phylogenomic approach to address these questions of pathogen endemicity and spread for Erysipelothrix rhusiopathiae, an opportunistic multi-host bacterial pathogen associated with recent mortalities in arctic and boreal ungulate populations in North America. We isolated E. rhusiopathiae from carcasses associated with large-scale die-offs of muskoxen in the Canadian Arctic Archipelago, and from contemporaneous mortality events and/or population declines among muskoxen in northwestern Alaska and caribou and moose in western Canada. Bacterial genomic diversity differed markedly among these locations; minimal divergence was present among isolates from muskoxen in the Canadian Arctic, while in caribou and moose populations, strains from highly divergent clades were isolated from the same location, or even from within a single carcass. These results indicate that mortalities among northern ungulates are not associated with a single emerging strain of E. rhusiopathiae, and that alternate hypotheses need to be explored. Our study illustrates the value and limitations of bacterial genomic data for discriminating between ecological hypotheses of disease emergence, and highlights the importance of studying emerging pathogens within the broader context of environmental and host factors.

  5. Asymmetric histone modifications between the original and derived loci of human segmental duplications

    PubMed Central

    Zheng, Deyou

    2008-01-01

    Background: Sequencing and annotation of several mammalian genomes have revealed that segmental duplications are a common architectural feature of primate genomes; in fact, about 5% of the human genome is composed of large blocks of interspersed segmental duplications. These segmental duplications have been implicated in genomic copy-number variation, gene novelty, and various genomic disorders. However, the molecular processes involved in the evolution and regulation of duplicated sequences remain largely unexplored. Results: In this study, the profile of about 20 histone modifications within human segmental duplications was characterized using high-resolution, genome-wide data derived from a ChIP-Seq study. The analysis demonstrates that derivative loci of segmental duplications often differ significantly from the original with respect to many histone methylations. Further investigation showed that genes are present three times more frequently in the original than in the derivative, whereas pseudogenes exhibit the opposite trend. These asymmetries tend to increase with the age of segmental duplications. The uneven distribution of genes and pseudogenes does not, however, fully account for the asymmetry in the profile of histone modifications. Conclusion: The first systematic analysis of histone modifications between segmental duplications demonstrates that two seemingly 'identical' genomic copies are distinct in their epigenomic properties. Results here suggest that local chromatin environments may be implicated in the discrimination of derived copies of segmental duplications from their originals, leading to a biased pseudogenization of the new duplicates. The data also indicate that further exploration of the interactions between histone modification and sequence degeneration is necessary in order to understand the divergence of duplicated sequences. PMID:18598352

  6. Whole genome sequencing options for bacterial strain typing and epidemiologic analysis based on single nucleotide polymorphism versus gene-by-gene-based approaches.

    PubMed

    Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V

    2018-04-01

    Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health as well as more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing do not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
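
    The two distance measures described in the abstract above (SNP counts in a core genome alignment, and differing alleles between gene-by-gene profiles) can be illustrated with a small sketch. It assumes pre-aligned, equal-length sequences and allelic profiles stored as dictionaries; the function names, the handling of gaps and ambiguous bases, and the toy isolates are illustrative choices, not the method of any specific typing tool.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are skipped
    (an assumption; real pipelines differ in how they treat such columns).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignore and b not in ignore
    )


def allele_distance(profile_a, profile_b):
    """Count loci, among those typed in both profiles, with different alleles."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(1 for locus in shared if profile_a[locus] != profile_b[locus])


def pairwise(items, distance):
    """Return {(name_x, name_y): distance} for every pair of named items."""
    return {
        (x, y): distance(items[x], items[y])
        for x, y in combinations(sorted(items), 2)
    }


if __name__ == "__main__":
    alignment = {"iso1": "ACGTACGT", "iso2": "ACGTTCGT", "iso3": "ACGAT-GT"}
    print(pairwise(alignment, snp_distance))
    profiles = {
        "iso1": {"locus1": 1, "locus2": 4, "locus3": 2},
        "iso2": {"locus1": 1, "locus2": 5, "locus3": 3},
    }
    print(pairwise(profiles, allele_distance))
```

    As the abstract notes, how such distances translate into relatedness is organism specific, so no cutoff is built into the sketch.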

  7. Stimulation and inhibition of bacterial growth by caffeine dependent on chloramphenicol and a phenolic uncoupler--a ternary toxicity study using microfluid segment technique.

    PubMed

    Cao, Jialan; Kürsten, Dana; Schneider, Steffen; Köhler, J Michael

    2012-10-01

    A droplet-based microfluidic technique for the fast generation of three dimensional concentration spaces within nanoliter segments was introduced. The technique was applied for the evaluation of the effect of two selected antibiotic substances on the toxicity and activation of bacterial growth by caffeine. Therefore a three-dimensional concentration space was completely addressed by generating large sequences with about 1150 well separated microdroplets containing 216 different combinations of concentrations. To evaluate the toxicity of the ternary mixtures a time-resolved miniaturized optical double endpoint detection unit using a microflow-through fluorimeter and a two channel microflow-through photometer was used for the simultaneous analysis of changes on the endogenous cellular fluorescence signal and on the cell density of E. coli cultivated inside 500 nL microfluid segments. Both endpoints supplied similar results for the dose related cellular response. Strong non-linear combination effects, concentration dependent stimulation and the formation of activity summits on bolographic maps were determined. The results reflect a complex response of growing bacterial cultures in dependence on the combined effectors. A strong caffeine induced enhancement of bacterial growth was found at sublethal chloramphenicol and sublethal 2,4-dinitrophenol concentrations. The reliability of the method was proved by a high redundancy of fluidic experiments. The results indicate the importance of multi-parameter investigations for toxicological studies and prove the potential of the microsegmented flow technique for such requirements.

  8. Holotransformations of bacterial colonies and genome cybernetics

    NASA Astrophysics Data System (ADS)

    Ben-Jacob, Eshel; Tenenbaum, Adam; Shochet, Ofer; Avidan, Orna

    1994-01-01

    We present a study of colony transformations during growth of Bacillus subtilis under adverse environmental conditions. It is a continuation of our pilot study of “Adaptive self-organization during growth of bacterial colonies” (Physica A 187 (1992) 378). First we identify and describe the transformation pathway, i.e. the excitation of the branching modes from Bacillus subtilis 168 (grown under diffusion limited conditions) and the phase transformations between the tip-splitting phase (phase T) and the chiral phase (phase C) which belong to the same mode. This pathway shows the evolution of complexity as the bacteria are exposed to adverse growth conditions. We present the morphology diagram of phases T and C as a function of agar concentration and peptone level. As expected, the growth of phase T is ramified (fractal-like or DLA-like) at low peptone level (about 1 g/l) and turns compact at high peptone level (about 10 g/l). The growth of phase C is also ramified at low peptone level and turns denser and finally compact as the peptone level increases. Generally speaking, the colonies develop more complex patterns and higher micro-level organization for more adverse environments. We use the growth velocity as a response function to describe the growth. At low agar concentration (and low peptone level) phase C grows faster than phase T, and for a high agar concentration (about 2%) phase T grows faster. We observe colony transformations between the two phases (phase transformations). They are found to be consistent with the “fastest growing morphology” selection principle adopted from azoic systems. The transformations are always from the slower phase to the faster one. Hence, we observe T→ C transformations at low agar concentrations and C→ T transformations at high agar concentrations. We have observed both localized and extended transformations. Usually, the transformations are localized for more adverse growth conditions, and extended for growth conditions

  9. Bacterial communities in different locations, seasons and segments of a dairy wastewater treatment system consisting of six segments.

    PubMed

    Hirota, Kikue; Yokota, Yuji; Sekimura, Toru; Uchiumi, Hiroshi; Guo, Yong; Ohta, Hiroyuki; Yumoto, Isao

    2016-08-01

    A dairy wastewater treatment system composed of the 1st segment (no aeration) equipped with a facility for the destruction of milk fat particles, four successive aerobic treatment segments with activated sludge and a final sludge settlement segment was developed. The activated sludge is circulated through the six segments by settling sediments (activated sludge) in the 6th segment and sending the sediments back to the 1st and 2nd segments. Microbiota was examined using samples from the non-aerated 1st and aerated 2nd segments obtained from two farms using the same system in summer or winter. Principal component analysis showed that the change in microbiota from the 1st to 2nd segments concomitant with effective wastewater treatment is affected by the concentrations of activated sludge and organic matter (biological oxygen demand [BOD]), and dissolved oxygen (DO) content. Microbiota from five segments (1st and four successive aerobic segments) in one location was also examined. Although the activated sludge is circulating throughout all the segments, microbiota fluctuation was observed. The observed successive changes in microbiota reflected the changes in the concentrations of organic matter and other physicochemical conditions (such as DO), suggesting that the microbiota is flexibly changeable depending on the environmental condition in the segments. The genera Dechloromonas, Zoogloea and Leptothrix are frequently observed in this wastewater treatment system throughout the analyses of microbiota in this study. Copyright © 2016. Published by Elsevier B.V.

  10. Genomic context drives transcription of insertion sequences in the bacterial endosymbiont Wolbachia wVulC.

    PubMed

    Cerveau, Nicolas; Gilbert, Clément; Liu, Chao; Garrett, Roger A; Grève, Pierre; Bouchon, Didier; Cordaux, Richard

    2015-06-10

    Transposable elements (TEs) are DNA pieces that are present in almost all the living world at variable genomic density. Due to their mobility and density, TEs are involved in a large array of genomic modifications. In eukaryotes, TE expression has been studied in detail in several species. In prokaryotes, studies of IS expression are generally linked to particular copies that induce a modification of neighboring gene expression. Here we investigated global patterns of IS transcription in the Alphaproteobacterial endosymbiont Wolbachia wVulC, using both RT-PCR and bioinformatic analyses. We detected several transcriptional promoters in all IS groups. Nevertheless, only one of the potentially functional IS groups possesses a promoter located upstream of the transposase gene, that could lead up to the production of a functional protein. We found that the majority of IS groups are expressed whatever their functional status. RT-PCR analyses indicate that the transcription of two IS groups lacking internal promoters upstream of the transposase start codon may be driven by the genomic environment. We confirmed this observation with the transcription analysis of individual copies of one IS group. These results suggest that the genomic environment is important for IS expression and it could explain, at least partly, copy number variability of the various IS groups present in the wVulC genome and, more generally, in bacterial genomes. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. A bacterial genome in transition - an exceptional enrichment of IS elements but lack of evidence for recent transposition in the symbiont Amoebophilus asiaticus

    PubMed Central

    2011-01-01

    Background: Insertion sequence (IS) elements are important mediators of genome plasticity and are widespread among bacterial and archaeal genomes. The 1.88 Mbp genome of the obligate intracellular amoeba symbiont Amoebophilus asiaticus contains an unusually large number of transposase genes (n = 354; 23% of all genes). Results: The transposase genes in the A. asiaticus genome can be assigned to 16 different IS elements termed ISCaa1 to ISCaa16, which are represented by 2 to 24 full-length copies, respectively. Despite this high IS element load, the A. asiaticus genome displays a GC skew pattern typical for most bacterial genomes, indicating that no major rearrangements have occurred recently. Additionally, the high sequence divergence of some IS elements, the high number of truncated IS element copies (n = 143), as well as the absence of direct repeats in most IS elements suggest that the IS elements of A. asiaticus are transpositionally inactive. Although we could show transcription of 13 IS elements, we did not find experimental evidence for transpositional activity, corroborating our results from sequence analyses. However, we detected contiguous transcripts between IS elements and their downstream genes at nine loci in the A. asiaticus genome, indicating that some IS elements influence the transcription of downstream genes, some of which might be important for host cell interaction. Conclusions: Taken together, the IS elements in the A. asiaticus genome are currently in the process of degradation and largely represent reflections of the evolutionary past of A. asiaticus in which its genome was shaped by their activity. PMID:21943072
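
    The GC skew pattern mentioned in the abstract above is conventionally computed as (G - C) / (G + C) in windows along the chromosome; in many bacterial genomes its sign changes near the replication origin and terminus, which is why a regular pattern is taken as evidence against recent large rearrangements. The following is a minimal sketch under that conventional definition. The window size, function names, and toy sequence are assumptions for illustration and do not reproduce the analysis of the cited study.

```python
def gc_skew(sequence, window=1000):
    """Return (G - C) / (G + C) for consecutive non-overlapping windows.

    A trailing partial window is dropped; windows without G or C score 0.0.
    """
    seq = sequence.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_gc_skew(sequence, window=1000):
    """Running sum of the windowed skew, often plotted to locate its extremes."""
    total = 0.0
    cumulative = []
    for value in gc_skew(sequence, window):
        total += value
        cumulative.append(total)
    return cumulative


if __name__ == "__main__":
    toy = "GGGCGCCC"  # first half G-rich, second half C-rich
    print(gc_skew(toy, window=4))
    print(cumulative_gc_skew(toy, window=4))
```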

  12. A simple model for DNA bridging proteins and bacterial or human genomes: bridging-induced attraction and genome compaction

    NASA Astrophysics Data System (ADS)

    Johnson, J.; Brackley, C. A.; Cook, P. R.; Marenduzzo, D.

    2015-02-01

    We present computer simulations of the phase behaviour of an ensemble of proteins interacting with a polymer, mimicking non-specific binding to a piece of bacterial DNA or eukaryotic chromatin. The proteins can simultaneously bind to the polymer in two or more places to create protein bridges. Despite the lack of any explicit interaction between the proteins or between DNA segments, our simulations confirm previous results showing that when the protein-polymer interaction is sufficiently strong, the proteins come together to form clusters. Furthermore, a sufficiently large concentration of bridging proteins leads to the compaction of the swollen polymer into a globular phase. Here we characterise both the formation of protein clusters and the polymer collapse as a function of protein concentration, protein-polymer affinity and fibre flexibility.

  13. Complete Genome Sequence of Lactobacillus rhamnosus Strain BPL5 (CECT 8800), a Probiotic for Treatment of Bacterial Vaginosis.

    PubMed

    Chenoll, Empar; Codoñer, Francisco M; Martinez-Blanch, Juan F; Ramón, Daniel; Genovés, Salvador; Menabrito, Marco

    2016-04-21

    Lactobacillus rhamnosus BPL5 (CECT 8800) is a probiotic strain suitable for the treatment of bacterial vaginosis. Here, we report its complete genome sequence deciphered by PacBio single-molecule real-time (SMRT) technology. Analysis of the sequence may provide insight into its functional activity. Copyright © 2016 Chenoll et al.

  14. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie

    2014-06-18

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  15. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie

    2014-05-12

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  16. Genomic Encyclopedia of Bacterial and Archaeal Type Strains, Phase III: the genomes of soil and plant-associated and newly described type strains

    DOE PAGES

    Whitman, William B.; Woyke, Tanja; Klenk, Hans-Peter; ...

    2015-05-17

    The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project to sequence about 250 bacterial and archaeal genomes of elevated phylogenetic diversity. Here in this paper, we propose to extend this approach to type strains of prokaryotes associated with soil or plants and their close relatives as well as type strains from newly described species. Understanding the microbiology of soil and plants is critical to many DOE mission areas, such as biofuel production from biomass, biogeochemistry, and carbon cycling. We are also targeting type strains of novel species while they are being described. Since 2006, about 630 new species have been described per year, many of which are closely aligned to DOE areas of interest in soil, agriculture, degradation of pollutants, biofuel production, biogeochemical transformation, and biodiversity.

  17. Genomic Encyclopedia of Bacterial and Archaeal Type Strains, Phase III: the genomes of soil and plant-associated and newly described type strains

    PubMed Central

    2015-01-01

    The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project to sequence about 250 bacterial and archaeal genomes of elevated phylogenetic diversity. Herein, we propose to extend this approach to type strains of prokaryotes associated with soil or plants and their close relatives as well as type strains from newly described species. Understanding the microbiology of soil and plants is critical to many DOE mission areas, such as biofuel production from biomass, biogeochemistry, and carbon cycling. We are also targeting type strains of novel species while they are being described. Since 2006, about 630 new species have been described per year, many of which are closely aligned to DOE areas of interest in soil, agriculture, degradation of pollutants, biofuel production, biogeochemical transformation, and biodiversity. PMID:26203337

  18. Genomic Encyclopedia of Bacterial and Archaeal Type Strains, Phase III: the genomes of soil and plant-associated and newly described type strains

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whitman, William B.; Woyke, Tanja; Klenk, Hans-Peter

    The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project to sequence about 250 bacterial and archaeal genomes of elevated phylogenetic diversity. Here in this paper, we propose to extend this approach to type strains of prokaryotes associated with soil or plants and their close relatives as well as type strains from newly described species. Understanding the microbiology of soil and plants is critical to many DOE mission areas, such as biofuel production from biomass, biogeochemistry, and carbon cycling. We are also targeting type strains of novel species while they are being described. Since 2006, about 630 new species have been described per year, many of which are closely aligned to DOE areas of interest in soil, agriculture, degradation of pollutants, biofuel production, biogeochemical transformation, and biodiversity.

  19. Genome Engineering in Bacillus anthracis Using Cre Recombinase

    PubMed Central

    Pomerantsev, Andrei P.; Sitaraman, Ramakrishnan; Galloway, Craig R.; Kivovich, Violetta; Leppla, Stephen H.

    2006-01-01

    Genome engineering is a powerful method for the study of bacterial virulence. With the availability of the complete genomic sequence of Bacillus anthracis, it is now possible to inactivate or delete selected genes of interest. However, many current methods for disrupting or deleting more than one gene require use of multiple antibiotic resistance determinants. In this report we used an approach that temporarily inserts an antibiotic resistance marker into a selected region of the genome and subsequently removes it, leaving the target region (a single gene or a larger genomic segment) permanently mutated. For this purpose, a spectinomycin resistance cassette flanked by bacteriophage P1 loxP sites oriented as direct repeats was inserted within a selected gene. After identification of strains having the spectinomycin cassette inserted by a double-crossover event, a thermo-sensitive plasmid expressing Cre recombinase was introduced at the permissive temperature. Cre recombinase action at the loxP sites excised the spectinomycin marker, leaving a single loxP site within the targeted gene or genomic segment. The Cre-expressing plasmid was then removed by growth at the restrictive temperature. The procedure could then be repeated to mutate additional genes. In this way, we sequentially mutated two pairs of genes: pepM and spo0A, and mcrB and mrr. Furthermore, loxP sites introduced at distant genes could be recombined by Cre recombinase to cause deletion of large intervening regions. In this way, we deleted the capBCAD region of the pXO2 plasmid and the entire 30 kb of chromosomal DNA between the mcrB and mrr genes, and in the latter case we found that the 32 intervening open reading frames were not essential to growth. PMID:16369025
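
    The marker-removal step in the abstract above (Cre acting on two loxP sites in direct orientation excises the intervening cassette and leaves a single loxP site behind) can be modelled at the string level. The sketch below is purely illustrative: the function name is hypothetical, the site is passed in as a parameter, and the demo uses a readable placeholder token rather than the real 34-bp loxP sequence. It models only the net sequence outcome, not the recombination mechanism.

```python
def excise_between_direct_repeats(sequence, site):
    """Model excision between the first two direct-repeat copies of `site`.

    Returns (remaining, excised): `remaining` keeps one copy of the site in
    place of the deleted interval, and `excised` holds the removed segment
    together with the other copy of the site. If fewer than two copies are
    present, the sequence is returned unchanged with an empty excised part.
    """
    first = sequence.find(site)
    if first == -1:
        return sequence, ""
    second = sequence.find(site, first + len(site))
    if second == -1:
        return sequence, ""
    end_of_second = second + len(site)
    remaining = sequence[:first + len(site)] + sequence[end_of_second:]
    excised = sequence[first + len(site):end_of_second]
    return remaining, excised


if __name__ == "__main__":
    site = "[loxP]"  # placeholder token, not a real recognition sequence
    locus = "AAAA" + site + "MARKER" + site + "TTTT"
    remaining, excised = excise_between_direct_repeats(locus, site)
    print(remaining)  # flanks joined by a single site
    print(excised)    # marker plus the other site
```

    The same function applied to two sites placed at distant genes models the large deletions described at the end of the abstract, since everything between the two copies is removed.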

  20. First genomic insights into members of a candidate bacterial phylum responsible for wastewater bulking

    PubMed Central

    Ohashi, Akiko; Parks, Donovan H.; Yamauchi, Toshihiro; Tyson, Gene W.

    2015-01-01

    Filamentous cells belonging to the candidate bacterial phylum KSB3 were previously identified as the causative agent of fatal filament overgrowth (bulking) in a high-rate industrial anaerobic wastewater treatment bioreactor. Here, we obtained near complete genomes from two KSB3 populations in the bioreactor, including the dominant bulking filament, using differential coverage binning of metagenomic data. Fluorescence in situ hybridization with 16S rRNA-targeted probes specific for the two populations confirmed that both are filamentous organisms. Genome-based metabolic reconstruction and microscopic observation of the KSB3 filaments in the presence of sugar gradients indicate that both filament types are Gram-negative, strictly anaerobic fermenters capable of non-flagellar based gliding motility, and have a strikingly large number of sensory and response regulator genes. We propose that the KSB3 filaments are highly sensitive to their surroundings and that cellular processes, including those causing bulking, are controlled by external stimuli. The obtained genomes lay the foundation for a more detailed understanding of environmental cues used by KSB3 filaments, which may lead to more robust treatment options to prevent bulking. PMID:25650158

  1. Automatic Segmentation of High-Throughput RNAi Fluorescent Cellular Images

    PubMed Central

    Yan, Pingkum; Zhou, Xiaobo; Shah, Mubarak; Wong, Stephen T. C.

    2010-01-01

    High-throughput genome-wide RNA interference (RNAi) screening is emerging as an essential tool to assist biologists in understanding complex cellular processes. The large number of images produced in each study make manual analysis intractable; hence, automatic cellular image analysis becomes an urgent need, where segmentation is the first and one of the most important steps. In this paper, a fully automatic method for segmentation of cells from genome-wide RNAi screening images is proposed. Nuclei are first extracted from the DNA channel by using a modified watershed algorithm. Cells are then extracted by modeling the interaction between them as well as combining both gradient and region information in the Actin and Rac channels. A new energy functional is formulated based on a novel interaction model for segmenting tightly clustered cells with significant intensity variance and specific phenotypes. The energy functional is minimized by using a multiphase level set method, which leads to a highly effective cell segmentation method. Promising experimental results demonstrate that automatic segmentation of high-throughput genome-wide multichannel screening can be achieved by using the proposed method, which may also be extended to other multichannel image segmentation problems. PMID:18270043

  2. GenColors-based comparative genome databases for small eukaryotic genomes.

    PubMed

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.

  3. Microeconomic principles explain an optimal genome size in bacteria.

    PubMed

    Ranea, Juan A G; Grant, Alastair; Thornton, Janet M; Orengo, Christine A

    2005-01-01

    Bacteria can clearly enhance their survival by expanding their genetic repertoire. However, the tight packing of the bacterial genome and the fact that the most evolved species do not necessarily have the biggest genomes suggest there are other evolutionary factors limiting their genome expansion. To clarify these restrictions on size, we studied those protein families contributing most significantly to bacterial-genome complexity. We found that all bacteria apply the same basic and ancestral 'molecular technology' to optimize their reproductive efficiency. The same microeconomics principles that define the optimum size in a factory can also explain the existence of a statistical optimum in bacterial genome size. This optimum is reached when the bacterial genome obtains the maximum metabolic complexity (revenue) for minimal regulatory genes (logistic cost).

  4. Evolution of Salmonella-Host Cell Interactions through a Dynamic Bacterial Genome

    PubMed Central

    Ilyas, Bushra; Tsai, Caressa N.; Coombes, Brian K.

    2017-01-01

    Salmonella Typhimurium has a broad arsenal of genes that are tightly regulated and coordinated to facilitate adaptation to the various host environments it colonizes. The genome of Salmonella Typhimurium has undergone multiple gene acquisition events and has accrued changes in non-coding DNA that have undergone selection by regulatory evolution. Together, at least 17 horizontally acquired pathogenicity islands (SPIs), prophage-associated genes, and changes in core genome regulation contribute to the virulence program of Salmonella. Here, we review the latest understanding of these elements and their contributions to pathogenesis, emphasizing the regulatory circuitry that controls niche-specific gene expression. In addition to an overview of the importance of SPI-1 and SPI-2 to host invasion and colonization, we describe the recently characterized contributions of other SPIs, including the antibacterial activity of SPI-6 and adhesion and invasion mediated by SPI-4. We further discuss how these fitness traits have been integrated into the regulatory circuitry of the bacterial cell through cis-regulatory evolution and by a careful balance of silencing and counter-silencing by regulatory proteins. Detailed understanding of regulatory evolution within Salmonella is uncovering novel aspects of infection biology that relate to host-pathogen interactions and evasion of host immunity. PMID:29034217

  5. Human, Mouse, and Rat Genome Large-Scale Rearrangements: Stability Versus Speciation

    PubMed Central

    Zhao, Shaying; Shetty, Jyoti; Hou, Lihua; Delcher, Arthur; Zhu, Baoli; Osoegawa, Kazutoyo; de Jong, Pieter; Nierman, William C.; Strausberg, Robert L.; Fraser, Claire M.

    2004-01-01

    Using paired-end sequences from bacterial artificial chromosomes, we have constructed high-resolution synteny and rearrangement breakpoint maps among human, mouse, and rat genomes. Among the >300 syntenic blocks identified are segments of over 40 Mb without any detected interspecies rearrangements, as well as regions with frequently broken synteny and extensive rearrangements. As closely related species, mouse and rat share the majority of the breakpoints and often have the same types of rearrangements when compared with the human genome. However, the breakpoints not shared between them indicate that mouse rearrangements are more often interchromosomal, whereas intrachromosomal rearrangements are more prominent in rat. Centromeres may have played a significant role in reorganizing a number of chromosomes in all three species. The comparison of the three species indicates that genome rearrangements follow a path that accommodates a delicate balance between maintaining a basic structure underlying all mammalian species and permitting variations that are necessary for speciation. PMID:15364903

  6. Exploring Other Genomes: Bacteria.

    ERIC Educational Resources Information Center

    Flannery, Maura C.

    2001-01-01

    Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeruginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)

  7. Application of Chemical Genomics to Plant-Bacteria Communication: A High-Throughput System to Identify Novel Molecules Modulating the Induction of Bacterial Virulence Genes by Plant Signals.

    PubMed

    Vandelle, Elodie; Puttilli, Maria Rita; Chini, Andrea; Devescovi, Giulia; Venturi, Vittorio; Polverari, Annalisa

    2017-01-01

    The life cycle of bacterial phytopathogens consists of a benign epiphytic phase, during which the bacteria grow in the soil or on the plant surface, and a virulent endophytic phase involving the penetration of host defenses and the colonization of plant tissues. Innovative strategies are urgently required to integrate copper treatments that control the epiphytic phase with complementary tools that control the virulent endophytic phase, thus reducing the quantity of chemicals applied to economically and ecologically acceptable levels. Such strategies include targeted treatments that weaken bacterial pathogens, particularly those inhibiting early infection steps rather than tackling established infections. This chapter describes a reporter gene-based chemical genomic high-throughput screen for the induction of bacterial virulence by plant molecules. Specifically, we describe a chemical genomic screening method to identify agonist and antagonist molecules for the induction of targeted bacterial virulence genes by plant extracts, focusing on the experimental controls required to avoid false positives and thus ensuring the results are reliable and reproducible.

  8. Crop to wild introgression in lettuce: following the fate of crop genome segments in backcross populations

    PubMed Central

    2012-01-01

    Background: After crop-wild hybridization, some of the crop genomic segments may become established in wild populations through selfing of the hybrids or through backcrosses to the wild parent. This constitutes a possible route through which crop (trans)genes could become established in natural populations. The likelihood of introgression of transgenes will not only be determined by fitness effects from the transgene itself but also by the crop genes linked to it. Although lettuce is generally regarded as self-pollinating, outbreeding does occur at a low frequency. Backcrossing to wild lettuce is a likely pathway to introgression along with selfing, due to the high frequency of wild individuals relative to the rarely occurring crop-wild hybrids. To test the effect of backcrossing on the vigour of inter-specific hybrids, Lactuca serriola, the closest wild relative of cultivated lettuce, was crossed with L. sativa and the F1 hybrid was backcrossed to L. serriola to generate BC1 and BC2 populations. Experiments were conducted on progeny from selfed plants of the backcrossing families (BC1S1 and BC2S1). Plant vigour of these two backcrossing populations was determined in the greenhouse under non-stress and abiotic stress conditions (salinity, drought, and nutrient deficiency). Results: Despite the decreasing contribution of crop genomic blocks in the backcross populations, the BC1S1 and BC2S1 hybrids were characterized by a substantial genetic variation under both non-stress and stress conditions. Hybrids were identified that performed equally or better than the wild genotypes, indicating that two backcrossing events did not eliminate the effect of the crop genomic segments that contributed to the vigour of the BC1 and BC2 hybrids. QTLs for plant vigour under non-stress and the various stress conditions were detected in the two populations with positive as well as negative effects from the crop. Conclusion: As it was shown that the crop contributed QTLs with either a

  9. Crop to wild introgression in lettuce: following the fate of crop genome segments in backcross populations.

    PubMed

    Uwimana, Brigitte; Smulders, Marinus J M; Hooftman, Danny A P; Hartman, Yorike; van Tienderen, Peter H; Jansen, Johannes; McHale, Leah K; Michelmore, Richard W; Visser, Richard G F; van de Wiel, Clemens C M

    2012-03-26

    After crop-wild hybridization, some of the crop genomic segments may become established in wild populations through selfing of the hybrids or through backcrosses to the wild parent. This constitutes a possible route through which crop (trans)genes could become established in natural populations. The likelihood of introgression of transgenes will not only be determined by fitness effects from the transgene itself but also by the crop genes linked to it. Although lettuce is generally regarded as self-pollinating, outbreeding does occur at a low frequency. Backcrossing to wild lettuce is a likely pathway to introgression along with selfing, due to the high frequency of wild individuals relative to the rarely occurring crop-wild hybrids. To test the effect of backcrossing on the vigour of inter-specific hybrids, Lactuca serriola, the closest wild relative of cultivated lettuce, was crossed with L. sativa and the F(1) hybrid was backcrossed to L. serriola to generate BC(1) and BC(2) populations. Experiments were conducted on progeny from selfed plants of the backcrossing families (BC(1)S(1) and BC(2)S(1)). Plant vigour of these two backcrossing populations was determined in the greenhouse under non-stress and abiotic stress conditions (salinity, drought, and nutrient deficiency). Despite the decreasing contribution of crop genomic blocks in the backcross populations, the BC(1)S(1) and BC(2)S(1) hybrids were characterized by a substantial genetic variation under both non-stress and stress conditions. Hybrids were identified that performed equally or better than the wild genotypes, indicating that two backcrossing events did not eliminate the effect of the crop genomic segments that contributed to the vigour of the BC(1) and BC(2) hybrids. QTLs for plant vigour under non-stress and the various stress conditions were detected in the two populations with positive as well as negative effects from the crop. 
As it was shown that the crop contributed QTLs with either a positive

  10. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582

    NASA Astrophysics Data System (ADS)

    Florea, Michael; Reeve, Benjamin; Abbott, James; Freemont, Paul S.; Ellis, Tom

    2016-03-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increase cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in addition to the previously described cellulose synthase operon, ATCC 53582 contains two additional cellulose synthase operons and several previously undescribed genes associated with cellulose production. In parallel, we also develop optimized protocols and identify plasmid backbones suitable for transformation of ATCC 53582, albeit with low efficiencies. Together, these results provide important information for further studies into cellulose synthesis and for future studies aiming to genetically engineer G. hansenii ATCC 53582 for increased cellulose productivity.

  11. Comprehensive phylogenetic analysis of bacterial reverse transcriptases.

    PubMed

    Toro, Nicolás; Nisa-Martínez, Rafael

    2014-01-01

    Much less is known about reverse transcriptases (RTs) in prokaryotes than in eukaryotes, with most prokaryotic enzymes still uncharacterized. Two surveys involving BLAST searches for RT genes in prokaryotic genomes revealed the presence of large numbers of diverse, uncharacterized RTs and RT-like sequences. Here, using consistent annotation across all sequenced bacterial species from GenBank and other sources via RAST, available from the PATRIC (Pathogenic Resource Integration Center) platform, we have compiled the data for currently annotated reverse transcriptases from completely sequenced bacterial genomes. RT sequences are broadly distributed across bacterial phyla, but green sulfur bacteria and cyanobacteria have the highest levels of RT sequence diversity (≤85% identity) per genome. By contrast, phylum Actinobacteria, for which a large number of genomes have been sequenced, was found to have a low RT sequence diversity. Phylogenetic analyses revealed that bacterial RTs could be classified into 17 main groups: group II introns, retrons/retron-like RTs, diversity-generating retroelements (DGRs), Abi-like RTs, CRISPR-Cas-associated RTs, group II-like RTs (G2L), and 11 other groups of RTs of unknown function. Proteobacteria had the highest potential functional diversity, as they possessed most of the RT groups. Group II introns and DGRs were the most widely distributed RTs in bacterial phyla. Our results provide insights into bacterial RT phylogeny and the basis for an update of annotation systems based on sequence/domain homology.

  12. Comprehensive Phylogenetic Analysis of Bacterial Reverse Transcriptases

    PubMed Central

    Toro, Nicolás; Nisa-Martínez, Rafael

    2014-01-01

    Much less is known about reverse transcriptases (RTs) in prokaryotes than in eukaryotes, with most prokaryotic enzymes still uncharacterized. Two surveys involving BLAST searches for RT genes in prokaryotic genomes revealed the presence of large numbers of diverse, uncharacterized RTs and RT-like sequences. Here, using consistent annotation across all sequenced bacterial species from GenBank and other sources via RAST, available from the PATRIC (Pathogenic Resource Integration Center) platform, we have compiled the data for currently annotated reverse transcriptases from completely sequenced bacterial genomes. RT sequences are broadly distributed across bacterial phyla, but green sulfur bacteria and cyanobacteria have the highest levels of RT sequence diversity (≤85% identity) per genome. By contrast, phylum Actinobacteria, for which a large number of genomes have been sequenced, was found to have a low RT sequence diversity. Phylogenetic analyses revealed that bacterial RTs could be classified into 17 main groups: group II introns, retrons/retron-like RTs, diversity-generating retroelements (DGRs), Abi-like RTs, CRISPR-Cas-associated RTs, group II-like RTs (G2L), and 11 other groups of RTs of unknown function. Proteobacteria had the highest potential functional diversity, as they possessed most of the RT groups. Group II introns and DGRs were the most widely distributed RTs in bacterial phyla. Our results provide insights into bacterial RT phylogeny and the basis for an update of annotation systems based on sequence/domain homology. PMID:25423096

  13. Coordination of genomic structure and transcription by the main bacterial nucleoid-associated protein HU

    PubMed Central

    Berger, Michael; Farcas, Anca; Geertz, Marcel; Zhelyazkova, Petya; Brix, Klaudia; Travers, Andrew; Muskhelishvili, Georgi

    2010-01-01

    The histone-like protein HU is a highly abundant DNA architectural protein that is involved in compacting the DNA of the bacterial nucleoid and in regulating the main DNA transactions, including gene transcription. However, the coordination of the genomic structure and function by HU is poorly understood. Here, we address this question by comparing transcript patterns and spatial distributions of RNA polymerase in Escherichia coli wild-type and hupA/B mutant cells. We demonstrate that, in mutant cells, upregulated genes are preferentially clustered in a large chromosomal domain comprising the ribosomal RNA operons organized on both sides of OriC. Furthermore, we show that, in parallel to this transcription asymmetry, mutant cells are also impaired in forming the transcription foci—spatially confined aggregations of RNA polymerase molecules transcribing strong ribosomal RNA operons. Our data thus implicate HU in coordinating the global genomic structure and function by regulating the spatial distribution of RNA polymerase in the nucleoid. PMID:20010798

  14. Construction of a nurse shark (Ginglymostoma cirratum) bacterial artificial chromosome (BAC) library and a preliminary genome survey.

    PubMed

    Luo, Meizhong; Kim, Hyeran; Kudrna, Dave; Sisneros, Nicholas B; Lee, So-Jeong; Mueller, Christopher; Collura, Kristi; Zuccolo, Andrea; Buckingham, E Bryan; Grim, Suzanne M; Yanagiya, Kazuyo; Inoko, Hidetoshi; Shiina, Takashi; Flajnik, Martin F; Wing, Rod A; Ohta, Yuko

    2006-05-03

    Sharks are members of the taxonomic class Chondrichthyes, the oldest living jawed vertebrates. Genomic studies of this group, in comparison to representative species in other vertebrate taxa, will allow us to theorize about the fundamental genetic, developmental, and functional characteristics in the common ancestor of all jawed vertebrates. In order to obtain mapping and sequencing data for comparative genomics, we constructed a bacterial artificial chromosome (BAC) library for the nurse shark, Ginglymostoma cirratum. The BAC library consists of 313,344 clones with an average insert size of 144 kb, covering ~4.5 × 10^10 bp and thus providing an 11-fold coverage of the haploid genome. BAC end sequence analyses revealed, in addition to LINEs and SINEs commonly found in other animal and plant genomes, two new groups of nurse shark-specific repetitive elements, NSRE1 and NSRE2, that seem to be major components of the nurse shark genome. Screening the library with single-copy or multi-copy gene probes showed 6-28 primary positive clones per probe of which 50-90% were true positives, demonstrating that the BAC library is representative of the different regions of the nurse shark genome. Furthermore, some BAC clones contained multiple genes, making physical mapping feasible. We have constructed a deep-coverage, high-quality, large insert, and publicly available BAC library for a cartilaginous fish. It will be very useful to the scientific community interested in shark genomic structure, comparative genomics, and functional studies. We found two new groups of repetitive elements specific to the nurse shark genome, which may contribute to the architecture and evolution of the nurse shark genome.
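    The coverage figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract (313,344 clones, 144 kb mean insert, 11-fold coverage); the function names are hypothetical, and the haploid genome size is not given in the abstract, so the value computed here is only what those three numbers imply.

```python
def library_total_bp(n_clones: int, mean_insert_bp: int) -> int:
    """Total DNA held in the library: clone count times mean insert size."""
    return n_clones * mean_insert_bp


def implied_genome_size_bp(total_bp: int, fold_coverage: float) -> float:
    """Haploid genome size implied by a stated fold coverage."""
    return total_bp / fold_coverage


total = library_total_bp(313_344, 144_000)   # 144 kb average insert
genome = implied_genome_size_bp(total, 11)   # 11-fold coverage, as stated

print(f"library content: {total:.2e} bp")
print(f"implied haploid genome: {genome / 1e9:.1f} Gb")
```

    The product comes to about 4.5 × 10^10 bp, matching the abstract, and dividing by the stated coverage implies a haploid genome of roughly 4 Gb.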

  15. Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

    PubMed

    Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

    2018-02-01

    This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.

  16. A Bioinformatic Strategy for the Detection, Classification and Analysis of Bacterial Autotransporters

    PubMed Central

    Celik, Nermin; Webb, Chaille T.; Leyton, Denisse L.; Holt, Kathryn E.; Heinz, Eva; Gorrell, Rebecca; Kwok, Terry; Naderer, Thomas; Strugnell, Richard A.; Speed, Terence P.; Teasdale, Rohan D.; Likić, Vladimir A.; Lithgow, Trevor

    2012-01-01

    Autotransporters are secreted proteins that are assembled into the outer membrane of bacterial cells. The passenger domains of autotransporters are crucial for bacterial pathogenesis, with some remaining attached to the bacterial surface while others are released by proteolysis. An enigma remains as to whether autotransporters should be considered a class of secretion system, or simply a class of substrate with peculiar requirements for their secretion. We sought to establish a sensitive search protocol that could identify and characterize diverse autotransporters from bacterial genome sequence data. The new sequence analysis pipeline identified more than 1500 autotransporter sequences from diverse bacteria, including numerous species of Chlamydiales and Fusobacteria as well as all classes of Proteobacteria. Interrogation of the proteins revealed that there are numerous classes of passenger domains beyond the known proteases, adhesins and esterases. In addition, the barrel-domain, a characteristic feature of autotransporters, was found to be composed from seven conserved sequence segments that can be arranged in multiple ways in the tertiary structure of the assembled autotransporter. One of these conserved motifs overlays the targeting information required for autotransporters to reach the outer membrane. Another conserved and diagnostic motif maps to the linker region between the passenger domain and barrel-domain, indicating it as an important feature in the assembly of autotransporters. PMID:22905239

  17. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    PubMed

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing.

  18. Genomic Analysis of Caldithrix abyssi, the Thermophilic Anaerobic Bacterium of the Novel Bacterial Phylum Calditrichaeota

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kublanov, Ilya V.; Sigalova, Olga M.; Gavrilov, Sergey N.

    The genome of Caldithrix abyssi, the first cultivated representative of a phylum-level bacterial lineage, was sequenced within the framework of Genomic Encyclopedia of Bacteria and Archaea (GEBA) project. The genomic analysis revealed mechanisms allowing this anaerobic bacterium to ferment peptides or to implement nitrate reduction with acetate or molecular hydrogen as electron donors. The genome encoded five different [NiFe]- and [FeFe]-hydrogenases, one of which, group 1 [NiFe]-hydrogenase, is presumably involved in lithoheterotrophic growth, three others produce H2 during fermentation, and one is apparently bidirectional. The ability to reduce nitrate is determined by a nitrate reductase of the Nap family, while nitrite reduction to ammonia is presumably catalyzed by an octaheme cytochrome c nitrite reductase εHao. The genome contained genes of respiratory polysulfide/thiosulfate reductase, however, elemental sulfur and thiosulfate were not used as the electron acceptors for anaerobic respiration with acetate or H2, probably due to the lack of the gene of the maturation protein. Nevertheless, elemental sulfur and thiosulfate stimulated growth on fermentable substrates (peptides), being reduced to sulfide, most probably through the action of the cytoplasmic sulfide dehydrogenase and/or NAD(P)-dependent [NiFe]-hydrogenase (sulfhydrogenase) encoded by the genome. Surprisingly, the genome of this anaerobic microorganism encoded all genes for cytochrome c oxidase, however, its maturation machinery seems to be non-operational due to genomic rearrangements of supplementary genes. Despite the fact that sugars were not among the substrates reported when C. abyssi was first described, our genomic analysis revealed multiple genes of glycoside hydrolases, and some of them were predicted to be secreted. This finding aided in bringing out four carbohydrates that supported the growth of C. abyssi: starch, cellobiose, glucomannan and xyloglucan. The genomic analysis

  19. Genomic Analysis of Caldithrix abyssi, the Thermophilic Anaerobic Bacterium of the Novel Bacterial Phylum Calditrichaeota

    DOE PAGES

    Kublanov, Ilya V.; Sigalova, Olga M.; Gavrilov, Sergey N.; ...

    2017-02-20

    The genome of Caldithrix abyssi, the first cultivated representative of a phylum-level bacterial lineage, was sequenced within the framework of Genomic Encyclopedia of Bacteria and Archaea (GEBA) project. The genomic analysis revealed mechanisms allowing this anaerobic bacterium to ferment peptides or to implement nitrate reduction with acetate or molecular hydrogen as electron donors. The genome encoded five different [NiFe]- and [FeFe]-hydrogenases, one of which, group 1 [NiFe]-hydrogenase, is presumably involved in lithoheterotrophic growth, three others produce H2 during fermentation, and one is apparently bidirectional. The ability to reduce nitrate is determined by a nitrate reductase of the Nap family, while nitrite reduction to ammonia is presumably catalyzed by an octaheme cytochrome c nitrite reductase εHao. The genome contained genes of respiratory polysulfide/thiosulfate reductase, however, elemental sulfur and thiosulfate were not used as the electron acceptors for anaerobic respiration with acetate or H2, probably due to the lack of the gene of the maturation protein. Nevertheless, elemental sulfur and thiosulfate stimulated growth on fermentable substrates (peptides), being reduced to sulfide, most probably through the action of the cytoplasmic sulfide dehydrogenase and/or NAD(P)-dependent [NiFe]-hydrogenase (sulfhydrogenase) encoded by the genome. Surprisingly, the genome of this anaerobic microorganism encoded all genes for cytochrome c oxidase, however, its maturation machinery seems to be non-operational due to genomic rearrangements of supplementary genes. Despite the fact that sugars were not among the substrates reported when C. abyssi was first described, our genomic analysis revealed multiple genes of glycoside hydrolases, and some of them were predicted to be secreted. This finding aided in bringing out four carbohydrates that supported the growth of C. abyssi: starch, cellobiose, glucomannan and xyloglucan. The genomic analysis

  20. Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers.

    PubMed

    Varshney, Rajeev K; Chen, Wenbin; Li, Yupeng; Bharti, Arvind K; Saxena, Rachit K; Schlueter, Jessica A; Donoghue, Mark T A; Azam, Sarwar; Fan, Guangyi; Whaley, Adam M; Farmer, Andrew D; Sheridan, Jaime; Iwata, Aiko; Tuteja, Reetu; Penmetsa, R Varma; Wu, Wei; Upadhyaya, Hari D; Yang, Shiaw-Pyng; Shah, Trushar; Saxena, K B; Michael, Todd; McCombie, W Richard; Yang, Bicheng; Zhang, Gengyun; Yang, Huanming; Wang, Jun; Spillane, Charles; Cook, Douglas R; May, Gregory D; Xu, Xun; Jackson, Scott A

    2011-11-06

    Pigeonpea is an important legume food crop grown primarily by smallholder farmers in many semi-arid tropical regions of the world. We used the Illumina next-generation sequencing platform to generate 237.2 Gb of sequence, which along with Sanger-based bacterial artificial chromosome end sequences and a genetic map, we assembled into scaffolds representing 72.7% (605.78 Mb) of the 833.07 Mb pigeonpea genome. Genome analysis predicted 48,680 genes for pigeonpea and also showed the potential role that certain gene families, for example, drought tolerance-related genes, have played throughout the domestication of pigeonpea and the evolution of its ancestors. Although we found a few segmental duplication events, we did not observe the recent genome-wide duplication events observed in soybean. This reference genome sequence will facilitate the identification of the genetic basis of agronomically important traits, and accelerate the development of improved pigeonpea varieties that could improve food security in many developing countries.

  1. Plant growth-promoting bacterial endophytes.

    PubMed

    Santoyo, Gustavo; Moreno-Hagelsieb, Gabriel; Orozco-Mosqueda, Ma del Carmen; Glick, Bernard R

    2016-02-01

    Bacterial endophytes ubiquitously colonize the internal tissues of plants, being found in nearly every plant worldwide. Some endophytes are able to promote the growth of plants. For those strains the mechanisms of plant growth-promotion known to be employed by bacterial endophytes are similar to the mechanisms used by rhizospheric bacteria, e.g., the acquisition of resources needed for plant growth and modulation of plant growth and development. Similar to rhizospheric plant growth-promoting bacteria, endophytic plant growth-promoting bacteria can act to facilitate plant growth in agriculture, horticulture and silviculture as well as in strategies for environmental cleanup (i.e., phytoremediation). Genome comparisons between bacterial endophytes and the genomes of rhizospheric plant growth-promoting bacteria are starting to unveil potential genetic factors involved in an endophytic lifestyle, which should facilitate a better understanding of the functioning of bacterial endophytes. Copyright © 2015 Elsevier GmbH. All rights reserved.

  2. GI-SVM: A sensitive method for predicting genomic islands based on unannotated sequence of a single genome.

    PubMed

    Lu, Bingxin; Leong, Hon Wai

    2016-02-01

    Genomic islands (GIs) are clusters of functionally related genes acquired by lateral genetic transfer (LGT), and they are present in many bacterial genomes. GIs are extremely important for bacterial research, because they not only promote genome evolution but also contain genes that enhance adaptation and enable antibiotic resistance. Many methods have been proposed to predict GIs, but most of them rely on either annotations or comparisons with other closely related genomes. Hence these methods cannot be easily applied to new genomes. As the number of newly sequenced bacterial genomes rapidly increases, there is a need for methods to detect GIs based solely on the sequence of a single genome. In this paper, we propose a novel method, GI-SVM, to predict GIs given only the unannotated genome sequence. GI-SVM is based on a one-class support vector machine (SVM), utilizing composition bias in terms of k-mer content. From our evaluations on three real genomes, GI-SVM can achieve higher recall compared with current methods, without much loss of precision. Besides, GI-SVM allows flexible parameter tuning to get optimal results for each genome. In short, GI-SVM provides a more sensitive method for researchers interested in a first-pass detection of GIs in newly sequenced genomes.
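
The k-mer composition features mentioned in the GI-SVM abstract can be illustrated with a minimal sketch. It only shows how per-window k-mer frequency vectors might be computed from an unannotated sequence; the function names, window size, step, and k are illustrative assumptions, not GI-SVM's actual parameters or code. In a pipeline of this kind, such vectors would then be passed to a one-class SVM, which is not shown here.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Return the normalized k-mer frequency vector of seq over the ACGT alphabet.

    k-mers containing other symbols (for example N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def window_features(genome, window=5000, step=2500, k=4):
    """Return a list of (start, frequency_vector) for sliding windows.

    window, step and k are hypothetical defaults chosen for illustration.
    """
    feats = []
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        feats.append((start, kmer_frequencies(genome[start:start + window], k)))
    return feats
```

Windows whose vectors deviate strongly from the genome-wide composition would be the candidates an outlier detector flags; how GI-SVM scores and merges such windows is not described in the abstract.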

  3. Involvement of β-carbonic anhydrase (β-CA) genes in bacterial genomic islands and horizontal transfer to protists.

    PubMed

    Zolfaghari Emameh, Reza; Barker, Harlan R; Hytönen, Vesa P; Parkkila, Seppo

    2018-05-25

    Genomic islands (GIs) are a type of mobile genetic element (MGE) that are present in bacterial chromosomes. They consist of a cluster of genes which produce proteins that contribute to a variety of functions, including, but not limited to, regulation of cell metabolism, anti-microbial resistance, pathogenicity, virulence, and resistance to heavy metals. The genes carried in MGEs can be used as a trait reservoir in times of adversity. Transfer of genes using MGEs, occurring outside of reproduction, is called horizontal gene transfer (HGT). Previous literature has shown that numerous HGT events have occurred through endosymbiosis between prokaryotes and eukaryotes. Beta carbonic anhydrase (β-CA) enzymes play a critical role in the biochemical pathways of many prokaryotes and eukaryotes. We have previously suggested horizontal transfer of β-CA genes from plasmids of some prokaryotic endosymbionts to their protozoan hosts. In this study, we set out to identify β-CA genes that might have transferred between prokaryotic and protist species through HGT in GIs. Therefore, we investigated prokaryotic chromosomes containing β-CA-encoding GIs and utilized multiple bioinformatics tools to reveal the distinct movements of β-CA genes among a wide variety of organisms. Our results identify the presence of β-CA genes in GIs of several medically and industrially relevant bacterial species, and phylogenetic analyses reveal multiple cases of likely horizontal transfer of β-CA genes from GIs of ancestral prokaryotes to protists. IMPORTANCE The evolutionary process is mediated by mobile genetic elements (MGEs), such as genomic islands (GIs). A gene or set of genes in the GIs are exchanged between and within various species through horizontal gene transfer (HGT). Based on the crucial role that GIs can play in bacterial survival and proliferation, they were introduced as the environmental- and pathogen-associated factors. Carbonic anhydrases (CAs) are involved in many critical

  4. Incompatibility and competitive exclusion of genomic segments between sibling Drosophila species.

    PubMed

    Fang, Shu; Yukilevich, Roman; Chen, Ying; Turissini, David A; Zeng, Kai; Boussy, Ian A; Wu, Chung-I

    2012-06-01

    The extent and nature of genetic incompatibilities between incipient races and sibling species is of fundamental importance to our view of speciation. However, with the exception of hybrid inviability and sterility factors, little is known about the extent of other, more subtle genetic incompatibilities between incipient species. Here we experimentally demonstrate the prevalence of such genetic incompatibilities between two young allopatric sibling species, Drosophila simulans and D. sechellia. Our experiments took advantage of 12 introgression lines that carried random introgressed D. sechellia segments in different parts of the D. simulans genome. First, we found that these introgression lines did not show any measurable sterility or inviability effects. To study if these sechellia introgressions in a simulans background contained other fitness consequences, we competed and genetically tracked the marked alleles within each introgression against the wild-type alleles for 20 generations. Strikingly, all marked D. sechellia introgression alleles rapidly decreased in frequency in only 6 to 7 generations. We then developed computer simulations to model our competition results. These simulations indicated that selection against D. sechellia introgression alleles was high (average s = 0.43) and that the marker alleles and the incompatible alleles did not separate in 78% of the introgressions. The latter result likely implies that most introgressions contain multiple genetic incompatibilities. Thus, this study reveals that, even at early stages of speciation, many parts of the genome diverge to a point where introducing foreign elements has detrimental fitness consequences, but which cannot be seen using standard sterility and inviability assays.
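
To give a feel for how quickly an allele declines under selection as strong as the reported average s = 0.43, here is a minimal deterministic sketch. It is a textbook haploid selection recursion, not the authors' simulation model (which the abstract does not describe), and it ignores drift, diploidy, and recombination; the function names and the starting frequency are assumptions for illustration.

```python
def next_frequency(p, s):
    """One generation of haploid selection.

    The marked allele has relative fitness 1 - s; the wild type has fitness 1.
    """
    return p * (1.0 - s) / (1.0 - s * p)


def trajectory(p0, s, generations):
    """Return allele frequencies for generation 0 through `generations`."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


if __name__ == "__main__":
    # Hypothetical starting frequency of 0.5; s taken from the abstract.
    for gen, p in enumerate(trajectory(0.5, 0.43, 7)):
        print(gen, round(p, 4))
```

Under this simplified model the odds p/(1 - p) shrink by a factor of 1 - s each generation, so a strongly selected allele falls to a low frequency within a handful of generations, which is qualitatively in line with the rapid decline described above.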

  5. Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

    PubMed Central

    Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2014-01-01

    Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger. PMID:24608928
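
The relationship between clone count, insert size, and library coverage can be sketched as follows. The coverage is simply total cloned DNA divided by genome size, and the probability of recovering a given sequence is computed here with the commonly used Clarke-Carbon approximation. Both function names are hypothetical, and the abstract states neither the genome size nor the formula the authors used, so this idealized calculation should not be expected to reproduce the reported 98.86% exactly.

```python
def genome_equivalents(n_clones, insert_bp, genome_bp):
    """Library coverage: total cloned bases divided by haploid genome size."""
    return n_clones * insert_bp / genome_bp


def clarke_carbon_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given sequence is in at least one clone.

    Idealized Clarke-Carbon model: clones are independent, random, and all
    carry inserts of the same length.
    """
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones
```

For example, `genome_equivalents(153600, 116_500, genome_bp)` gives the fold coverage for the clone count and mean insert size reported above once a genome size estimate is supplied for `genome_bp`.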

  6. Potential for La Crosse virus segment reassortment in nature

    PubMed Central

    Reese, Sara M; Blitvich, Bradley J; Blair, Carol D; Geske, Dave; Beaty, Barry J; Black, William C

    2008-01-01

    The evolutionary success of La Crosse virus (LACV, family Bunyaviridae) is due to its ability to adapt to changing conditions through intramolecular genetic changes and segment reassortment. Vertical transmission of LACV in mosquitoes increases the potential for segment reassortment. Studies were conducted to determine if segment reassortment was occurring in naturally infected Aedes triseriatus from Wisconsin and Minnesota in 2000, 2004, 2006 and 2007. Mosquito eggs were collected from various sites in Wisconsin and Minnesota. They were reared in the laboratory and adults were tested for LACV antigen by immunofluorescence assay. RNA was isolated from the abdomen of infected mosquitoes and portions of the small (S), medium (M) and large (L) viral genome segments were amplified by RT-PCR and sequenced. Overall, the viral sequences from 40 infected mosquitoes and 5 virus isolates were analyzed. Phylogenetic and linkage disequilibrium analyses revealed that approximately 25% of infected mosquitoes and viruses contained reassorted genome segments, suggesting that LACV segment reassortment is frequent in nature. PMID:19114023

  7. GenomeVista

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Poliakov, Alexander; Couronne, Olivier

    2002-11-04

    Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.

  8. The diversity of sequence and chromosomal distribution of new transposable element-related segments in the rye genome revealed by FISH and lineage annotation

    USDA-ARS's Scientific Manuscript database

    The rye genome features a high percentage of repetitive elements, especially transposable elements (TEs). However, studies about the constitution and organization of TEs on rye chromosomes are limited. In this study, 97 unique TE segments were isolated and characterized; 50 TE segments showed varyin...

  9. Identifying Bacterial Immune Evasion Proteins Using Phage Display.

    PubMed

    Fevre, Cindy; Scheepmaker, Lisette; Haas, Pieter-Jan

    2017-01-01

    Methods aimed at identification of immune evasion proteins mainly rely on in silico prediction of sequence, structural homology to known evasion proteins or use a proteomics-driven approach. Although proven successful, these methods are limited by a low efficiency and/or lack of functional identification. Here we describe a high-throughput genomic strategy to functionally identify bacterial immune evasion proteins using phage display technology. Genomic bacterial DNA is randomly fragmented and ligated into a phage display vector that is used to create a phage display library expressing bacterial secreted and membrane bound proteins. This library is used to select displayed bacterial secretome proteins that interact with host immune components.

  10. The genome of Th17 cell-inducing segmented filamentous bacteria reveals extensive auxotrophy and adaptations to the intestinal environment

    PubMed Central

    Sczesnak, Andrew; Segata, Nicola; Qin, Xiang; Gevers, Dirk; Petrosino, Joseph F.; Huttenhower, Curtis; Littman, Dan R.; Ivanov, Ivaylo I.

    2011-01-01

    Summary Perturbations of the composition of the symbiotic intestinal microbiota can have profound consequences for host metabolism and immunity. In mice, segmented filamentous bacteria (SFB) direct the accumulation of potentially pro-inflammatory Th17 cells in the intestinal lamina propria. We present the genome sequence of SFB isolated from mono-colonized mice, which classifies SFB phylogenetically as a unique member of Clostridiales with a highly reduced genome. Annotation analysis demonstrates that SFB depends on its environment for amino acids and essential nutrients and may utilize host and dietary glycans for carbon, nitrogen, and energy. Comparative analyses reveal that SFB is functionally related to members of the genus Clostridium and several pathogenic or commensal “minimal” genera, including Finegoldia, Mycoplasma, Borrelia, and Phytoplasma. However, SFB is functionally distinct from all 1,200 examined genomes, indicating a gene complement representing biology relatively unique to its role as a gut commensal closely tied to host metabolism and immunity. PMID:21925113

  11. Rapid Bacterial Whole-Genome Sequencing to Enhance Diagnostic and Public Health Microbiology

    PubMed Central

    Reuter, Sandra; Ellington, Matthew J.; Cartwright, Edward J. P.; Köser, Claudio U.; Török, M. Estée; Gouliouris, Theodore; Harris, Simon R.; Brown, Nicholas M.; Holden, Matthew T. G.; Quail, Mike; Parkhill, Julian; Smith, Geoffrey P.; Bentley, Stephen D.; Peacock, Sharon J.

    2014-01-01

    IMPORTANCE The latest generation of benchtop DNA sequencing platforms can provide an accurate whole-genome sequence (WGS) for a broad range of bacteria in less than a day. These could be used to more effectively contain the spread of multidrug-resistant pathogens. OBJECTIVE To compare WGS with standard clinical microbiology practice for the investigation of nosocomial outbreaks caused by multidrug-resistant bacteria, the identification of genetic determinants of antimicrobial resistance, and typing of other clinically important pathogens. DESIGN, SETTING, AND PARTICIPANTS A laboratory-based study of hospital inpatients with a range of bacterial infections at Cambridge University Hospitals NHS Foundation Trust, a secondary and tertiary referral center in England, comparing WGS with standard diagnostic microbiology using stored bacterial isolates and clinical information. MAIN OUTCOMES AND MEASURES Specimens were taken and processed as part of routine clinical care, and cultured isolates stored and referred for additional reference laboratory testing as necessary. Isolates underwent DNA extraction and library preparation prior to sequencing on the Illumina MiSeq platform. Bioinformatic analyses were performed by persons blinded to the clinical, epidemiologic, and antimicrobial susceptibility data. RESULTS We investigated 2 putative nosocomial outbreaks, one caused by vancomycin-resistant Enterococcus faecium and the other by carbapenem-resistant Enterobacter cloacae; WGS accurately discriminated between outbreak and nonoutbreak isolates and was superior to conventional typing methods. We compared WGS with standard methods for the identification of the mechanism of carbapenem resistance in a range of gram-negative bacteria (Acinetobacter baumannii, E cloacae, Escherichia coli, and Klebsiella pneumoniae). This demonstrated concordance between phenotypic and genotypic results, and the ability to determine whether resistance was attributable to the presence of

  12. Cytotoxic chromosomal targeting by CRISPR/Cas systems can reshape bacterial genomes and expel or remodel pathogenicity islands.

    PubMed

    Vercoe, Reuben B; Chang, James T; Dy, Ron L; Taylor, Corinda; Gristwood, Tamzin; Clulow, James S; Richter, Corinna; Przybilski, Rita; Pitman, Andrew R; Fineran, Peter C

    2013-04-01

    In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas-mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA-targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity.

  13. Listeria Genomics

    NASA Astrophysics Data System (ADS)

    Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

    The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.

  14. Microbial Genomes Multiply

    NASA Technical Reports Server (NTRS)

    Doolittle, Russell F.

    2002-01-01

    The publication of the first complete sequence of a bacterial genome in 1995 was a signal event, underscored by the fact that the article has been cited more than 2,100 times during the intervening seven years. It was a marvelous technical achievement, made possible by automatic DNA-sequencing machines. The feat is the more impressive in that complete genome sequencing has now been adopted in many different laboratories around the world. Four years ago in these columns I examined the situation after a dozen microbial genomes had been completed. Now, with upwards of 60 microbial genome sequences determined and twice that many in progress, it seems reasonable to assess just what is being learned. Are new concepts emerging about how cells work? Have there been practical benefits in the fields of medicine and agriculture? Is it feasible to determine the genomic sequence of every bacterial species on Earth? The answers to these questions may be Yes, Perhaps, and No, respectively.

  15. Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

    PubMed Central

    Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

    2016-01-01

    ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including

  16. Genome-wide detection of conservative site-specific recombination in bacteria

    PubMed Central

    Mathias Garrett, Elizabeth; Camilli, Andrew

    2018-01-01

    The ability of clonal bacterial populations to generate genomic and phenotypic heterogeneity is thought to be of great importance for many commensal and pathogenic bacteria. One common mechanism contributing to diversity formation relies on the inversion of small genomic DNA segments in a process commonly referred to as conservative site-specific recombination. This phenomenon is known to occur in several bacterial lineages, however it remains notoriously difficult to identify due to the lack of conserved features. Here, we report an easy-to-implement method based on high-throughput paired-end sequencing for genome-wide detection of conservative site-specific recombination on a single-nucleotide level. We demonstrate the effectiveness of the method by successfully detecting several novel inversion sites in an epidemic isolate of the enteric pathogen Clostridium difficile. Using an experimental approach, we validate the inversion potential of all detected sites in C. difficile and quantify their prevalence during exponential and stationary growth in vitro. In addition, we demonstrate that the master recombinase RecV is responsible for the inversion of some but not all invertible sites. Using a fluorescent gene-reporter system, we show that at least one gene from a two-component system located next to an invertible site is expressed in an on-off mode reminiscent of phase variation. We further demonstrate the applicability of our method by mining 209 publicly available sequencing datasets and show that conservative site-specific recombination is common in the bacterial realm but appears to be absent in some lineages. Finally, we show that the gene content associated with the inversion sites is diverse and goes beyond traditionally described surface components. Overall, our method provides a robust platform for detection of conservative site-specific recombination in bacteria and opens a new avenue for global exploration of this important phenomenon. PMID:29621238

  17. Genomics reveals historic and contemporary transmission dynamics of a bacterial disease among wildlife and livestock

    USGS Publications Warehouse

    Kamath, Pauline L.; Foster, Jeffrey T.; Drees, Kevin P.; Luikart, Gordon; Quance, Christine; Anderson, Neil J.; Clarke, P. Ryan; Cole, Eric K.; Drew, Mark L.; Edwards, William H.; Rhyan, Jack C.; Treanor, John J.; Wallen, Rick L.; White, Patrick J.; Robbe-Austerman, Suelee; Cross, Paul C.

    2016-01-01

    Whole-genome sequencing has provided fundamental insights into infectious disease epidemiology, but has rarely been used for examining transmission dynamics of a bacterial pathogen in wildlife. In the Greater Yellowstone Ecosystem (GYE), outbreaks of brucellosis have increased in cattle along with rising seroprevalence in elk. Here we use a genomic approach to examine Brucella abortus evolution, cross-species transmission and spatial spread in the GYE. We find that brucellosis was introduced into wildlife in this region at least five times. The diffusion rate varies among Brucella lineages (~3 to 8 km per year) and over time. We also estimate 12 host transitions from bison to elk, and 5 from elk to bison. Our results support the notion that free-ranging elk are currently a self-sustaining brucellosis reservoir and the source of livestock infections, and that control measures in bison are unlikely to affect the dynamics of unrelated strains circulating in nearby elk populations.

  18. Sequence Segmentation with changeptGUI.

    PubMed

    Tasker, Edward; Keith, Jonathan M

    2017-01-01

    Many biological sequences have a segmental structure that can provide valuable clues to their content, structure, and function. The program changept is a tool for investigating the segmental structure of a sequence, and can also be applied to multiple sequences in parallel to identify a common segmental structure, thus providing a method for integrating multiple data types to identify functional elements in genomes. In the previous edition of this book, a command line interface for changept is described. Here we present a graphical user interface for this package, called changeptGUI. This interface also includes tools for pre- and post-processing of data and results to facilitate investigation of the number and characteristics of segment classes.

  19. SeeGH--a software tool for visualization of whole genome array comparative genomic hybridization data.

    PubMed

    Chi, Bryan; DeLeeuw, Ronald J; Coe, Bradley P; MacAulay, Calum; Lam, Wan L

    2004-02-09

    Array comparative genomic hybridization (CGH) is a technique which detects copy number differences in DNA segments. Complete sequencing of the human genome and the development of an array representing a tiling set of tens of thousands of DNA segments spanning the entire human genome has made high resolution copy number analysis throughout the genome possible. Since array CGH provides signal ratio for each DNA segment, visualization would require the reassembly of individual data points into chromosome profiles. We have developed a visualization tool for displaying whole genome array CGH data in the context of chromosomal location. SeeGH is an application that translates spot signal ratio data from array CGH experiments to displays of high resolution chromosome profiles. Data is imported from a simple tab delimited text file obtained from standard microarray image analysis software. SeeGH processes the signal ratio data and graphically displays it in a conventional CGH karyotype diagram with the added features of magnification and DNA segment annotation. In this process, SeeGH imports the data into a database, calculates the average ratio and standard deviation for each replicate spot, and links them to chromosome regions for graphical display. Once the data is displayed, users have the option of hiding or flagging DNA segments based on user defined criteria, and retrieve annotation information such as clone name, NCBI sequence accession number, ratio, base pair position on the chromosome, and standard deviation. SeeGH represents a novel software tool used to view and analyze array CGH data. The software gives users the ability to view the data in an overall genomic view as well as magnify specific chromosomal regions facilitating the precise localization of genetic alterations. SeeGH is easily installed and runs on Microsoft Windows 2000 or later environments.
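
The per-clone summary step described for SeeGH (an average ratio and standard deviation across replicate spots) can be sketched in a few lines. This is an illustrative reconstruction, not SeeGH's code: the function name and the (clone name, ratio) input layout are assumptions, and the abstract does not say whether SeeGH uses the sample or population standard deviation or works on log ratios. The sample standard deviation is used here.

```python
import statistics
from collections import defaultdict


def summarize_replicates(rows):
    """Group (clone_name, ratio) pairs and return {clone: (mean, stdev)}.

    Clones with a single spot get a standard deviation of 0.0, because the
    sample standard deviation is undefined for one value.
    """
    by_clone = defaultdict(list)
    for clone, ratio in rows:
        by_clone[clone].append(float(ratio))
    summary = {}
    for clone, ratios in by_clone.items():
        sd = statistics.stdev(ratios) if len(ratios) > 1 else 0.0
        summary[clone] = (statistics.mean(ratios), sd)
    return summary
```

The rows could come from parsing the tab-delimited export mentioned above, for instance with the standard `csv` module and `delimiter="\t"`; the column layout of that file is not specified in the abstract.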

  20. Within-host evolution of bacterial pathogens

    PubMed Central

    Didelot, Xavier; Walker, A. Sarah; Peto, Tim E.; Crook, Derrick W.; Wilson, Daniel J.

    2016-01-01

    Whole genome sequencing has opened the way to investigating the dynamics and genomic evolution of bacterial pathogens during colonization and infection of humans. The application of this technology to the longitudinal study of adaptation in the infected host — in particular, the evolution of drug resistance and host adaptation in patients chronically infected with opportunistic pathogens — has revealed remarkable patterns of convergent evolution, pointing to an inherent repeatability of evolution. In this Review, we describe how these studies have advanced our understanding of the mechanisms and principles of within-host genome evolution, and we consider the consequences of findings such as a potent adaptive potential for pathogenicity. Finally, we discuss the possibility that genomics may be used in the future to predict the clinical progression of bacterial infections, and to suggest the best treatment option. PMID:26806595

  1. Within-host evolution of bacterial pathogens.

    PubMed

    Didelot, Xavier; Walker, A Sarah; Peto, Tim E; Crook, Derrick W; Wilson, Daniel J

    2016-03-01

    Whole-genome sequencing has opened the way for investigating the dynamics and genomic evolution of bacterial pathogens during the colonization and infection of humans. The application of this technology to the longitudinal study of adaptation in an infected host--in particular, the evolution of drug resistance and host adaptation in patients who are chronically infected with opportunistic pathogens--has revealed remarkable patterns of convergent evolution, suggestive of an inherent repeatability of evolution. In this Review, we describe how these studies have advanced our understanding of the mechanisms and principles of within-host genome evolution, and we consider the consequences of findings such as a potent adaptive potential for pathogenicity. Finally, we discuss the possibility that genomics may be used in the future to predict the clinical progression of bacterial infections and to suggest the best option for treatment.

  2. Uprobe: a genome-wide universal probe resource for comparative physical mapping in vertebrates.

    PubMed

    Kellner, Wendy A; Sullivan, Robert T; Carlson, Brian H; Thomas, James W

    2005-01-01

    Interspecies comparisons are important for deciphering the functional content and evolution of genomes. The expansive array of >70 public vertebrate genomic bacterial artificial chromosome (BAC) libraries can provide a means of comparative mapping, sequencing, and functional analysis of targeted chromosomal segments that is independent of and complementary to whole-genome sequencing. However, at the present time, no complementary resource exists for the efficient targeted physical mapping of the majority of these BAC libraries. Universal overgo-hybridization probes, designed from regions of sequenced genomes that are highly conserved between species, have been demonstrated to be an effective resource for the isolation of orthologous regions from multiple BAC libraries in parallel. Here we report the application of the universal probe design principle across entire genomes, and the subsequent creation of a complementary probe resource, Uprobe, for screening vertebrate BAC libraries. Uprobe currently consists of whole-genome sets of universal overgo-hybridization probes designed for screening mammalian or avian/reptilian libraries. Retrospective analysis, experimental validation of the probe design process on a panel of representative BAC libraries, and estimates of probe coverage across the genome indicate that the majority of all eutherian and avian/reptilian genes or regions of interest can be isolated using Uprobe. Future implementation of the universal probe design strategy will be used to create an expanded number of whole-genome probe sets that will encompass all vertebrate genomes.

  3. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    PubMed

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
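
    The pan- and core-genome idea mentioned in this abstract can be sketched in a few lines. This is an illustrative simplification, not the authors' pipeline: it assumes genes have already been grouped into ortholog families identified by name, and the strain and gene names below are hypothetical.

```python
def pan_and_core(gene_sets):
    """Return (pan_genome, core_genome) for a {strain: set_of_genes} mapping.

    The pan-genome is the union of all gene families; the core genome is
    the intersection, i.e. the families present in every strain.
    """
    genomes = list(gene_sets.values())
    if not genomes:
        return set(), set()
    pan = set().union(*genomes)
    core = set.intersection(*genomes)
    return pan, core


# Hypothetical strains with already-clustered gene families.
strains = {
    "strain_A": {"gyrB", "recA", "luxA"},
    "strain_B": {"gyrB", "recA", "toxR"},
    "strain_C": {"gyrB", "recA"},
}
pan, core = pan_and_core(strains)

# Share of the pan-genome that is conserved in all strains.
core_fraction = len(core) / len(pan)
```

    In a real analysis the hard part is the ortholog clustering that produces the gene-family names; the set arithmetic above only covers the final counting step.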

  4. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features

    PubMed Central

    Bakas, Spyridon; Akbari, Hamed; Sotiras, Aristeidis; Bilello, Michel; Rozycki, Martin; Kirby, Justin S.; Freymann, John B.; Farahani, Keyvan; Davatzikos, Christos

    2017-01-01

    Gliomas belong to a group of central nervous system tumors, and consist of various sub-regions. Gold standard labeling of these sub-regions in radiographic imaging is essential for both clinical and computational studies, including radiomic and radiogenomic analyses. Towards this end, we release segmentation labels and radiomic features for all pre-operative multimodal magnetic resonance imaging (MRI) (n=243) of the multi-institutional glioma collections of The Cancer Genome Atlas (TCGA), publicly available in The Cancer Imaging Archive (TCIA). Pre-operative scans were identified in both glioblastoma (TCGA-GBM, n=135) and low-grade-glioma (TCGA-LGG, n=108) collections via radiological assessment. The glioma sub-region labels were produced by an automated state-of-the-art method and manually revised by an expert board-certified neuroradiologist. An extensive panel of radiomic features was extracted based on the manually-revised labels. This set of labels and features should enable i) direct utilization of the TCGA/TCIA glioma collections towards repeatable, reproducible and comparative quantitative studies leading to new predictive, prognostic, and diagnostic assessments, as well as ii) performance evaluation of computer-aided segmentation methods, and comparison to our state-of-the-art method. PMID:28872634

  5. The infinite sites model of genome evolution.

    PubMed

    Ma, Jian; Ratan, Aakrosh; Raney, Brian J; Suh, Bernard B; Miller, Webb; Haussler, David

    2008-09-23

    We formalize the problem of recovering the evolutionary history of a set of genomes that are related to an unseen common ancestor genome by operations of speciation, deletion, insertion, duplication, and rearrangement of segments of bases. The problem is examined in the limit as the number of bases in each genome goes to infinity. In this limit, the chromosomes are represented by continuous circles or line segments. For such an infinite-sites model, we present a polynomial-time algorithm to find the most parsimonious evolutionary history of any set of related present-day genomes.

  6. Genomic segments RNA1 and RNA2 of Prunus necrotic ringspot virus codetermine viral pathogenicity to adapt to alternating natural Prunus hosts.

    PubMed

    Cui, Hongguang; Hong, Ni; Wang, Guoping; Wang, Aiming

    2013-05-01

    Prunus necrotic ringspot virus (PNRSV) affects Prunus fruit production worldwide. To date, numerous PNRSV isolates with diverse pathological properties have been documented. To study the pathogenicity of PNRSV, which directly or indirectly determines the economic losses of infected fruit trees, we have recently sequenced the complete genome of peach isolate Pch12 and cherry isolate Chr3, belonging to the pathogenically aggressive PV32 group and mild PV96 group, respectively. Here, we constructed the Chr3- and Pch12-derived full-length cDNA clones that were infectious in the experimental host cucumber and their respective natural Prunus hosts. Pch12-derived clones induced much more severe symptoms than Chr3 in cucumber, and the pathogenicity discrepancy between Chr3 and Pch12 was associated with virus accumulation. By reassortment of genomic segments, swapping of partial genomic segments, and site-directed mutagenesis, we identified the 3' terminal nucleotide sequence (1C region) in RNA1 and amino acid K at residue 279 in RNA2-encoded P2 as the severe virulence determinants in Pch12. Gain-of-function experiments demonstrated that both the 1C region and K279 of Pch12 were required for severe virulence and high levels of viral accumulation. Our results suggest that PNRSV RNA1 and RNA2 codetermine viral pathogenicity to adapt to alternating natural Prunus hosts, likely through mediating viral accumulation.

  7. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas

    We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

  8. dBBQs: dataBase of Bacterial Quality scores.

    PubMed

    Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

    2017-12-28

    It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete, but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four different measurements: assembly quality, number of rRNA genes, number of tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, the results of which can be used for further analysis. In addition, the search results are shown in interactive charts built with the DC.js JavaScript framework. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data of the four measurements that were combined to make the quality score for each genome can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.

  9. Pan-Genomic Analysis Permits Differentiation of Virulent and Non-virulent Strains of Xanthomonas arboricola That Cohabit Prunus spp. and Elucidate Bacterial Virulence Factors

    PubMed Central

    Garita-Cambronero, Jerson; Palacio-Bielsa, Ana; López, María M.; Cubero, Jaime

    2017-01-01

    Xanthomonas arboricola is a plant-associated bacterial species that causes diseases on several plant hosts. One of the most virulent pathovars within this species is X. arboricola pv. pruni (Xap), the causal agent of bacterial spot disease of stone fruit trees and almond. Recently, a non-virulent Xap-look-a-like strain isolated from Prunus was characterized and its genome compared to pathogenic strains of Xap, revealing differences in the profile of virulence factors, such as the genes related to the type III secretion system (T3SS) and type III effectors (T3Es). The existence of this atypical strain raises several questions associated with the abundance, the pathogenicity, and the evolutionary context of X. arboricola on Prunus hosts. After an initial characterization of a collection of Xanthomonas strains isolated from Prunus bacterial spot outbreaks in Spain during the past decade, six Xap-look-a-like strains that did not cluster with the pathogenic strains of Xap according to a multi locus sequence analysis were identified. Pathogenicity of these strains was analyzed and the genome sequences of two Xap-look-a-like strains, CITA 14 and CITA 124, non-virulent to Prunus spp., were obtained and compared to the available genomes of X. arboricola associated with this host plant. Differences were found among the genomes of the virulent and the Prunus non-virulent strains in several characters related to the pathogenesis process. Additionally, a pan-genomic analysis that included the available genomes of X. arboricola revealed that the atypical strains associated with Prunus were related to a group of non-virulent or low virulent strains isolated from a wide host range. The repertoire of the genes related to T3SS and T3Es varied among the strains of this cluster and those strains related to the most virulent pathovars of the species, corylina, juglandis, and pruni. This variability provides information about the potential evolutionary process associated to the

  10. Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes).

    PubMed

    Dessimoz, Christophe; Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro

    2011-09-01

    Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references.

  11. Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes)

    PubMed Central

    Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro

    2011-01-01

    Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references. PMID:21712341

  12. Ensembl Genomes 2013: scaling up access to genome-wide data.

    PubMed

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.

  13. Identification and Characterization of Domesticated Bacterial Transposases

    PubMed Central

    Gallie, Jenna; Rainey, Paul B.

    2017-01-01

    Selfish genetic elements, such as insertion sequences and transposons, are found in most genomes. Transposons are usually identifiable by their high copy number within genomes. In contrast, REP-associated tyrosine transposases (RAYTs), a recently described class of bacterial transposase, are typically present at just one copy per genome. This suggests that RAYTs no longer copy themselves and thus they no longer function as a typical transposase. Motivated by this possibility we interrogated thousands of fully sequenced bacterial genomes in order to determine patterns of RAYT diversity, their distribution across chromosomes and accessory elements, and rate of duplication. RAYTs encompass exceptional diversity and are divisible into at least five distinct groups. They possess features more similar to housekeeping genes than insertion sequences, are predominantly vertically transmitted and have persisted through evolutionary time to the point where they are now found in 24% of all species for which at least one fully sequenced genome is available. Overall, the genomic distribution of RAYTs suggests that they have been coopted by host genomes to perform a function that benefits the host cell. PMID:28910967

  14. Cytotoxic Chromosomal Targeting by CRISPR/Cas Systems Can Reshape Bacterial Genomes and Expel or Remodel Pathogenicity Islands

    PubMed Central

    Vercoe, Reuben B.; Chang, James T.; Dy, Ron L.; Taylor, Corinda; Gristwood, Tamzin; Clulow, James S.; Richter, Corinna; Przybilski, Rita; Pitman, Andrew R.; Fineran, Peter C.

    2013-01-01

    In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas–mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA–targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity. PMID:23637624

  15. Application of Whole Genome Expression Analysis to Assess Bacterial Responses to Environmental Conditions

    NASA Astrophysics Data System (ADS)

    Vukanti, R. V.; Mintz, E. M.; Leff, L. G.

    2005-05-01

    Bacterial responses to environmental signals are multifactorial and are coupled to changes in gene expression. An understanding of bacterial responses to environmental conditions is possible using microarray expression analysis. In this study, the utility of microarrays for examining changes in gene expression in Escherichia coli under different environmental conditions was assessed. RNA was isolated, hybridized to Affymetrix E. coli Genome 2.0 chips and analyzed using Affymetrix GCOS and Genespring software. Major limiting factors were obtaining enough quality RNA (10^7-10^8 cells to get 10 μg RNA) and accounting for differences in growth rates under different conditions. Stabilization of RNA prior to isolation and taking extreme precautions while handling RNA were crucial. In addition, use of this method in ecological studies is limited by availability and cost of commercial arrays, choice of primers for cDNA synthesis, reproducibility, complexity of results generated, and the need to validate findings. This method may be more widely applicable with the development of better approaches for RNA recovery from environmental samples and an increased number of available strain-specific arrays. Diligent experimental design and verification of results with real-time PCR or northern blots are needed. Overall, there is great potential for use of this technology to discover mechanisms underlying organisms' responses to environmental conditions.

  16. Determination of the Core of a Minimal Bacterial Gene Set†

    PubMed Central

    Gil, Rosario; Silva, Francisco J.; Peretó, Juli; Moya, Andrés

    2004-01-01

    The availability of a large number of complete genome sequences raises the question of how many genes are essential for cellular life. Trying to reconstruct the core of the protein-coding gene set for a hypothetical minimal bacterial cell, we have performed a computational comparative analysis of eight bacterial genomes. Six of the analyzed genomes are very small due to a dramatic genome size reduction process, while the other two, corresponding to free-living relatives, are larger. The available data from several systematic experimental approaches to define all the essential genes in some completely sequenced bacterial genomes were also considered, and a reconstruction of a minimal metabolic machinery necessary to sustain life was carried out. The proposed minimal genome contains 206 protein-coding genes with all the genetic information necessary for self-maintenance and reproduction in the presence of a full complement of essential nutrients and in the absence of environmental stress. The main features of such a minimal gene set, as well as the metabolic functions that must be present in the hypothetical minimal cell, are discussed. PMID:15353568

  17. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    PubMed Central

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms. PMID:28706512

  18. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

    DOE PAGES

    Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; ...

    2017-08-08

    Here, we present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

  19. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas

    Here, we present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

  20. Using DGGE and 16S rRNA gene sequence analysis to evaluate changes in oral bacterial composition.

    PubMed

    Chen, Zhou; Trivedi, Harsh M; Chhun, Nok; Barnes, Virginia M; Saxena, Deepak; Xu, Tao; Li, Yihong

    2011-01-01

    To investigate whether a standard dental prophylaxis followed by tooth brushing with an antibacterial dentifrice will affect the oral bacterial community, as determined by denaturing gradient gel electrophoresis (DGGE) combined with 16S rRNA gene sequence analysis. Twenty-four healthy adults were instructed to brush their teeth using commercial dentifrice for 1 week during a washout period. An initial set of pooled supragingival plaque samples was collected from each participant at baseline (0 h) before prophylaxis treatment. The subjects were given a clinical examination and dental prophylaxis and asked to brush for 1 min with a dentifrice containing 0.3% triclosan, 2.0% PVM/MA copolymer and 0.243% sodium fluoride (Colgate Total). On the following day, a second set of pooled supragingival plaque samples (24 h) was collected. Total bacterial genomic DNA was isolated from the samples. Differences in the microbial composition before and after the prophylactic procedure and tooth brushing were assessed by comparing the DGGE profiles and 16S rRNA gene segments sequence analysis. Two distinct clusters of DGGE profiles were found, suggesting that a shift in the microbial composition had occurred 24 h after the prophylaxis and brushing. A detailed sequencing analysis of 16S rRNA gene segments further identified 6 phyla and 29 genera, including known and unknown bacterial species. Importantly, an increase in bacterial diversity was observed after 24 h, including members of the Streptococcaceae family, Prevotella, Corynebacterium, TM7 and other commensal bacteria. The results suggest that the use of a standard prophylaxis followed by the use of the dentifrice containing 0.3% triclosan, 2.0% PVM/MA copolymer and 0.243% sodium fluoride may promote a healthier composition within the oral bacterial community.

  1. A segmentation/clustering model for the analysis of array CGH data.

    PubMed

    Picard, F; Robin, S; Lebarbier, E; Daudin, J-J

    2007-09-01

    Microarray-CGH (comparative genomic hybridization) experiments are used to detect and map chromosomal imbalances. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose representative sequences share the same relative copy number on average. Segmentation methods constitute a natural framework for the analysis, but they do not provide a biological status for the detected segments. We propose a new model for this segmentation/clustering problem, combining a segmentation model with a mixture model. We present a new hybrid algorithm called dynamic programming-expectation maximization (DP-EM) to estimate the parameters of the model by maximum likelihood. This algorithm combines DP and the EM algorithm. We also propose a model selection heuristic to select the number of clusters and the number of segments. An example of our procedure is presented, based on publicly available data sets. We compare our method to segmentation methods and to hidden Markov models, and we show that the new segmentation/clustering model is a promising alternative that can be applied in the more general context of signal processing.
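    The dynamic-programming half of such a segmentation can be illustrated with a minimal sketch. It assumes a fixed number of segments k and a least-squares cost within each segment; it is not the authors' DP-EM algorithm, which additionally fits a mixture model to assign segments to clusters. The function name segment_least_squares and the toy profile are hypothetical.

```python
from typing import List, Tuple


def segment_least_squares(y: List[float], k: int) -> Tuple[float, List[int]]:
    """Split y into k contiguous segments minimizing the within-segment
    sum of squared deviations from the segment mean.

    Returns (total cost, start index of each segment).
    """
    n = len(y)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(y)")

    # Prefix sums of values and squared values give O(1) segment costs.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i: int, j: int) -> float:
        """Squared error of y[i:j] around its mean (requires j > i)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[s][j]: minimal cost of splitting y[:j] into s segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                c = best[s - 1][i] + cost(i, j)
                if c < best[s][j]:
                    best[s][j] = c
                    back[s][j] = i

    # Walk the back-pointers from the end to recover segment starts.
    starts: List[int] = []
    j = n
    for s in range(k, 0, -1):
        j = back[s][j]
        starts.append(j)
    starts.reverse()
    return best[k][n], starts


# Toy profile with three flat levels (made-up data, not from the paper).
profile = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0, -1.0, -1.0]
total_cost, segment_starts = segment_least_squares(profile, 3)
print(total_cost, segment_starts)
```

    The sketch runs in O(k n^2) time. Choosing k itself is left out; the abstract addresses that with a model selection heuristic.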

  2. Identification of the Genome Segments of Bluetongue Virus Serotype 26 (Isolate KUW2010/02) that Restrict Replication in a Culicoides sonorensis Cell Line (KC Cells).

    PubMed

    Pullinger, Gillian D; Guimerà Busquets, Marc; Nomikou, Kyriaki; Boyce, Mark; Attoui, Houssam; Mertens, Peter P

    2016-01-01

    Bluetongue virus (BTV) can infect most ruminant species and is usually transmitted by adult, vector-competent biting midges (Culicoides spp.). Infection with BTV can cause severe clinical signs and can be fatal, particularly in naïve sheep and some deer species. Although 24 distinct BTV serotypes were recognized for several decades, additional 'types' have recently been identified, including BTV-25 (from Switzerland), BTV-26 (from Kuwait) and BTV-27 from France (Corsica). Although BTV-25 has failed to grow in either insect or mammalian cell cultures, BTV-26 (isolate KUW2010/02), which can be transmitted horizontally between goats in the absence of vector insects, does not replicate in a Culicoides sonorensis cell line (KC cells) but can be propagated in mammalian cells (BSR cells). The BTV genome consists of ten segments of linear dsRNA. Mono-reassortant viruses were generated by reverse-genetics, each one containing a single BTV-26 genome segment in a BTV-1 genetic-background. However, attempts to recover a mono-reassortant containing genome-segment 2 (Seg-2) of BTV-26 (encoding VP2), were unsuccessful but a triple-reassortant was successfully generated containing Seg-2, Seg-6 and Seg-7 (encoding VP5 and VP7 respectively) of BTV-26. Reassortants were recovered and most replicated well in mammalian cells (BSR cells). However, mono-reassortants containing Seg-1 or Seg-3 of BTV-26 (encoding VP1, or VP3 respectively) and the triple reassortant failed to replicate, while a mono-reassortant containing Seg-7 of BTV-26 only replicated slowly in KC cells.

  3. Complete sequence of the first chimera genome constructed by cloning the whole genome of Synechocystis strain PCC6803 into the Bacillus subtilis 168 genome.

    PubMed

    Watanabe, Satoru; Shiwa, Yuh; Itaya, Mitsuhiro; Yoshikawa, Hirofumi

    2012-12-01

    Genome synthesis of existing or designed genomes is made feasible by the first successful cloning of a cyanobacterium, Synechocystis PCC6803, in Gram-positive, endospore-forming Bacillus subtilis. Whole-genome sequence analysis of the isolate and parental B. subtilis strains provides clues for identifying single nucleotide polymorphisms (SNPs) in the 2 complete bacterial genomes in one cell.

  4. Initiation of a pan-genomic research project for Xylella fastidiosa

    USDA-ARS's Scientific Manuscript database

    Differences in genomic structure and nucleotide polymorphism among strains form the genetic basis for adaptability of a bacterial species. This can be described by a bacterial pan-genome, which is defined as the full complement of genes in all strains of a species. The pan-genome is composed of a "c...

  5. Bacterial membrane proteomics.

    PubMed

    Poetsch, Ansgar; Wolters, Dirk

    2008-10-01

    About one quarter to one third of all bacterial genes encode proteins of the inner or outer bacterial membrane. These proteins perform essential physiological functions, such as the import or export of metabolites, the homeostasis of metal ions, the extrusion of toxic substances or antibiotics, and the generation or conversion of energy. The last years have witnessed completion of a plethora of whole-genome sequences of bacteria important for biotechnology or medicine, which is the foundation for proteome and other functional genome analyses. In this review, we discuss the challenges in membrane proteome analysis, starting from sample preparation and leading to MS-data analysis and quantification. The current state of available proteomics technologies as well as their advantages and disadvantages will be described with a focus on shotgun proteomics. Then, we will briefly introduce the most abundant proteins and protein families present in bacterial membranes before bacterial membrane proteomics studies of the last years will be presented. It will be shown how these works enlarged our knowledge about the physiological adaptations that take place in bacteria during fine chemical production, bioremediation, protein overexpression, and during infections. Furthermore, several examples from literature demonstrate the suitability of membrane proteomics for the identification of antigens and different pathogenic strains, as well as the elucidation of membrane protein structure and function.

  6. Comprehensive analysis of DNA polymerase III α subunits and their homologs in bacterial genomes

    PubMed Central

    Timinskas, Kęstutis; Balvočiūtė, Monika; Timinskas, Albertas; Venclovas, Česlovas

    2014-01-01

    The analysis of ∼2000 bacterial genomes revealed that they all, without a single exception, encode one or more DNA polymerase III α-subunit (PolIIIα) homologs. Classified into C-family of DNA polymerases they come in two major forms, PolC and DnaE, related by ancient duplication. While PolC represents an evolutionary compact group, DnaE can be further subdivided into at least three groups (DnaE1-3). We performed an extensive analysis of various sequence, structure and surface properties of all four polymerase groups. Our analysis suggests a specific evolutionary pathway leading to PolC and DnaE from the last common ancestor and reveals important differences between extant polymerase groups. Among them, DnaE1 and PolC show the highest conservation of the analyzed properties. DnaE3 polymerases apparently represent an ‘impaired’ version of DnaE1. Nonessential DnaE2 polymerases, typical for oxygen-using bacteria with large GC-rich genomes, have a number of features in common with DnaE3 polymerases. The analysis of polymerase distribution in genomes revealed three major combinations: DnaE1 either alone or accompanied by one or more DnaE2s, PolC + DnaE3 and PolC + DnaE1. The first two combinations are present in Escherichia coli and Bacillus subtilis, respectively. The third one (PolC + DnaE1), found in Clostridia, represents a novel, so far experimentally uncharacterized, set. PMID:24106089

  7. Effects of sample treatments on genome recovery via single-cell genomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clingenpeel, Scott; Schwientek, Patrick; Hugenholtz, Philip

    2014-06-13

    It is known that single-cell genomics is a powerful tool for accessing genetic information from uncultivated microorganisms. Methods of handling samples before single-cell genomic amplification may affect the quality of the genomes obtained. Using three bacterial strains we demonstrate that, compared to cryopreservation, lower-quality single-cell genomes are recovered when the sample is preserved in ethanol or if the sample undergoes fluorescence in situ hybridization, while sample preservation in paraformaldehyde renders it completely unsuitable for sequencing.

  8. Novel approach for identification of influenza virus host range and zoonotic transmissible sequences by determination of host-related associative positions in viral genome segments.

    PubMed

    Kargarfard, Fatemeh; Sami, Ashkan; Mohammadi-Dehcheshmeh, Manijeh; Ebrahimie, Esmaeil

    2016-11-16

    Recent (2013 and 2009) zoonotic transmission of avian or porcine influenza to humans highlights an increase in host range by evading species barriers. Gene reassortment or antigenic shift between viruses from two or more hosts can generate a new life-threatening virus when the new shuffled virus is no longer recognized by antibodies existing within human populations. There is no large scale study to help understand the underlying mechanisms of host transmission. Furthermore, there is no clear understanding of how different segments of the influenza genome contribute in the final determination of host range. To obtain insight into the rules underpinning host range determination, various supervised machine learning algorithms were employed to mine reassortment changes in different viral segments in a range of hosts. Our multi-host dataset contained whole segments of 674 influenza strains organized into three host categories: avian, human, and swine. Some of the sequences were assigned to multiple hosts. In point of fact, the datasets are a form of multi-labeled dataset and we utilized a multi-label learning method to identify discriminative sequence sites. Then algorithms such as CBA, Ripper, and decision tree were applied to extract informative and descriptive association rules for each viral protein segment. We found informative rules in all segments that are common within the same host class but varied between different hosts. For example, for infection of an avian host, HA14V and NS1230S were the most important discriminative and combinatorial positions. Host range identification is facilitated by high support combined rules in this study. Our major goal was to detect discriminative genomic positions that were able to identify multi host viruses, because such viruses are likely to cause pandemic or disastrous epidemics.

  9. Sequences of multiple bacterial genomes and a Chlamydia trachomatis genotype from direct sequencing of DNA derived from a vaginal swab diagnostic specimen.

    PubMed

    Andersson, P; Klein, M; Lilliebridge, R A; Giffard, P M

    2013-09-01

    Ultra-deep Illumina sequencing was performed on whole genome amplified DNA derived from a Chlamydia trachomatis-positive vaginal swab. Alignment of reads with reference genomes allowed robust SNP identification from the C. trachomatis chromosome and plasmid. This revealed that the C. trachomatis in the specimen was very closely related to the sequenced urogenital, serovar F, clade T1 isolate F-SW4. In addition, high genome-wide coverage was obtained for Prevotella melaninogenica, Gardnerella vaginalis, Clostridiales genomosp. BVAB3 and Mycoplasma hominis. This illustrates the potential of metagenome data to provide high resolution bacterial typing data from multiple taxa in a diagnostic specimen. ©2013 The Authors Clinical Microbiology and Infection ©2013 European Society of Clinical Microbiology and Infectious Diseases.

  10. CAMBerVis: visualization software to support comparative analysis of multiple bacterial strains.

    PubMed

    Woźniak, Michał; Wong, Limsoon; Tiuryn, Jerzy

    2011-12-01

    A number of inconsistencies in genome annotations are documented among bacterial strains. Visualization of the differences may help biologists to make correct decisions in spurious cases. We have developed a visualization tool, CAMBerVis, to support comparative analysis of multiple bacterial strains. The software manages simultaneous visualization of multiple bacterial genomes, enabling visual analysis focused on genome structure annotations. The CAMBerVis software is freely available at the project website: http://bioputer.mimuw.edu.pl/camber. Input datasets for Mycobacterium tuberculosis and Staphylococcus aureus are integrated with the software as examples. Contact: m.wozniak@mimuw.edu.pl. Supplementary data are available at Bioinformatics online.

  11. Genomic comparisons of a bacterial lineage that inhabits both marine and terrestrial deep subsurface systems

    DOE PAGES

    Jungbluth, Sean P.; Glavina del Rio, Tijana; Tringe, Susannah G.; ...

    2017-04-06

    It is generally accepted that diverse, poorly characterized microorganisms reside deep within Earth’s crust. One such lineage of deep subsurface-dwelling bacteria is an uncultivated member of the Firmicutes phylum that can dominate molecular surveys from both marine and continental rock fracture fluids, sometimes forming the sole member of a single-species microbiome. Here, we reconstructed a genome from basalt-hosted fluids of the deep subseafloor along the eastern Juan de Fuca Ridge flank and used a phylogenomic analysis to show that, despite vast differences in geographic origin and habitat, it forms a monophyletic clade with the terrestrial deep subsurface genome of “Candidatus Desulforudis audaxviator” MP104C. While a limited number of differences were observed between the marine genome of “Candidatus Desulfopertinax cowenii” modA32 and its terrestrial relative that may be of potential adaptive importance, here it is revealed that the two are remarkably similar thermophiles possessing the genetic capacity for motility, sporulation, hydrogenotrophy, chemoorganotrophy, dissimilatory sulfate reduction, and the ability to fix inorganic carbon via the Wood-Ljungdahl pathway for chemoautotrophic growth. Finally, our results provide insights into the genetic repertoire within marine and terrestrial members of a bacterial lineage that is widespread in the global deep subsurface biosphere, and provides a natural means to investigate adaptations specific to these two environments.

  12. Genomic comparisons of a bacterial lineage that inhabits both marine and terrestrial deep subsurface systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jungbluth, Sean P.; Glavina del Rio, Tijana; Tringe, Susannah G.

    It is generally accepted that diverse, poorly characterized microorganisms reside deep within Earth’s crust. One such lineage of deep subsurface-dwelling bacteria is an uncultivated member of the Firmicutes phylum that can dominate molecular surveys from both marine and continental rock fracture fluids, sometimes forming the sole member of a single-species microbiome. Here, we reconstructed a genome from basalt-hosted fluids of the deep subseafloor along the eastern Juan de Fuca Ridge flank and used a phylogenomic analysis to show that, despite vast differences in geographic origin and habitat, it forms a monophyletic clade with the terrestrial deep subsurface genome of “Candidatus Desulforudis audaxviator” MP104C. While a limited number of differences were observed between the marine genome of “Candidatus Desulfopertinax cowenii” modA32 and its terrestrial relative that may be of potential adaptive importance, here it is revealed that the two are remarkably similar thermophiles possessing the genetic capacity for motility, sporulation, hydrogenotrophy, chemoorganotrophy, dissimilatory sulfate reduction, and the ability to fix inorganic carbon via the Wood-Ljungdahl pathway for chemoautotrophic growth. Finally, our results provide insights into the genetic repertoire within marine and terrestrial members of a bacterial lineage that is widespread in the global deep subsurface biosphere, and provides a natural means to investigate adaptations specific to these two environments.

  13. Genomic comparisons of a bacterial lineage that inhabits both marine and terrestrial deep subsurface systems

    PubMed Central

    Glavina del Rio, Tijana; Tringe, Susannah G.; Stepanauskas, Ramunas

    2017-01-01

    It is generally accepted that diverse, poorly characterized microorganisms reside deep within Earth’s crust. One such lineage of deep subsurface-dwelling bacteria is an uncultivated member of the Firmicutes phylum that can dominate molecular surveys from both marine and continental rock fracture fluids, sometimes forming the sole member of a single-species microbiome. Here, we reconstructed a genome from basalt-hosted fluids of the deep subseafloor along the eastern Juan de Fuca Ridge flank and used a phylogenomic analysis to show that, despite vast differences in geographic origin and habitat, it forms a monophyletic clade with the terrestrial deep subsurface genome of “Candidatus Desulforudis audaxviator” MP104C. While a limited number of differences were observed between the marine genome of “Candidatus Desulfopertinax cowenii” modA32 and its terrestrial relative that may be of potential adaptive importance, here it is revealed that the two are remarkably similar thermophiles possessing the genetic capacity for motility, sporulation, hydrogenotrophy, chemoorganotrophy, dissimilatory sulfate reduction, and the ability to fix inorganic carbon via the Wood-Ljungdahl pathway for chemoautotrophic growth. Our results provide insights into the genetic repertoire within marine and terrestrial members of a bacterial lineage that is widespread in the global deep subsurface biosphere, and provides a natural means to investigate adaptations specific to these two environments. PMID:28396823

  14. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    PubMed Central

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  15. Genome-Wide Association Study Identifies NBS-LRR-Encoding Genes Related with Anthracnose and Common Bacterial Blight in the Common Bean.

    PubMed

    Wu, Jing; Zhu, Jifeng; Wang, Lanfen; Wang, Shumin

    2017-01-01

    Nucleotide-binding site and leucine-rich repeat (NBS-LRR) genes represent the largest and most important disease resistance genes in plants. The genome sequence of the common bean (Phaseolus vulgaris L.) provides valuable data for determining the genomic organization of NBS-LRR genes. However, data on the NBS-LRR genes in the common bean are limited. In total, 178 NBS-LRR-type genes and 145 partial genes (with or without a NBS) located on 11 common bean chromosomes were identified from genome sequences database. Furthermore, 30 NBS-LRR genes were classified into Toll/interleukin-1 receptor (TIR)-NBS-LRR (TNL) types, and 148 NBS-LRR genes were classified into coiled-coil (CC)-NBS-LRR (CNL) types. Moreover, the phylogenetic tree supported the division of these PvNBS genes into two obvious groups, TNL types and CNL types. We also built expression profiles of NBS genes in response to anthracnose and common bacterial blight using qRT-PCR. Finally, we detected nine disease resistance loci for anthracnose (ANT) and seven for common bacterial blight (CBB) using the developed NBS-SSR markers. Among these loci, NSSR24, NSSR73, and NSSR265 may be located at new regions for ANT resistance, while NSSR65 and NSSR260 may be located at new regions for CBB resistance. Furthermore, we validated NSSR24, NSSR65, NSSR73, NSSR260, and NSSR265 using a new natural population. Our results provide useful information regarding the function of the NBS-LRR proteins and will accelerate the functional genomics and evolutionary studies of NBS-LRR genes in food legumes. NBS-SSR markers represent a wide-reaching resource for molecular breeding in the common bean and other food legumes. Collectively, our results should be of broad interest to bean scientists and breeders.

  16. Genome-Wide Association Study Identifies NBS-LRR-Encoding Genes Related with Anthracnose and Common Bacterial Blight in the Common Bean

    PubMed Central

    Wu, Jing; Zhu, Jifeng; Wang, Lanfen; Wang, Shumin

    2017-01-01

    Nucleotide-binding site and leucine-rich repeat (NBS-LRR) genes represent the largest and most important disease resistance genes in plants. The genome sequence of the common bean (Phaseolus vulgaris L.) provides valuable data for determining the genomic organization of NBS-LRR genes. However, data on the NBS-LRR genes in the common bean are limited. In total, 178 NBS-LRR-type genes and 145 partial genes (with or without a NBS) located on 11 common bean chromosomes were identified from genome sequences database. Furthermore, 30 NBS-LRR genes were classified into Toll/interleukin-1 receptor (TIR)-NBS-LRR (TNL) types, and 148 NBS-LRR genes were classified into coiled-coil (CC)-NBS-LRR (CNL) types. Moreover, the phylogenetic tree supported the division of these PvNBS genes into two obvious groups, TNL types and CNL types. We also built expression profiles of NBS genes in response to anthracnose and common bacterial blight using qRT-PCR. Finally, we detected nine disease resistance loci for anthracnose (ANT) and seven for common bacterial blight (CBB) using the developed NBS-SSR markers. Among these loci, NSSR24, NSSR73, and NSSR265 may be located at new regions for ANT resistance, while NSSR65 and NSSR260 may be located at new regions for CBB resistance. Furthermore, we validated NSSR24, NSSR65, NSSR73, NSSR260, and NSSR265 using a new natural population. Our results provide useful information regarding the function of the NBS-LRR proteins and will accelerate the functional genomics and evolutionary studies of NBS-LRR genes in food legumes. NBS-SSR markers represent a wide-reaching resource for molecular breeding in the common bean and other food legumes. Collectively, our results should be of broad interest to bean scientists and breeders. PMID:28848595

  17. Stored Canine Whole Blood Units: What is the Real Risk of Bacterial Contamination?

    PubMed

    Miglio, A; Stefanetti, V; Antognoni, M T; Cappelli, K; Capomaccio, S; Coletti, M; Passamonti, F

    2016-11-01

    Bacterial contamination of whole blood (WB) units can result in transfusion-transmitted infection, but the extent of the risk has not been established and may be underestimated in veterinary medicine. To detect, quantify, and identify bacterial microorganisms in 49 canine WB units during their shelf life. Forty-nine healthy adult dogs. Forty-nine WB units were included in the study. Immediately after collection, 8 sterile samples from the tube segment line of each unit were aseptically collected and tested for bacterial contamination on days 0, 1, 7, 14, 21, 28, 35, and 42 of storage. A qPCR assay was performed on days 0, 21, and 35 to identify and quantify any bacterial DNA. On bacterial culture, 47/49 blood units were negative at all time points tested, 1 unit was positive for Enterococcus spp. on days 0 and 1, and 1 was positive for Escherichia coli on day 35. On qPCR assay, 26 of 49 blood units were positive on at least 1 time point and the bacterial loads of the sequences detected (Propionibacterium spp., Corynebacterium spp., Caulobacter spp., Pseudomonas spp., Enterococcus spp., Serratia spp., and Leucobacter spp.) were <80 genome equivalents (GE)/μL. Most of the organisms detected were common bacteria, not usually implicated in septic transfusion reactions. The very low number of GE detected constitutes an acceptable risk of bacterial contamination, indicating that WB units have a good sanitary shelf life during commercial storage. Copyright © 2016 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.

  18. Structural analysis of a set of proteins resulting from a bacterial genomics project.

    PubMed

    Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

    2005-09-01

    The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.

  19. PathogenFinder--distinguishing friend from foe using bacterial whole genome sequence data.

    PubMed

    Cosentino, Salvatore; Voldby Larsen, Mette; Møller Aarestrup, Frank; Lund, Ole

    2013-01-01

    Although the majority of bacteria are harmless or even beneficial to their host, others are highly virulent and can cause serious diseases, and even death. Due to the constantly decreasing cost of high-throughput sequencing there are now many completely sequenced genomes available from both human pathogenic and innocuous strains. The data can be used to identify gene families that correlate with pathogenicity and to develop tools to predict the pathogenicity of newly sequenced strains, investigations that previously were mainly done by means of more expensive and time consuming experimental approaches. We describe PathogenFinder (http://cge.cbs.dtu.dk/services/PathogenFinder/), a web-server for the prediction of bacterial pathogenicity by analysing the input proteome, genome, or raw reads provided by the user. The method relies on groups of proteins, created without regard to their annotated function or known involvement in pathogenicity. The method has been built to work with all taxonomic groups of bacteria and using the entire training-set, achieved an accuracy of 88.6% on an independent test-set, by correctly classifying 398 out of 449 completely sequenced bacteria. The approach here proposed is not biased on sets of genes known to be associated with pathogenicity, thus the approach could aid the discovery of novel pathogenicity factors. Furthermore the pathogenicity prediction web-server could be used to isolate the potential pathogenic features of both known and unknown strains.

  20. Evolutionary genomics: transdomain gene transfers.

    PubMed

    Bordenstein, Seth R

    2007-11-06

    Biologists have until now conceded that bacterial gene transfer to multicellular animals is relatively uncommon in Nature. A new study showing promiscuous insertions of bacterial endosymbiont genes into invertebrate genomes ushers in a shift in this paradigm.

  1. Nuclear and cytoplasmic genome components of Solanum tuberosum + S. chacoense somatic hybrids and three SSR alleles related to bacterial wilt resistance.

    PubMed

    Chen, Lin; Guo, Xianpu; Xie, Conghua; He, Li; Cai, Xingkui; Tian, Lingli; Song, Botao; Liu, Jun

    2013-07-01

    The somatic hybrids were derived previously from protoplast fusion between Solanum tuberosum and S. chacoense to gain the bacterial wilt resistance from the wild species. The genome components analysis in the present research was to clarify the nuclear and cytoplasmic composition of the hybrids, to explore the molecular markers associated with the resistance, and provide information for better use of these hybrids in potato breeding. One hundred and eight nuclear SSR markers and five cytoplasmic specific primers polymorphic between the fusion parents were used to detect the genome components of 44 somatic hybrids. The bacterial wilt resistance was assessed thrice by inoculating the in vitro plants with a bacterial suspension of race 1. The disease index, relative disease index, and resistance level were assigned to each hybrid, which were further analyzed in relation to the molecular markers for elucidating the potential genetic base of the resistance. All of the 317 parental unique nuclear SSR alleles appeared in the somatic hybrids with some variations in the number of bands detected. Nearly 80 % of the hybrids randomly showed the chloroplast pattern of one parent, and most of the hybrids exhibited a fused mitochondrial DNA pattern. One hundred and nine specific SSR alleles of S. chacoense were analyzed for their relationship with the disease index of the hybrids, and three alleles were identified to be significantly associated with the resistance. Selection for the resistant SSR alleles of S. chacoense may increase the possibility of producing resistant pedigrees.

  2. MPD: a pathogen genome and metagenome database

    PubMed Central

    Zhang, Tingting; Miao, Jiaojiao; Han, Na; Qiang, Yujun; Zhang, Wen

    2018-01-01

    Abstract Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, which has been accompanied by a challenge for the storage and management of such huge datasets. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which provides access to users for searching, downloading, storing and sharing bacterial genomics data. The MPD represents the first pathogenic database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that can be used by different organizations, thereby facilitating data sharing among different organizations and research groups. A user-friendly local client tool is provided to maintain the steady transmission of big sequencing data. The MPD is a useful tool for analysis and management in genomic research, especially for clinical Centers for Disease Control and epidemiological studies, and is expected to contribute to advancing knowledge on pathogenic bacteria genomes and metagenomes. Database URL: http://data.mypathogen.org PMID:29917040

  3. Ensembl Genomes 2016: more genomes, more complexity

    PubMed Central

    Kersey, Paul Julian; Allen, James E.; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J.; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J.; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K.; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D.; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello–Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M.; Howe, Kevin L.; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M.

    2016-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. PMID:26578574

  4. Bacterial toxin-antitoxin systems: more than selfish entities?

    PubMed

    Van Melderen, Laurence; Saavedra De Bast, Manuel

    2009-03-01

    Bacterial toxin-antitoxin (TA) systems are diverse and widespread in the prokaryotic kingdom. They are composed of closely linked genes encoding a stable toxin that can harm the host cell and its cognate labile antitoxin, which protects the host from the toxin's deleterious effect. TA systems are thought to invade bacterial genomes through horizontal gene transfer. Some TA systems might behave as selfish elements and favour their own maintenance at the expense of their host. As a consequence, they may contribute to the maintenance of plasmids or genomic islands, such as super-integrons, by post-segregational killing of the cell that loses these genes and so suffers the stable toxin's destructive effect. The function of the chromosomally encoded TA systems is less clear and still open to debate. This Review discusses current hypotheses regarding the biological roles of these evolutionarily successful small operons. We consider the various selective forces that could drive the maintenance of TA systems in bacterial genomes.

  5. Bacterial Toxin–Antitoxin Systems: More Than Selfish Entities?

    PubMed Central

    Van Melderen, Laurence; Saavedra De Bast, Manuel

    2009-01-01

    Bacterial toxin–antitoxin (TA) systems are diverse and widespread in the prokaryotic kingdom. They are composed of closely linked genes encoding a stable toxin that can harm the host cell and its cognate labile antitoxin, which protects the host from the toxin's deleterious effect. TA systems are thought to invade bacterial genomes through horizontal gene transfer. Some TA systems might behave as selfish elements and favour their own maintenance at the expense of their host. As a consequence, they may contribute to the maintenance of plasmids or genomic islands, such as super-integrons, by post-segregational killing of the cell that loses these genes and so suffers the stable toxin's destructive effect. The function of the chromosomally encoded TA systems is less clear and still open to debate. This Review discusses current hypotheses regarding the biological roles of these evolutionarily successful small operons. We consider the various selective forces that could drive the maintenance of TA systems in bacterial genomes. PMID:19325885

  6. Limitations to estimating bacterial cross-species transmission using genetic and genomic markers: inferences from simulation modeling

    USGS Publications Warehouse

    Julio Andre, Benavides; Cross, Paul C.; Luikart, Gordon; Scott, Creel

    2014-01-01

    Cross-species transmission (CST) of bacterial pathogens has major implications for human health, livestock, and wildlife management because it determines whether control actions in one species may have subsequent effects on other potential host species. The study of bacterial transmission has benefitted from methods measuring two types of genetic variation: variable number of tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs). However, it is unclear whether these data can distinguish between different epidemiological scenarios. We used a simulation model with two host species and known transmission rates (within and between species) to evaluate the utility of these markers for inferring CST. We found that CST estimates are biased for a wide range of parameters when based on VNTRs and a most parsimonious reconstructed phylogeny. However, estimations of CST rates lower than 5% can be achieved with relatively low bias using as few as 250 SNPs. CST estimates are sensitive to several parameters, including the number of mutations accumulated since introduction, stochasticity, the genetic difference of strains introduced, and the sampling effort. Our results suggest that, even with whole-genome sequences, unbiased estimates of CST will be difficult when sampling is limited, mutation rates are low, or for pathogens that were recently introduced.

  7. Predicting effects of structural stress in a genome-reduced model bacterial metabolism

    NASA Astrophysics Data System (ADS)

    Güell, Oriol; Sagués, Francesc; Serrano, M. Ángeles

    2012-08-01

    Mycoplasma pneumoniae is a human pathogen recently proposed as a genome-reduced model for bacterial systems biology. Here, we study the response of its metabolic network to different forms of structural stress, including removal of individual and pairs of reactions and knockout of genes and clusters of co-expressed genes. Our results reveal a network architecture as robust as that of other model bacteria regarding multiple failures, although less robust against individual reaction inactivation. Interestingly, metabolite motifs associated to reactions can predict the propagation of inactivation cascades and damage amplification effects arising in double knockouts. We also detect a significant correlation between gene essentiality and damages produced by single gene knockouts, and find that genes controlling high-damage reactions tend to be expressed independently of each other, a functional switch mechanism that, simultaneously, acts as a genetic firewall to protect metabolism. Prediction of failure propagation is crucial for metabolic engineering or disease treatment.

  8. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    PubMed

    Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools.

  9. Task 1.5 Genomic Shift and Drift Trends of Emerging Pathogens

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Borucki, M

    2010-01-05

    The Lawrence Livermore National Laboratory (LLNL) Bioinformatics group has recently taken on a role in DTRA's Transformation Medical Technologies Initiative (TMTI). The high-level goal of TMTI is to accelerate the development of broad-spectrum countermeasures. To achieve those goals, TMTI has a near term need to conduct analyses of genomic shift and drift trends of emerging pathogens, with a focused eye on select agent pathogens, as well as antibiotic and virulence markers. Most emerging human pathogens are zoonotic viruses with a genome composed of RNA. The high mutation rate of the replication enzymes of RNA viruses contributes to sequence drift and provides one mechanism for these viruses to adapt to diverse hosts (interspecies transmission events) and cause new human and zoonotic diseases. Additionally, new viral pathogens frequently emerge due to genetic shift (recombination and segment reassortment) which allows for dramatic genotypic and phenotypic changes to occur rapidly. Bacterial pathogens also evolve via genetic drift and shift, although sequence drift generally occurs at a much slower rate for bacteria as compared to RNA viruses. However, genetic shift such as lateral gene transfer and inter- and intragenomic recombination enables bacteria to rapidly acquire new mechanisms of survival and antibiotic resistance. New technologies such as rapid whole genome sequencing of bacterial genomes, ultra-deep sequencing of RNA virus populations, metagenomic studies of environments rich in antibiotic resistance genes, and the use of microarrays for the detection and characterization of emerging pathogens provide mechanisms to address the challenges posed by the rapid emergence of pathogens. Bioinformatic algorithms that enable efficient analysis of the massive amounts of data generated by these technologies, as well as computational modeling of protein structures and evolutionary processes, need to be developed to allow the technology to fulfill its potential.

  10. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    PubMed

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.

  11. Complete Coding Genome Sequence for Mogiana Tick Virus, a Jingmenvirus Isolated from Ticks in Brazil

    DTIC Science & Technology

    2017-05-04

    and capable of infecting a wide range of animal hosts (1–5). Here, we report the complete coding genome sequence (i.e., only missing portions of...segmented nature of the genome was not understood. Therefore, only the two genome segments with detectable sequence homologies to flaviviruses were...originally reported (2). We revisited the data set of Maruyama et al. (2) and assembled the complete coding sequences for all four genome segments. We

  12. Patterns and architecture of genomic islands in marine bacteria

    PubMed Central

    2012-01-01

    Background Genomic Islands (GIs) have key roles since they modulate the structure and size of bacterial genomes displaying a diverse set of laterally transferred genes. Despite their importance, GIs in marine bacterial genomes have not been explored systematically to uncover possible trends and to analyze their putative ecological significance. Results We carried out a comprehensive analysis of GIs in 70 selected marine bacterial genomes detected with IslandViewer to explore the distribution, patterns and functional gene content in these genomic regions. We detected 438 GIs containing a total of 8152 genes. GI number per genome was strongly and positively correlated with the total GI size. In 50% of the genomes analyzed the GIs accounted for approximately 3% of the genome length, with a maximum of 12%. Interestingly, we found transposases particularly enriched within Alphaproteobacteria GIs, and site-specific recombinases in Gammaproteobacteria GIs. We described specific Homologous Recombination GIs (HR-GIs) in several genera of marine Bacteroidetes and in Shewanella strains among others. In these HR-GIs, we recurrently found conserved genes such as the β-subunit of DNA-directed RNA polymerase, regulatory sigma factors, the elongation factor Tu and ribosomal protein genes typically associated with the core genome. Conclusions Our results indicate that horizontal gene transfer mediated by phages, plasmids and other mobile genetic elements, and HR by site-specific recombinases play important roles in the mobility of clusters of genes between taxa and within closely related genomes, modulating the flexible pool of the genome. Our findings suggest that GIs may increase bacterial fitness under environmental changing conditions by acquiring novel foreign genes and/or modifying gene transcription and/or transduction. PMID:22839777

  13. Comparative Genomics and Transcriptional Analysis of Prophages Identified in the Genomes of Lactobacillus gasseri, Lactobacillus salivarius, and Lactobacillus casei†

    PubMed Central

    Ventura, Marco; Canchaya, Carlos; Bernini, Valentina; Altermann, Eric; Barrangou, Rodolphe; McGrath, Stephen; Claesson, Marcus J.; Li, Yin; Leahy, Sinead; Walker, Carey D.; Zink, Ralf; Neviani, Erasmo; Steele, Jim; Broadbent, Jeff; Klaenhammer, Todd R.; Fitzgerald, Gerald F.; O'Toole, Paul W.; van Sinderen, Douwe

    2006-01-01

    Lactobacillus gasseri ATCC 33323, Lactobacillus salivarius subsp. salivarius UCC 118, and Lactobacillus casei ATCC 334 contain one (LgaI), four (Sal1, Sal2, Sal3, Sal4), and one (Lca1) distinguishable prophage sequences, respectively. Sequence analysis revealed that LgaI, Lca1, Sal1, and Sal2 prophages belong to the group of Sfi11-like pac site and cos site Siphoviridae, respectively. Phylogenetic investigation of these newly described prophage sequences revealed that they have not followed an evolutionary development similar to that of their bacterial hosts and that they show a high degree of diversity, even within a species. The attachment sites were determined for all these prophage elements; LgaI as well as Sal1 integrates in tRNA genes, while prophage Sal2 integrates in a predicted arginino-succinate lyase-encoding gene. In contrast, Lca1 and the Sal3 and Sal4 prophage remnants are integrated in noncoding regions in the L. casei ATCC 334 and L. salivarius UCC 118 genomes. Northern analysis showed that large parts of the prophage genomes are transcriptionally silent and that transcription is limited to genome segments located near the attachment site. Finally, pulsed-field gel electrophoresis followed by Southern blot hybridization with specific prophage probes indicates that these prophage sequences are narrowly distributed within lactobacilli. PMID:16672450

  14. An in silico model for identification of small RNAs in whole bacterial genomes: characterization of antisense RNAs in pathogenic Escherichia coli and Streptococcus agalactiae strains.

    PubMed

    Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal

    2012-04-01

    Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed theoretically but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of the virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli.

  15. An in silico model for identification of small RNAs in whole bacterial genomes: characterization of antisense RNAs in pathogenic Escherichia coli and Streptococcus agalactiae strains

    PubMed Central

    Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal

    2012-01-01

    Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed theoretically but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of the virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli. PMID:22139924

  16. Phenetic Comparison of Prokaryotic Genomes Using k-mers

    PubMed Central

    Déraspe, Maxime; Raymond, Frédéric; Boisvert, Sébastien; Culley, Alexander; Roy, Paul H.; Laviolette, François; Corbeil, Jacques

    2017-01-01

    Abstract Bacterial genomics studies are getting more extensive and complex, requiring new ways to envision analyses. Using the Ray Surveyor software, we demonstrate that comparison of genomes based on their k-mer content allows reconstruction of phenetic trees without the need of prior data curation, such as core genome alignment of a species. We validated the methodology using simulated genomes and previously published phylogenomic studies of Streptococcus pneumoniae and Pseudomonas aeruginosa. We also investigated the relationship of specific genetic determinants with bacterial population structures. By comparing clusters from the complete genomic content of a genome population with clusters from specific functional categories of genes, we can determine how the population structures are correlated. Indeed, the strain clustering based on a subset of k-mers allows determination of its similarity with the whole genome clusters. We also applied this methodology on 42 species of bacteria to determine the correlational significance of five important bacterial genomic characteristics. For example, intrinsic resistance is more important in P. aeruginosa than in S. pneumoniae, and the former has increased correlation of its population structure with antibiotic resistance genes. The global view of the pangenome of bacteria also demonstrated the taxa-dependent interaction of population structure with antibiotic resistance, bacteriophage, plasmid, and mobile element k-mer data sets. PMID:28957508
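
    The idea of comparing genomes by their k-mer content can be illustrated with a minimal sketch. It assumes a Jaccard distance on k-mer sets as a stand-in similarity measure; this is not the Ray Surveyor implementation, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def kmer_set(sequence, k):
    """Return the set of all substrings of length k in the sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(genomes, k=4):
    """Map each unordered pair of genome names to its k-mer Jaccard distance.

    genomes: dict of name -> nucleotide sequence (toy input, an assumption).
    """
    sets = {name: kmer_set(seq, k) for name, seq in genomes.items()}
    return {
        (x, y): jaccard_distance(sets[x], sets[y])
        for x, y in combinations(sorted(sets), 2)
    }
```

    A pairwise distance table of this kind could then be passed to any standard clustering or tree-building method to obtain a phenetic tree; restricting the k-mer sets to a functional category (for example, resistance genes) gives a second table to compare against the whole-genome one.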

  17. Molecular Characterization of Bombyx mori Cytoplasmic Polyhedrosis Virus Genome Segment 4

    PubMed Central

    Ikeda, Keiko; Nagaoka, Sumiharu; Winkler, Stefan; Kotani, Kumiko; Yagi, Hiroaki; Nakanishi, Kae; Miyajima, Shigetoshi; Kobayashi, Jun; Mori, Hajime

    2001-01-01

    The complete nucleotide sequence of the genome segment 4 (S4) of Bombyx mori cytoplasmic polyhedrosis virus (BmCPV) was determined. The 3,259-nucleotide sequence contains a single long open reading frame which spans nucleotides 14 to 3187 and which is predicted to encode a protein with a molecular mass of about 130 kDa. Western blot analysis showed that S4 encodes BmCPV protein VP3, which is one of the outer components of the BmCPV virion. Sequence analysis of the deduced amino acid sequence of BmCPV VP3 revealed possible sequence homology with proteins from rice ragged stunt virus (RRSV) S2, Nilaparvata lugens reovirus S4, and Fiji disease fijivirus S4. This may suggest that plant reoviruses originated from insect viruses and that RRSV emerged more recently than other plant reoviruses. A chimeric protein consisting of BmCPV VP3 and green fluorescent protein (GFP) was constructed and expressed with BmCPV polyhedrin using a baculovirus expression vector. The VP3-GFP chimera was incorporated into BmCPV polyhedra and released under alkaline conditions. The results indicate that specific interactions occur between BmCPV polyhedrin and VP3 which might facilitate BmCPV virion occlusion into the polyhedra. PMID:11134312

  18. Speed congenics: accelerated genome recovery using genetic markers.

    PubMed

    Visscher, P M

    1999-08-01

    Genetic markers throughout the genome can be used to speed up 'recovery' of the recipient genome in the backcrossing phase of the construction of a congenic strain. The prediction of the genomic proportion during backcrossing depends on the assumptions regarding the distribution of chromosome segments, the population structure, the marker spacing and the selection strategy. In this study simulation was used to investigate the rate of recovery of the recipient genome for a mouse, Drosophila and Arabidopsis genome. It was shown that an incorrect assumption of a binomial distribution of chromosome segments, and failing to take account of a reduction in variance in genomic proportion due to selection, can lead to a downward bias of up to two generations in the estimation of the number of generations required for the formation of a congenic strain.
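
    As a baseline for the recovery rates discussed above, the textbook expectation for backcrossing without marker selection is that the donor share of the genome halves each generation. The sketch below computes that expectation only; it is not the simulation used in the study, it ignores linkage to the selected locus, and the function names are hypothetical.

```python
def expected_recipient_fraction(backcross_generation):
    """Expected recipient-genome fraction after n backcrosses, no marker selection.

    The F1 is 50% donor; each backcross to the recipient halves the donor share,
    so after n backcrosses the expected donor share is 0.5 ** (n + 1).
    """
    return 1.0 - 0.5 ** (backcross_generation + 1)


def generations_to_reach(target_fraction):
    """Smallest number of backcrosses whose expected recipient fraction meets the target."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    n = 0
    while expected_recipient_fraction(n) < target_fraction:
        n += 1
    return n
```

    Under these assumptions, reaching an expected 99% recipient genome takes six backcrosses; marker-assisted selection aims to reach the same level in fewer generations.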

  19. Site-Specific Integration of Foreign DNA into Minimal Bacterial and Human Target Sequences Mediated by a Conjugative Relaxase

    PubMed Central

    Agúndez, Leticia; González-Prieto, Coral; Machón, Cristina; Llosa, Matxalen

    2012-01-01

    Background Bacterial conjugation is a mechanism for horizontal DNA transfer between bacteria which requires cell to cell contact, usually mediated by self-transmissible plasmids. A protein known as relaxase is responsible for the processing of DNA during bacterial conjugation. TrwC, the relaxase of conjugative plasmid R388, is also able to catalyze site-specific integration of the transferred DNA into a copy of its target, the origin of transfer (oriT), present in a recipient plasmid. This reaction gives TrwC high biotechnological potential as a tool for genomic engineering. Methodology/Principal Findings We have characterized this reaction by conjugal mobilization of a suicide plasmid to a recipient cell with an oriT-containing plasmid, selecting for the cointegrates. Proteins TrwA and IHF enhanced integration frequency. TrwC could also catalyze integration when it is expressed from the recipient cell. Both Y18 and Y26 catalytic tyrosyl residues were essential to perform the reaction, while TrwC DNA helicase activity was dispensable. The target DNA could be reduced to 17 bp encompassing TrwC nicking and binding sites. Two human genomic sequences resembling the 17 bp segment were accepted as targets for TrwC-mediated site-specific integration. TrwC could also integrate the incoming DNA molecule into an oriT copy present in the recipient chromosome. Conclusions/Significance The results support a model for TrwC-mediated site-specific integration. This reaction may allow R388 to integrate into the genome of non-permissive hosts upon conjugative transfer. Also, the ability to act on target sequences present in the human genome underscores the biotechnological potential of conjugative relaxase TrwC as a site-specific integrase for genomic modification of human cells. PMID:22292089

  20. Nitrogen gas plasma treatment of bacterial spores induces oxidative stress that damages the genomic DNA.

    PubMed

    Sakudo, Akikazu; Toyokawa, Yoichi; Nakamura, Tetsuji; Yagyu, Yoshihito; Imanishi, Yuichiro

    2017-01-01

    Gas plasma, produced by a short high‑voltage pulse generated from a static induction thyristor power supply [1.5 kilo pulse/sec (kpps)], was demonstrated to inactivate Geobacillus stearothermophilus spores (decimal reduction time at 15 min, 2.48 min). Quantitative polymerase chain reaction and enzyme‑linked immunosorbent assays further indicated that nitrogen gas plasma treatment for 15 min decreased the level of intact genomic DNA and increased the level of 8-hydroxy-2'-deoxyguanosine, a major product of DNA oxidation. Three potential inactivation factors were generated during operation of the gas plasma instrument: Heat, longwave ultraviolet-A and oxidative stress (production of hydrogen peroxide, nitrite and nitrate). Treatment of the spores with hydrogen peroxide (3x2‑4%) effectively inactivated the bacteria, whereas heat treatment (100˚C), exposure to UV-A (75‑142 mJ/cm2) and 4.92 mM peroxynitrite (•ONOO‑), which is decomposed into nitrite and nitrate, did not. The results of the present study suggest the gas plasma treatment inactivates bacterial spores primarily by generating hydrogen peroxide, which contributes to the oxidation of the host genomic DNA.
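
    The decimal reduction time (D-value) quoted above can be turned into an expected log reduction with a short calculation. This is a minimal sketch that assumes log-linear inactivation kinetics, which the abstract does not state; the function names are hypothetical.

```python
def log_reduction(time_min, d_value_min):
    """Number of decimal (10-fold) reductions after time_min, assuming log-linear kinetics."""
    if d_value_min <= 0:
        raise ValueError("d_value_min must be positive")
    return time_min / d_value_min


def surviving_fraction(time_min, d_value_min):
    """Fraction of the initial population expected to survive after time_min."""
    return 10 ** (-log_reduction(time_min, d_value_min))


# Values from the abstract: D = 2.48 min, treatment time = 15 min.
reduction = log_reduction(15, 2.48)        # about 6 decimal reductions
fraction = surviving_fraction(15, 2.48)    # roughly one spore in a million
```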

  1. Random Amplification and Pyrosequencing for Identification of Novel Viral Genome Sequences

    PubMed Central

    Hang, Jun; Forshey, Brett M.; Kochel, Tadeusz J.; Li, Tao; Solórzano, Víctor Fiestas; Halsey, Eric S.; Kuschner, Robert A.

    2012-01-01

    ssRNA viruses have high levels of genomic divergence, which can lead to difficulty in genomic characterization of new viruses using traditional PCR amplification and sequencing methods. In this study, random reverse transcription, anchored random PCR amplification, and high-throughput pyrosequencing were used to identify orthobunyavirus sequences from total RNA extracted from viral cultures of acute febrile illness specimens. Draft genome sequence for the orthobunyavirus L segment was assembled and sequentially extended using de novo assembly contigs from pyrosequencing reads and orthobunyavirus sequences in GenBank as guidance. Accuracy and continuous coverage were achieved by mapping all reads to the L segment draft sequence. Subsequently, RT-PCR and Sanger sequencing were used to complete the genome sequence. The complete L segment was found to be 6936 bases in length, encoding a 2248-aa putative RNA polymerase. The identified L segment was distinct from previously published South American orthobunyaviruses, sharing 63% and 54% identity at the nucleotide and amino acid level, respectively, with the complete Oropouche virus L segment and 73% and 81% identity at the nucleotide and amino acid level, respectively, with a partial Caraparu virus L segment. The result demonstrated the effectiveness of a sequence-independent amplification and next-generation sequencing approach for obtaining complete viral genomes from total nucleic acid extracts and its use in pathogen discovery. PMID:22468136

  2. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

    USDA-ARS's Scientific Manuscript database

    We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the minimum information about any (x) sequence (MIxS). The standards are the minimum information about a single amplified genome (MISAG) and the ...

  3. Ensembl Genomes 2016: more genomes, more complexity.

    PubMed

    Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

    2016-01-04

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Comparing Mycobacterium tuberculosis genomes using genome topology networks.

    PubMed

    Jiang, Jianping; Gu, Jianlei; Zhang, Liang; Zhang, Chenyi; Deng, Xiao; Dou, Tonghai; Zhao, Guoping; Zhou, Yan

    2015-02-14

    Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene duplication and genome rearrangement, can lead to different phenotypes among strains, and an investigation of genes affected by SVs may extend our knowledge of the relationships between SVs and phenotypes in microbes, especially in pathogenic bacteria. In this work, we introduce a 'Genome Topology Network' (GTN) method based on gene homology and gene locations to analyze genomic SVs and perform phylogenetic analysis. Furthermore, the concept of 'unfixed ortholog' has been proposed, whose members are affected by SVs in genome topology among close species. To improve the precision of 'unfixed ortholog' recognition, a strategy to detect annotation differences and complete gene annotation was applied. To assess the GTN method, a set of thirteen complete M. tuberculosis genomes was analyzed as a case study. GTNs with two different gene homology-assigning methods were built, the Clusters of Orthologous Groups (COG) method and the orthoMCL clustering method, and two phylogenetic trees were constructed accordingly, which may provide additional insights into whole genome-based phylogenetic analysis. We obtained 24 unfixable COG groups, of which most members were related to immunogenicity and drug resistance, such as PPE-repeat proteins (COG5651) and transcriptional regulator TetR gene family members (COG1309). The GTN method has been implemented in PERL and released on our website. The tool can be downloaded from http://homepage.fudan.edu.cn/zhouyan/gtn/ , and allows re-annotating the 'lost' genes among closely related genomes, analyzing genes affected by SVs, and performing phylogenetic analysis. 
With this tool, many immunogenic-related and drug resistance-related genes

  5. Direct detection of methylation in genomic DNA

    PubMed Central

    Bart, A.; van Passel, M. W. J.; van Amsterdam, K.; van der Ende, A.

    2005-01-01

    The identification of methylated sites on bacterial genomic DNA would be a useful tool to study the major roles of DNA methylation in prokaryotes: distinction of self and nonself DNA, direction of post-replicative mismatch repair, control of DNA replication and cell cycle, and regulation of gene expression. Three types of methylated nucleobases are known: N6-methyladenine, 5-methylcytosine and N4-methylcytosine. The aim of this study was to develop a method to detect all three types of DNA methylation in complete genomic DNA. It was previously shown that N6-methyladenine and 5-methylcytosine in plasmid and viral DNA can be detected by intersequence trace comparison of methylated and unmethylated DNA. We extended this method to include N4-methylcytosine detection in both in vitro and in vivo methylated DNA. Furthermore, application of intersequence trace comparison was extended to bacterial genomic DNA. Finally, we present evidence that intrasequence comparison suffices to detect methylated sites in genomic DNA. In conclusion, we present a method to detect all three natural types of DNA methylation in bacterial genomic DNA. This provides the possibility to define the complete methylome of any prokaryote. PMID:16091626

  6. Accumulation of point mutations and reassortment of genomic RNA segments are involved in the microevolution of Puumala hantavirus in a bank vole (Myodes glareolus) population.

    PubMed

    Razzauti, Maria; Plyusnina, Angelina; Henttonen, Heikki; Plyusnin, Alexander

    2008-07-01

    The genetic diversity of Puumala hantavirus (PUUV) was studied in a local population of its natural host, the bank vole (Myodes glareolus). The trapping area (2.5 x 2.5 km) at Konnevesi, Central Finland, included 14 trapping sites, at least 500 m apart; altogether, 147 voles were captured during May and October 2005. Partial sequences of the S, M and L viral genome segments were recovered from 40 animals. Seven, 12 and 17 variants were detected for the S, M and L sequences, respectively; these represent new wild-type PUUV strains that belong to the Finnish genetic lineage. The genetic diversity of PUUV strains from Konnevesi was 0.2-4.9 % for the S segment, 0.2-4.8 % for the M segment and 0.2-9.7 % for the L segment. Most nucleotide substitutions were synonymous and most deduced amino acid substitutions were conservative, probably due to strong stabilizing selection operating at the protein level. Based on both sequence markers and phylogenetic clustering, the S, M and L sequences could be assigned to two groups, 'A' and 'B'. Notably, not all bank voles carried S, M and L sequences belonging to the same group, i.e. S(A)M(A)L(A) or S(B)M(B)L(B). A substantial proportion (8/40, 20 %) of the newly characterized PUUV strains possessed reassortant genomes such as S(B)M(A)L(A), S(A)M(B)L(B) or S(B)M(A)L(B). These results suggest that at least some of the PUUV reassortants are viable and can survive in the presence of their parental strains.
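
    The S(x)M(x)L(x) notation used above lends itself to a simple check: a genome is reassortant when its three segments do not all fall in the same group. The following is a minimal sketch of that bookkeeping; the sample list is invented toy data, not the study's 40 animals, and the function names are hypothetical.

```python
from collections import Counter

SEGMENTS = ("S", "M", "L")


def genotype_label(groups: dict) -> str:
    """Format per-segment group assignments as, e.g., 'S(B)M(A)L(A)'."""
    return "".join(f"{seg}({groups[seg]})" for seg in SEGMENTS)


def is_reassortant(groups: dict) -> bool:
    """True when the S, M and L segments are not all from the same group."""
    return len({groups[seg] for seg in SEGMENTS}) > 1


# Toy data for illustration only (not taken from the study).
samples = [
    {"S": "A", "M": "A", "L": "A"},
    {"S": "B", "M": "B", "L": "B"},
    {"S": "B", "M": "A", "L": "A"},
    {"S": "A", "M": "B", "L": "B"},
    {"S": "B", "M": "A", "L": "B"},
]

label_counts = Counter(genotype_label(s) for s in samples)
n_reassortant = sum(is_reassortant(s) for s in samples)

if __name__ == "__main__":
    for label, count in sorted(label_counts.items()):
        print(label, count)
    print(f"reassortants: {n_reassortant}/{len(samples)}")
```

    Applied to the figures in the abstract, the same proportion calculation gives 8/40 = 20 %.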

  7. Bacterial sex in dental plaque.

    PubMed

    Olsen, Ingar; Tribble, Gena D; Fiehn, Nils-Erik; Wang, Bing-Yan

    2013-01-01

    Genes are transferred between bacteria in dental plaque by transduction, conjugation, and transformation. Membrane vesicles can also provide a mechanism for horizontal gene transfer. DNA transfer is considered bacterial sex, but the transfer is not parallel to processes that we associate with sex in higher organisms. Several examples of bacterial gene transfer in the oral cavity are given in this review. How frequently this occurs in dental plaque is not clear, but evidence suggests that it affects a number of the major genera present. It has been estimated that new sequences in genomes established through horizontal gene transfer can constitute up to 30% of bacterial genomes. Gene transfer can be both inter- and intrageneric, and it can also affect transient organisms. The transferred DNA can be integrated or recombined in the recipient's chromosome or remain as an extrachromosomal inheritable element. This can make dental plaque a reservoir for antimicrobial resistance genes. The ability to transfer DNA is important for bacteria, making them better adapted to the harsh environment of the human mouth, and promoting their survival, virulence, and pathogenicity.

  8. Short segment search method for phylogenetic analysis using nested sliding windows

    NASA Astrophysics Data System (ADS)

    Iskandar, A. A.; Bustamam, A.; Trimarsanto, H.

    2017-10-01

    Phylogenetic analysis in bioinformatics is most accurate when it is based on the full coding DNA sequence (CDS) segment. However, analysis of the full CDS costs considerable time and money, so a short segment that represents the CDS, such as the envelope protein segment or the non-structural 3 (NS3) segment, is needed. When a sliding window is applied, a short segment better than the envelope protein and NS3 segments is found. This paper discusses a mathematical method that analyzes sequences using nested sliding windows to find a short segment that is representative of the whole genome. The results show that the method can find a short segment that is about 6.57% more representative of the CDS segment, in terms of tree topology, than the envelope segment or the NS3 segment.
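
    The abstract does not spell out how the windows are nested, so the following is only one possible reading: an outer window slides along the sequence, and a smaller inner window slides within each outer window, yielding candidate short segments. Window sizes, step sizes and function names are all assumptions made for illustration; scoring each candidate against the CDS-based tree is not shown.

```python
def sliding_windows(length, size, step):
    """Yield half-open (start, end) windows of `size` over [0, length)."""
    for start in range(0, length - size + 1, step):
        yield start, start + size


def nested_windows(length, outer_size, outer_step, inner_size, inner_step):
    """Yield (outer, inner) window pairs, with inner windows in absolute coordinates."""
    for o_start, o_end in sliding_windows(length, outer_size, outer_step):
        for i_start, i_end in sliding_windows(o_end - o_start, inner_size, inner_step):
            yield (o_start, o_end), (o_start + i_start, o_start + i_end)


if __name__ == "__main__":
    seq = "ATGGCGTACGTTAGC"  # toy 15-base sequence, illustration only
    for outer, inner in nested_windows(len(seq), 9, 3, 6, 3):
        print(outer, inner, seq[inner[0]:inner[1]])
```

    In a full pipeline, each candidate segment would presumably be used to build a tree whose topology is then compared with the tree built from the full CDS.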

  9. Genome-enabled selection doubles the accuracy of predicted breeding values for bacterial cold water disease resistance compared to traditional family-based selection in rainbow trout aquaculture

    USDA-ARS's Scientific Manuscript database

    We have shown previously that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative enabling exploitation...

  10. Assembling the bacterial segrosome.

    PubMed

    Hayes, Finbarr; Barillà, Daniela

    2006-05-01

    Genome segregation in prokaryotes is a highly ordered process that integrates with DNA replication, cytokinesis and other fundamental facets of the bacterial cell cycle. The segrosome is the nucleoprotein complex that mediates DNA segregation in bacteria; its assembly and organization are best understood for plasmid partition. The recently elucidated structures of the ParB plasmid segregation protein bound to centromeric DNA, and the tertiary structures of other segregation proteins, are key milestones in the path to deciphering the molecular basis of bacterial DNA segregation.

  11. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Slezak, T; Borucki, M; Lam, M

    Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.

  12. The quest for a unified view of bacterial land colonization

    PubMed Central

    Wu, Hao; Fang, Yongjun; Yu, Jun; Zhang, Zhang

    2014-01-01

    Exploring molecular mechanisms underlying bacterial water-to-land transition represents a critical start toward a better understanding of the functioning and stability of the terrestrial ecosystems. Here, we perform comprehensive analyses based on a large variety of bacteria by integrating taxonomic, phylogenetic and metagenomic data, in the quest for a unified view that elucidates genomic, evolutionary and ecological dynamics of the marine progenitors in adapting to nonaquatic environments. We hypothesize that bacterial land colonization is dominated by a single-gene sweep, that is, the emergence of dnaE2 derived from an early duplication event of the primordial dnaE, followed by a series of niche-specific genomic adaptations, including GC content increase, intensive horizontal gene transfer and constant genome expansion. In addition, early bacterial radiation may be stimulated by an explosion of land-borne hosts (for example, plants and animals) after initial land colonization events. PMID:24451209

  13. Comparative Genomic Analysis of Xanthomonas axonopodis pv. citrumelo F1, Which Causes Citrus Bacterial Spot Disease, and Related Strains Provides Insights into Virulence and Host Specificity

    PubMed Central

    Jalan, Neha; Aritua, Valente; Kumar, Dibyendu; Yu, Fahong; Jones, Jeffrey B.; Graham, James H.; Setubal, João C.; Wang, Nian

    2011-01-01

    Xanthomonas axonopodis pv. citrumelo is a citrus pathogen causing citrus bacterial spot disease that is geographically restricted within the state of Florida. Illumina, 454 sequencing, and optical mapping were used to obtain a complete genome sequence of X. axonopodis pv. citrumelo strain F1, 4.9 Mb in size. The strain lacks plasmids, in contrast to other citrus Xanthomonas pathogens. Phylogenetic analysis revealed that this pathogen is very close to the tomato bacterial spot pathogen X. campestris pv. vesicatoria 85-10, with a completely different host range. We also compared X. axonopodis pv. citrumelo to the genome of citrus canker pathogen X. axonopodis pv. citri 306. Comparative genomic analysis showed differences in several gene clusters, like those for type III effectors, the type IV secretion system, lipopolysaccharide synthesis, and others. In addition to pthA, effectors such as xopE3, xopAI, and hrpW were absent from X. axonopodis pv. citrumelo while present in X. axonopodis pv. citri. These effectors might be responsible for survival and the low virulence of this pathogen on citrus compared to that of X. axonopodis pv. citri. We also identified unique effectors in X. axonopodis pv. citrumelo that may be related to the different host range as compared to that of X. axonopodis pv. citri. X. axonopodis pv. citrumelo also lacks various genes, such as syrE1, syrE2, and RTX toxin family genes, which were present in X. axonopodis pv. citri. These may be associated with the distinct virulences of X. axonopodis pv. citrumelo and X. axonopodis pv. citri. Comparison of the complete genome sequence of X. axonopodis pv. citrumelo to those of X. axonopodis pv. citri and X. campestris pv. vesicatoria provides valuable insights into the mechanism of bacterial virulence and host specificity. PMID:21908674

  14. The origins and impact of primate segmental duplications.

    PubMed

    Marques-Bonet, Tomas; Girirajan, Santhosh; Eichler, Evan E

    2009-10-01

    Duplicated sequences are substrates for the emergence of new genes and are an important source of genetic instability associated with rare and common diseases. Analyses of primate genomes have shown an increase in the proportion of interspersed segmental duplications (SDs) within the genomes of humans and great apes. This contrasts with other mammalian genomes that seem to have their recently duplicated sequences organized in a tandem configuration. In this review, we focus on the mechanistic origin and impact of this difference with respect to evolution, genetic diversity and primate phenotype. Although many genomes will be sequenced in the future, resolution of this aspect of genomic architecture still requires high quality sequences and detailed analyses.

  15. Sequencing of a new target genome: the Pediculus humanus humanus (Phthiraptera: Pediculidae) genome project.

    PubMed

    Pittendrigh, B R; Clark, J M; Johnston, J S; Lee, S H; Romero-Severson, J; Dasch, G A

    2006-11-01

    The human body louse, Pediculus humanus humanus (L.), and the human head louse, Pediculus humanus capitis, belong to the hemimetabolous order Phthiraptera. The body louse is the primary vector that transmits the bacterial agents of louse-borne relapsing fever, trench fever, and epidemic typhus. The genomes of the bacterial causative agents of several of these aforementioned diseases have been sequenced. Thus, determining the body louse genome will enhance studies of host-vector-pathogen interactions. Although not important as a major disease vector, head lice are of major social concern. Resistance to traditional pesticides used to control head and body lice has developed. It is imperative that new molecular targets be discovered for the development of novel compounds to control these insects. No complete genome sequence exists for a hemimetabolous insect species primarily because hemimetabolous insects often have large (2000 Mb) to very large (up to 16,300 Mb) genomes. Fortuitously, we determined that the human body louse has one of the smallest genome sizes known in insects, suggesting it may be a suitable choice as a minimal hemimetabolous genome in which many genes have been eliminated during its adaptation to human parasitism. Because many louse species infest birds and mammals, the body louse genome-sequencing project will facilitate studies of their comparative genomics. A 6-8X coverage of the body louse genome, plus sequenced expressed sequence tags, should provide the entomological, evolutionary biology, medical, and public health communities with useful genetic information.

  16. Bacterial cell identification in differential interference contrast microscopy images.

    PubMed

    Obara, Boguslaw; Roberts, Mark A J; Armitage, Judith P; Grau, Vicente

    2013-04-23

    Microscopy image segmentation lays the foundation for shape analysis, motion tracking, and classification of biological objects. Despite its importance, automated segmentation remains challenging for several widely used non-fluorescence, interference-based microscopy imaging modalities. One example is differential interference contrast microscopy, which plays an important role in modern bacterial cell biology. Further advances in the field therefore require the development of tools, technologies and work-flows to extract and exploit information from interference-based imaging data so as to achieve new fundamental biological insights and understanding. We have developed and evaluated a high-throughput image analysis and processing approach to detect and characterize bacterial cells and chemotaxis proteins. Its performance was evaluated using differential interference contrast and fluorescence microscopy images of Rhodobacter sphaeroides. Results demonstrate that the proposed approach provides a fast and robust method for detection and analysis of the spatial relationship between bacterial cells and their chemotaxis proteins.

  17. Insights into the strategies used by related group II introns to adapt successfully for the colonisation of a bacterial genome

    PubMed Central

    Martínez-Rodríguez, Laura; García-Rodríguez, Fernando M; Molina-Sánchez, María Dolores; Toro, Nicolás; Martínez-Abarca, Francisco

    2014-01-01

    Group II introns are self-splicing RNAs and site-specific mobile retroelements found in bacterial and organellar genomes. The group II intron RmInt1 is present at high copy number in Sinorhizobium meliloti species, and has a multifunctional intron-encoded protein (IEP) with reverse transcriptase/maturase activities, but lacking the DNA-binding and endonuclease domains. We characterized two RmInt1-related group II introns RmInt2 from S. meliloti strain GR4 and Sr.md.I1 from S. medicae strain WSM419 in terms of splicing and mobility activities. We used both wild-type and engineered intron-donor constructs based on ribozyme ΔORF-coding sequence derivatives, and we determined the DNA target requirements for RmInt2, the element most distantly related to RmInt1. The excision and mobility patterns of intron-donor constructs expressing different combinations of IEP and intron RNA provided experimental evidence for the co-operation of IEPs and intron RNAs from related elements in intron splicing and, in some cases, in intron homing. We were also able to identify the DNA target regions recognized by these IEPs lacking the DNA endonuclease domain. Our results provide new insight into the versatility of related group II introns and the possible co-operation between these elements to facilitate the colonization of bacterial genomes. PMID:25482895

  18. Insights into the strategies used by related group II introns to adapt successfully for the colonisation of a bacterial genome.

    PubMed

    Martínez-Rodríguez, Laura; García-Rodríguez, Fernando M; Molina-Sánchez, María Dolores; Toro, Nicolás; Martínez-Abarca, Francisco

    2014-01-01

    Group II introns are self-splicing RNAs and site-specific mobile retroelements found in bacterial and organellar genomes. The group II intron RmInt1 is present at high copy number in Sinorhizobium meliloti species, and has a multifunctional intron-encoded protein (IEP) with reverse transcriptase/maturase activities, but lacking the DNA-binding and endonuclease domains. We characterized two RmInt1-related group II introns RmInt2 from S. meliloti strain GR4 and Sr.md.I1 from S. medicae strain WSM419 in terms of splicing and mobility activities. We used both wild-type and engineered intron-donor constructs based on ribozyme ΔORF-coding sequence derivatives, and we determined the DNA target requirements for RmInt2, the element most distantly related to RmInt1. The excision and mobility patterns of intron-donor constructs expressing different combinations of IEP and intron RNA provided experimental evidence for the co-operation of IEPs and intron RNAs from related elements in intron splicing and, in some cases, in intron homing. We were also able to identify the DNA target regions recognized by these IEPs lacking the DNA endonuclease domain. Our results provide new insight into the versatility of related group II introns and the possible co-operation between these elements to facilitate the colonization of bacterial genomes.

  19. CRISPR/Cas9 Editing of the Bacillus subtilis Genome

    PubMed Central

    Burby, Peter E.; Simmons, Lyle A.

    2017-01-01

    A fundamental procedure for most modern biologists is the genetic manipulation of the organism under study. Although many different methods for editing bacterial genomes have been used in laboratories for decades, the adaptation of CRISPR/Cas9 technology to bacterial genetics has allowed researchers to manipulate bacterial genomes with unparalleled facility. CRISPR/Cas9 has allowed for genome edits to be more precise, while also increasing the efficiency of transferring mutations into a variety of genetic backgrounds. As a result, the advantages are realized in tractable organisms and organisms that have been refractory to genetic manipulation. Here, we describe our method for editing the genome of the bacterium Bacillus subtilis. Our method is highly efficient, resulting in precise, markerless mutations. Further, after generating the editing plasmid, the mutation can be quickly introduced into several genetic backgrounds, greatly increasing the speed with which genetic analyses may be performed. PMID:28706963

  20. Programming biological operating systems: genome design, assembly and activation.

    PubMed

    Gibson, Daniel G

    2014-05-01

    The DNA technologies developed over the past 20 years for reading and writing the genetic code converged when the first synthetic cell was created 4 years ago. An outcome of this work has been an extraordinary set of tools for synthesizing, assembling, engineering and transplanting whole bacterial genomes. Technical progress, options and applications for bacterial genome design, assembly and activation are discussed.

  1. Genomic Species Are Ecological Species as Revealed by Comparative Genomics in Agrobacterium tumefaciens

    PubMed Central

    Lassalle, Florent; Campillo, Tony; Vial, Ludovic; Baude, Jessica; Costechareyre, Denis; Chapulliot, David; Shams, Malek; Abrouk, Danis; Lavire, Céline; Oger-Desfeux, Christine; Hommais, Florence; Guéguen, Laurent; Daubin, Vincent; Muller, Daniel; Nesme, Xavier

    2011-01-01

    The definition of bacterial species is based on genomic similarities, giving rise to the operational concept of genomic species, but the reasons for the occurrence of differentiated genomic species remain largely unknown. We used the Agrobacterium tumefaciens species complex and particularly the genomic species presently called genomovar G8, which includes the sequenced strain C58, to test the hypothesis that genomic species have specific ecological adaptations possibly involved in the speciation process. We analyzed the gene repertoire specific to G8 to identify potential adaptive genes. By hybridizing 25 strains of A. tumefaciens on DNA microarrays spanning the C58 genome, we highlighted the presence and absence of genes homologous to C58 in the taxon. We found 196 genes specific to genomovar G8 that were mostly clustered into seven genomic islands on the C58 genome (one on the circular chromosome and six on the linear chromosome), suggesting higher plasticity and a major adaptive role of the latter. Clusters encoded putative functional units, four of which had been verified experimentally. The combination of G8-specific functions defines a hypothetical species primary niche for G8 related to commensal interaction with a host plant. This supports the view that the G8 ancestor was able to exploit a new ecological niche, maybe initiating ecological isolation and thus speciation. Searching genomic data for synapomorphic traits is a powerful way to describe bacterial species. This procedure allowed us to find such phenotypic traits specific to genomovar G8 and thus propose a Latin binomial, Agrobacterium fabrum, for this bona fide genomic species. PMID:21795751

  2. Cronobacter, the emergent bacterial pathogen Enterobacter sakazakii comes of age; MLST and whole genome sequence analysis.

    PubMed

    Forsythe, Stephen J; Dickins, Benjamin; Jolley, Keith A

    2014-12-16

    Following the association of Cronobacter spp. with several publicized fatal outbreaks of meningitis and necrotising enterocolitis in neonatal intensive care units, the World Health Organization (WHO) in 2004 requested the establishment of a molecular typing scheme to enable the international control of the organism. This paper presents the application of Next Generation Sequencing (NGS) to Cronobacter, which has led to the establishment of the Cronobacter PubMLST genome and sequence definition database (http://pubmlst.org/cronobacter/) containing over 1000 isolates with metadata, along with the recognition of specific clonal lineages linked to neonatal meningitis and adult infections. Whole genome sequencing and multilocus sequence typing (MLST) support the formal recognition of the genus Cronobacter, composed of seven species, to replace the former single species Enterobacter sakazakii. Applying the 7-loci MLST scheme to 1007 strains revealed 298 definable sequence types, yet only C. sakazakii clonal complex 4 (CC4) was principally associated with neonatal meningitis. This clonal lineage has been confirmed using ribosomal-MLST (51-loci) and whole genome-MLST (1865 loci) to analyse 107 whole genomes via the Cronobacter PubMLST database. This database has enabled the retrospective analysis of historic cases and outbreaks following re-identification of those strains. The Cronobacter PubMLST database offers a central, open access, reliable sequence-based repository for researchers. It has the capacity to create new analysis schemes 'on the fly', and to integrate metadata (source, geographic distribution, clinical presentation). It is also expandable and adaptable to changes in taxonomy, and able to support the development of reliable detection methods of use to industry and regulatory authorities. Therefore it meets the WHO (2004) request for the establishment of a typing scheme for this emergent bacterial pathogen.
Whole genome sequencing has additionally shown a range
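
    In a multilocus scheme such as the 7-loci MLST described above, each isolate is reduced to a tuple of allele numbers, one per locus, and each distinct tuple defines a sequence type. The sketch below illustrates only that lookup step. The locus names, allele numbers and sequence type numbers are hypothetical placeholders, not entries from the Cronobacter PubMLST database, and no PubMLST interface is used.

```python
# Hypothetical placeholder names for the seven loci of a 7-loci scheme.
LOCI = tuple(f"locus{i}" for i in range(1, 8))

# Hypothetical profile table: allelic profile -> sequence type (ST) number.
KNOWN_PROFILES = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (5, 1, 3, 3, 5, 5, 4): 2,
}


def sequence_type(alleles: dict, known: dict = KNOWN_PROFILES):
    """Return the ST for an isolate's allele numbers, or None for a novel profile."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return known.get(profile)


if __name__ == "__main__":
    isolate = dict(zip(LOCI, (5, 1, 3, 3, 5, 5, 4)))
    novel = dict(zip(LOCI, (9, 1, 3, 3, 5, 5, 4)))
    print(sequence_type(isolate))  # matches the second toy profile
    print(sequence_type(novel))    # profile not in the toy table
```

    Schemes with more loci, such as the ribosomal-MLST and whole genome-MLST mentioned in the abstract, follow the same idea with longer allelic profiles.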

  3. Genomic selection models double the accuracy of predicted breeding values for bacterial cold water disease resistance compared to a traditional pedigree-based model in rainbow trout aquaculture

    USDA-ARS's Scientific Manuscript database

    Previously we have shown that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative enabling exploitation...

  4. Evolutionary force of AT-rich repeats to trap genomic and episomal DNAs into the rice genome: lessons from endogenous pararetrovirus.

    PubMed

    Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji

    2012-12-01

    In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.

  5. A conserved segmental duplication within ELA.

    PubMed

    Brinkmeyer-Langford, C L; Murphy, W J; Childers, C P; Skow, L C

    2010-12-01

    The assembled genomic sequence of the horse major histocompatibility complex (MHC) (equine lymphocyte antigen, ELA) is very similar to the homologous human HLA, with the notable exception of a large segmental duplication at the boundary of ELA class I and class III that is absent in HLA. The segmental duplication consists of a ∼ 710 kb region of at least 11 repeated blocks: 10 blocks each contain an MHC class I-like sequence and the helicase domain portion of a BAT1-like sequence, and the remaining unit contains the full-length BAT1 gene. Similar genomic features were found in other Perissodactyls, indicating an ancient origin, which is consistent with phylogenetic analyses. Reverse-transcriptase PCR (RT-PCR) of mRNA from peripheral white blood cells of healthy and chronically or acutely infected horses detected transcription from predicted open reading frames in several of the duplicated blocks. This duplication is not present in the sequenced MHCs of most other mammals, although a similar feature at the same relative position is present in the feline MHC (FLA). Striking sequence conservation throughout Perissodactyl evolution is consistent with a functional role for at least some of the genes included within this segmental duplication. © 2010 The Authors, Journal compilation © 2010 Stichting International Foundation for Animal Genetics.

  6. A web-based genomic sequence database for the Streptomycetaceae: a tool for systematics and genome mining

    USDA-ARS's Scientific Manuscript database

    The ARS Microbial Genome Sequence Database (http://199.133.98.43), a web-based database server, was established utilizing the BIGSdb (Bacterial Isolate Genomics Sequence Database) software package, developed at Oxford University, as a tool to manage multi-locus sequence data for the family Streptomy...

  7. An Inhibitory Motif on the 5’UTR of Several Rotavirus Genome Segments Affects Protein Expression and Reverse Genetics Strategies

    PubMed Central

    Papa, Guido; Eichwald, Catherine; Burrone, Oscar R.

    2016-01-01

    Rotavirus genome consists of eleven segments of dsRNA, each encoding one single protein. Viral mRNAs contain an open reading frame (ORF) flanked by relatively short untranslated regions (UTRs), whose role in the viral cycle remains elusive. Here we investigated the role of 5’UTRs in T7 polymerase-driven cDNAs expression in uninfected cells. The 5’UTRs of eight genome segments (gs3, gs5-6, gs7-11) of the simian SA11 strain showed a strong inhibitory effect on the expression of viral proteins. Decreased protein expression was due to both compromised transcription and translation and was independent of the ORF and the 3’UTR sequences. Analysis of several mutants of the 21-nucleotide long 5’UTR of gs 11 defined an inhibitory motif (IM) represented by its primary sequence rather than its secondary structure. IM was mapped to the 5’ terminal 6-nucleotide long pyrimidine-rich tract 5’-GGY(U/A)UY-3’. The 5’ terminal position within the mRNA was shown to be essentially required, as inhibitory activity was lost when IM was moved to an internal position. We identified two mutations (insertion of a G upstream the 5’UTR and the U to A mutation of the fifth nucleotide of IM) that render IM non-functional and increase the transcription and translation rate to levels that could considerably improve the efficiency of virus helper-free reverse genetics strategies. PMID:27846320

  8. Comparative scaffolding and gap filling of ancient bacterial genomes applied to two ancient Yersinia pestis genomes

    PubMed Central

    Doerr, Daniel; Chauve, Cedric

    2017-01-01

    Yersinia pestis is the causative agent of the bubonic plague, a disease responsible for several dramatic historical pandemics. Progress in ancient DNA (aDNA) sequencing rendered possible the sequencing of whole genomes of important human pathogens, including the ancient Y. pestis strains responsible for outbreaks of the bubonic plague in London in the 14th century and in Marseille in the 18th century, among others. However, aDNA sequencing data are still characterized by short reads and non-uniform coverage, so assembling ancient pathogen genomes remains challenging and often prevents a detailed study of genome rearrangements. It has recently been shown that comparative scaffolding approaches can improve the assembly of ancient Y. pestis genomes at a chromosome level. In the present work, we address the last step of genome assembly, the gap-filling stage. We describe an optimization-based method AGapEs (ancestral gap estimation) to fill in inter-contig gaps using a combination of a template obtained from related extant genomes and aDNA reads. We show how this approach can be used to refine comparative scaffolding by selecting contig adjacencies supported by a mix of unassembled aDNA reads and comparative signal. We applied our method to two Y. pestis data sets from the London and Marseilles outbreaks, for which we obtained highly improved genome assemblies for both genomes, comprised of, respectively, five and six scaffolds with 95 % of the assemblies supported by ancient reads. We analysed the genome evolution between both ancient genomes in terms of genome rearrangements, and observed a high level of synteny conservation between these strains. PMID:29114402

  9. Flexibility and symmetry of prokaryotic genome rearrangement reveal lineage-associated core-gene-defined genome organizational frameworks.

    PubMed

    Kang, Yu; Gu, Chaohao; Yuan, Lina; Wang, Yue; Zhu, Yanmin; Li, Xinna; Luo, Qibin; Xiao, Jingfa; Jiang, Daquan; Qian, Minping; Ahmed Khan, Aftab; Chen, Fei; Zhang, Zhang; Yu, Jun

    2014-11-25

    The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegment cGOFs were further classified into symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, as opposed to those of the Gram-negative bacteria, which are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOF provides efficient indexes for scaffold orientation as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analysis. Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. To know whether there is a set of genes not only conserved in position

  10. Genome-wide association studies reveal similar genetic architecture with shared and unique QTL for Bacterial Cold Water Disease resistance in two rainbow trout (Oncorhynchus mykiss) breeding populations

    USDA-ARS's Scientific Manuscript database

    Bacterial cold water disease (BCWD) causes significant mortality and economic losses in salmonid aquaculture. In previous studies, we identified moderate-large effect QTL for BCWD resistance in rainbow trout (Oncorhynchus mykiss). However, the recent availability of a 57K SNP array and a genome phys...

  11. Bacterial genomes lacking long-range correlations may not be modeled by low-order Markov chains: the role of mixing statistics and frame shift of neighboring genes.

    PubMed

    Cocho, Germinal; Miramontes, Pedro; Mansilla, Ricardo; Li, Wentian

    2014-12-01

    We examine the relationship between exponential correlation functions and Markov models in a bacterial genome in detail. Despite the well known fact that Markov models generate sequences with correlation function that decays exponentially, simply constructed Markov models based on nearest-neighbor dimer (first-order), trimer (second-order), up to hexamer (fifth-order), and treating the DNA sequence as being homogeneous all fail to predict the value of exponential decay rate. Even reading-frame-specific Markov models (both first- and fifth-order) could not explain the fact that the exponential decay is very slow. Starting with the in-phase coding-DNA-sequence (CDS), we investigated correlation within a fixed-codon-position subsequence, and in artificially constructed sequences by packing CDSs with out-of-phase spacers, as well as altering CDS length distribution by imposing an upper limit. From these targeted analyses, we conclude that the correlation in the bacterial genomic sequence is mainly due to a mixing of heterogeneous statistics at different codon positions, and the decay of correlation is due to the possible out-of-phase between neighboring CDSs. There are also small contributions to the correlation from bases at the same codon position, as well as by non-coding sequences. These show that the seemingly simple exponential correlation functions in bacterial genome hide a complexity in correlation structure which is not suitable for a modeling by Markov chain in a homogeneous sequence. Other results include: use of the (absolute value) second largest eigenvalue to represent the 16 correlation functions and the prediction of a 10-11 base periodicity from the hexamer frequencies. Copyright © 2014 Elsevier Ltd. All rights reserved.
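
    To make the "first-order (dimer)" Markov model mentioned in this abstract concrete, the following is a minimal sketch of how such a model could be estimated from a sequence by counting neighbouring base pairs. The function name `first_order_transitions` and the handling of non-ACGT characters are assumptions made for illustration; this is not the authors' code and it does not reproduce their correlation analysis.

```python
def first_order_transitions(seq, alphabet="ACGT"):
    """Estimate first-order Markov transition probabilities from dimer counts.

    Returns probs[x][y], the estimated probability that base y follows base x.
    Dimers containing characters outside `alphabet` are skipped (an assumption).
    """
    seq = seq.upper()
    counts = {a: {b: 0 for b in alphabet} for a in alphabet}
    # Count every overlapping dimer (x, y) in the sequence.
    for x, y in zip(seq, seq[1:]):
        if x in counts and y in counts[x]:
            counts[x][y] += 1
    probs = {}
    for a in alphabet:
        total = sum(counts[a].values())
        # Rows with no observations are left as all zeros.
        probs[a] = {b: (counts[a][b] / total if total else 0.0) for b in alphabet}
    return probs
```

    Higher-order models (trimer up to hexamer) would condition on longer preceding contexts in the same way; the abstract's point is that such homogeneous models fail to predict the observed decay rate of the correlation.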

  12. Identifying uniformly mutated segments within repeats.

    PubMed

    Sahinalp, S Cenk; Eichler, Evan; Goldberg, Paul; Berenbrink, Petra; Friedetzky, Tom; Ergun, Funda

    2004-12-01

    Given a long string of characters from a constant size alphabet we present an algorithm to determine whether its characters have been generated by a single i.i.d. random source. More specifically, consider all possible n-coin models for generating a binary string S, where each bit of S is generated via an independent toss of one of the n coins in the model. The choice of which coin to toss is decided by a random walk on the set of coins where the probability of a coin change is much lower than the probability of using the same coin repeatedly. We present a procedure to evaluate the likelihood of an n-coin model for given S, subject to a uniform prior distribution over the parameters of the model (that represent mutation rates and probabilities of copying events). In the absence of detailed prior knowledge of these parameters, the algorithm can be used to determine whether the a posteriori probability for n=1 is higher than for any other n>1. Our algorithm runs in time O(l^4 log l), where l is the length of S, through a dynamic programming approach which exploits the assumed convexity of the a posteriori probability for n. Our test can be used in the analysis of long alignments between pairs of genomic sequences in a number of ways. For example, functional regions in genome sequences exhibit much lower mutation rates than non-functional regions. Because our test provides means for determining variations in the mutation rate, it may be used to distinguish functional regions from non-functional ones. Another application is in determining whether two highly similar, thus evolutionarily related, genome segments are the result of a single copy event or of a complex series of copy events. This is particularly an issue in evolutionary studies of genome regions rich with repeat segments (especially tandemly repeated segments).
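
    As a rough illustration of the generative n-coin model described in this abstract, the sketch below samples a binary string in which each bit is a toss of the current coin and the coin changes only rarely. The function name `sample_n_coin_string`, the `switch_prob` parameter, and the choice of the next coin uniformly among the other coins are all assumptions made for illustration; the sketch does not implement the paper's likelihood evaluation or its dynamic programming test.

```python
import random

def sample_n_coin_string(biases, switch_prob, length, seed=None):
    """Sample a binary string from a hypothetical n-coin model.

    biases      -- biases[i] is the probability that coin i yields a 1
    switch_prob -- probability of changing coin before a toss (assumed small)
    length      -- number of bits to generate
    """
    rng = random.Random(seed)
    coin = rng.randrange(len(biases))
    bits = []
    for _ in range(length):
        # Rarely move to a different coin, chosen uniformly (an assumption).
        if len(biases) > 1 and rng.random() < switch_prob:
            others = [i for i in range(len(biases)) if i != coin]
            coin = rng.choice(others)
        # Toss the current coin independently of all earlier tosses.
        bits.append(1 if rng.random() < biases[coin] else 0)
    return bits
```

    With a single coin (n=1) the output is an i.i.d. sequence, which corresponds to the null case the abstract's test is designed to recognise; with several coins of different bias, the string contains stretches with different mutation rates.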

  13. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  14. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life.

    PubMed

    Parks, Donovan H; Rinke, Christian; Chuvochina, Maria; Chaumeil, Pierre-Alain; Woodcroft, Ben J; Evans, Paul N; Hugenholtz, Philip; Tyson, Gene W

    2017-11-01

    Challenges in cultivating microorganisms have limited the phylogenetic diversity of currently available microbial genomes. This is being addressed by advances in sequencing throughput and computational techniques that allow for the cultivation-independent recovery of genomes from metagenomes. Here, we report the reconstruction of 7,903 bacterial and archaeal genomes from >1,500 public metagenomes. All genomes are estimated to be ≥50% complete and nearly half are ≥90% complete with ≤5% contamination. These genomes increase the phylogenetic diversity of bacterial and archaeal genome trees by >30% and provide the first representatives of 17 bacterial and three archaeal candidate phyla. We also recovered 245 genomes from the Patescibacteria superphylum (also known as the Candidate Phyla Radiation) and find that the relative diversity of this group varies substantially with different protein marker sets. The scale and quality of this data set demonstrate that recovering genomes from metagenomes provides an expedient path forward to exploring microbial dark matter.

  15. DNA Data Visualization (DDV): Software for Generating Web-Based Interfaces Supporting Navigation and Analysis of DNA Sequence Data of Entire Genomes.

    PubMed

    Neugebauer, Tomasz; Bordeleau, Eric; Burrus, Vincent; Brzezinski, Ryszard

    2015-01-01

    Data visualization methods are necessary during the exploration and analysis activities of an increasingly data-intensive scientific process. There are few existing visualization methods for raw nucleotide sequences of a whole genome or chromosome. Software for data visualization should allow the researchers to create accessible data visualization interfaces that can be exported and shared with others on the web. Herein, novel software developed for generating DNA data visualization interfaces is described. The software converts DNA data sets into images that are further processed as multi-scale images to be accessed through a web-based interface that supports zooming, panning and sequence fragment selection. Nucleotide composition frequencies and GC skew of a selected sequence segment can be obtained through the interface. The software was used to generate DNA data visualization of human and bacterial chromosomes. Examples of visually detectable features such as short and long direct repeats, long terminal repeats, mobile genetic elements, heterochromatic segments in microbial and human chromosomes, are presented. The software and its source code are available for download and further development. The visualization interfaces generated with the software allow for the immediate identification and observation of several types of sequence patterns in genomes of various sizes and origins. The visualization interfaces generated with the software are readily accessible through a web browser. This software is a useful research and teaching tool for genetics and structural genomics.
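
    The two per-segment statistics named in this abstract, nucleotide composition frequencies and GC skew, can be sketched as follows. GC skew is computed here with the standard definition (G − C)/(G + C); the function names and the decision to ignore non-ACGT characters are assumptions for illustration and are not taken from the DDV source code.

```python
from collections import Counter

def nucleotide_frequencies(segment):
    """Return the relative frequency of A, C, G and T in a sequence segment."""
    seq = segment.upper()
    # Characters other than A, C, G, T (e.g. N) are ignored (an assumption).
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in "ACGT"}
    return {b: counts[b] / total for b in "ACGT"}

def gc_skew(segment):
    """Return (G - C) / (G + C) for a segment, or 0.0 if it has no G or C."""
    seq = segment.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)
```

    For example, `gc_skew("GGGC")` evaluates to 0.5, since the segment has three G and one C.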

  16. Comparative ruminant genomics highlights segmental duplication and mobile element insertion diversity

    USDA-ARS's Scientific Manuscript database

    We have expanded upon a previously reported comparative genomics approach using a read-depth (JaRMs) and a hybrid read-pair, split-read (RAPTR-SV) copy number variation (CNV) detection method that uses read alignments to the cattle reference genome in order to identify species-specific genomic rearr...

  17. Genomic survey of pathogenicity determinants and VNTR markers in the cassava bacterial pathogen Xanthomonas axonopodis pv. Manihotis strain CIO151.

    PubMed

    Arrieta-Ortiz, Mario L; Rodríguez-R, Luis M; Pérez-Quintero, Álvaro L; Poulin, Lucie; Díaz, Ana C; Arias Rojas, Nathalia; Trujillo, Cesar; Restrepo Benavides, Mariana; Bart, Rebecca; Boch, Jens; Boureau, Tristan; Darrasse, Armelle; David, Perrine; Dugé de Bernonville, Thomas; Fontanilla, Paula; Gagnevin, Lionel; Guérin, Fabien; Jacques, Marie-Agnès; Lauber, Emmanuelle; Lefeuvre, Pierre; Medina, Cesar; Medina, Edgar; Montenegro, Nathaly; Muñoz Bodnar, Alejandra; Noël, Laurent D; Ortiz Quiñones, Juan F; Osorio, Daniela; Pardo, Carolina; Patil, Prabhu B; Poussier, Stéphane; Pruvost, Olivier; Robène-Soustrade, Isabelle; Ryan, Robert P; Tabima, Javier; Urrego Morales, Oscar G; Vernière, Christian; Carrere, Sébastien; Verdier, Valérie; Szurek, Boris; Restrepo, Silvia; López, Camilo; Koebnik, Ralf; Bernal, Adriana

    2013-01-01

    Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. 
A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi-locus VNTR analysis

  18. Genomic Survey of Pathogenicity Determinants and VNTR Markers in the Cassava Bacterial Pathogen Xanthomonas axonopodis pv. Manihotis Strain CIO151

    PubMed Central

    Arrieta-Ortiz, Mario L.; Rodríguez-R, Luis M.; Pérez-Quintero, Álvaro L.; Poulin, Lucie; Díaz, Ana C.; Arias Rojas, Nathalia; Trujillo, Cesar; Restrepo Benavides, Mariana; Bart, Rebecca; Boch, Jens; Boureau, Tristan; Darrasse, Armelle; David, Perrine; Dugé de Bernonville, Thomas; Fontanilla, Paula; Gagnevin, Lionel; Guérin, Fabien; Jacques, Marie-Agnès; Lauber, Emmanuelle; Lefeuvre, Pierre; Medina, Cesar; Medina, Edgar; Montenegro, Nathaly; Muñoz Bodnar, Alejandra; Noël, Laurent D.; Ortiz Quiñones, Juan F.; Osorio, Daniela; Pardo, Carolina; Patil, Prabhu B.; Poussier, Stéphane; Pruvost, Olivier; Robène-Soustrade, Isabelle; Ryan, Robert P.; Tabima, Javier; Urrego Morales, Oscar G.; Vernière, Christian; Carrere, Sébastien; Verdier, Valérie; Szurek, Boris; Restrepo, Silvia; López, Camilo

    2013-01-01

    Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. 
A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi-locus VNTR analysis

  19. Evolutionary genomics: is Buchnera a bacterium or an organelle?

    PubMed

    Andersson, J O

    2000-11-30

    The first genome sequence of an intracellular bacterial symbiont of a eukaryotic cell has been determined. The Buchnera genome shares features with the genomes of both intracellular pathogenic bacteria and eukaryotic organelles, and it may represent an intermediate between the two.

  20. Bacterial RNA Biology on a Genome Scale.

    PubMed

    Hör, Jens; Gorski, Stanislaw A; Vogel, Jörg

    2018-06-07

    Bacteria are an exceedingly diverse group of organisms whose molecular exploration is experiencing a renaissance. While the classical view of bacterial gene expression was relatively simple, the emerging view is more complex, encompassing extensive post-transcriptional control involving riboswitches, RNA thermometers, and regulatory small RNAs (sRNAs) associated with the RNA-binding proteins CsrA, Hfq, and ProQ, as well as CRISPR/Cas systems that are programmed by RNAs. Moreover, increasing interest in members of the human microbiota and environmental microbial communities has highlighted the importance of understudied bacterial species with largely unknown transcriptome structures and RNA-based control mechanisms. Collectively, this creates a need for global RNA biology approaches that can rapidly and comprehensively analyze the RNA composition of a bacterium of interest. We review such approaches with a focus on RNA-seq as a versatile tool to investigate the different layers of gene expression in which RNA is made, processed, regulated, modified, translated, and turned over. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Coding Complete Genome for the Mogiana Tick Virus, a Jingmenvirus Isolated from Ticks in Brazil

    DTIC Science & Technology

    2017-05-04

    sequences for all four genome segments. We downloaded the raw Illumina sequence reads from the NCBI Short Read Archive (GenBank...MGTV genome segments through sequence similarity (BLASTN) to the published genome of Jingmen tick virus (JMTV) isolate SY84 (GenBank: KJ001579-KJ001582...2014. Standards for sequencing viral genomes in the era of high-throughput sequencing . MBio 5:e01360–14. 8. Bankevich A, Nurk S, Antipov

  2. Limitations to estimating bacterial cross-species transmission using genetic and genomic markers: inferences from simulation modeling

    PubMed Central

    Benavides, Julio A; Cross, Paul C; Luikart, Gordon; Creel, Scott

    2014-01-01

    Cross-species transmission (CST) of bacterial pathogens has major implications for human health, livestock, and wildlife management because it determines whether control actions in one species may have subsequent effects on other potential host species. The study of bacterial transmission has benefitted from methods measuring two types of genetic variation: variable number of tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs). However, it is unclear whether these data can distinguish between different epidemiological scenarios. We used a simulation model with two host species and known transmission rates (within and between species) to evaluate the utility of these markers for inferring CST. We found that CST estimates are biased for a wide range of parameters when based on VNTRs and a most parsimonious reconstructed phylogeny. However, estimations of CST rates lower than 5% can be achieved with relatively low bias using as few as 250 SNPs. CST estimates are sensitive to several parameters, including the number of mutations accumulated since introduction, stochasticity, the genetic difference of strains introduced, and the sampling effort. Our results suggest that, even with whole-genome sequences, unbiased estimates of CST will be difficult when sampling is limited, mutation rates are low, or for pathogens that were recently introduced. PMID:25469159

  3. Genome Sequencing of Steroid Producing Bacteria Using Ion Torrent Technology and a Reference Genome.

    PubMed

    Sola-Landa, Alberto; Rodríguez-García, Antonio; Barreiro, Carlos; Pérez-Redondo, Rosario

    2017-01-01

    Next-Generation Sequencing technology has greatly eased bacterial genome sequencing, and several tens of thousands of genomes have been sequenced during the last 10 years. Most genome projects are published as draft versions; however, for certain applications the complete genome sequence is required. In this chapter, we describe the strategy that allowed the complete genome sequencing of Mycobacterium neoaurum NRRL B-3805, an industrial strain exploited for steroid production, using Ion Torrent sequencing reads and the genome of a close strain as the reference. This protocol can be applied to analyze the genetic variations between closely related strains; for example, to elucidate the point mutations between a parental strain and a random mutagenesis-derived mutant.

  4. The map-based genome sequence of Spirodela polyrhiza aligned with its chromosomes, a reference for karyotype evolution.

    PubMed

    Cao, Hieu Xuan; Vu, Giang Thi Ha; Wang, Wenqin; Appenroth, Klaus J; Messing, Joachim; Schubert, Ingo

    2016-01-01

    Duckweeds are aquatic monocotyledonous plants of potential economic interest with fast vegetative propagation, comprising 37 species with variable genome sizes (0.158-1.88 Gbp). The genomic sequence of Spirodela polyrhiza, the smallest and the most ancient duckweed genome, needs to be aligned to its chromosomes as a reference and prerequisite to study the genome and karyotype evolution of other duckweed species. We selected physically mapped bacterial artificial chromosomes (BACs) containing Spirodela DNA inserts with little or no repetitive elements as probes for multicolor fluorescence in situ hybridization (mcFISH), using an optimized BAC pooling strategy, to validate its physical map and correlate it with its chromosome complement. By consecutive mcFISH analyses, we assigned the originally assembled 32 pseudomolecules (supercontigs) of the genomic sequences to the 20 chromosomes of S. polyrhiza. A Spirodela cytogenetic map containing 96 BAC markers with an average distance of 0.89 Mbp was constructed. Using a cocktail of 41 BACs in three colors, all chromosome pairs could be individualized simultaneously. Seven ancestral blocks emerged from duplicated chromosome segments of 19 Spirodela chromosomes. The chromosomally integrated genome of S. polyrhiza and the established prerequisites for comparative chromosome painting enable future studies on the chromosome homoeology and karyotype evolution of duckweed species. © 2015 IPK Gatersleben. New Phytologist © 2015 New Phytologist Trust.

  5. Comparative analysis of the complete genome of KPC-2-producing Klebsiella pneumoniae Kp13 reveals remarkable genome plasticity and a wide repertoire of virulence and resistance mechanisms

    PubMed Central

    2014-01-01

    many hierarchical levels (from whole genomic segments to individual nucleotide bases) may play a role in the lifestyle of K. pneumoniae Kp13 and underlie the importance of whole-genome sequencing to study bacterial pathogens. The general chromosomal structure was somewhat conserved among the compared bacteria, and recombination events with consequent gain/loss of genomic segments appear to be driving the evolution of these strains. PMID:24450656

  6. Complete Genome Sequence and Immunoproteomic Analyses of the Bacterial Fish Pathogen Streptococcus parauberis

    PubMed Central

    Nho, Seong Won; Hikima, Jun-ichi; Cha, In Seok; Park, Seong Bin; Jang, Ho Bin; del Castillo, Carmelo S.; Kondo, Hidehiro; Hirono, Ikuo; Aoki, Takashi; Jung, Tae Sung

    2011-01-01

    Although Streptococcus parauberis is known as a bacterial pathogen associated with bovine udder mastitis, it has recently become one of the major causative agents of olive flounder (Paralichthys olivaceus) streptococcosis in northeast Asia, causing massive mortality resulting in severe economic losses. S. parauberis contains two serotypes, and it is likely that capsular polysaccharide antigens serve to differentiate the serotypes. In the present study, the complete genome sequence of S. parauberis (serotype I) was determined using the GS-FLX system to investigate its phylogeny, virulence factors, and antigenic proteins. S. parauberis possesses a single chromosome of 2,143,887 bp containing 1,868 predicted coding sequences (CDSs), with an average GC content of 35.6%. Whole-genome dot plot analysis and phylogenetic analysis of a 60-kDa chaperonin-encoding gene and the glyceraldehyde-3-phosphate dehydrogenase (GAPDH)-encoding gene showed that the strain was evolutionarily closely related to Streptococcus uberis. S. parauberis antigenic proteins were analyzed using an immunoproteomic technique. Twenty-one antigenic protein spots were identified in S. parauberis, by reaction with an antiserum obtained from S. parauberis-challenged olive flounder. This work provides the foundation needed to understand more clearly the relationship between pathogen and host and develops new approaches toward prophylactic and therapeutic strategies to deal with streptococcosis in fish. The work also provides a better understanding of the physiology and evolution of a significant representative of the Streptococcaceae. PMID:21531805

  7. Draft Genome Sequence of Two Strains of Xanthomonas arboricola Isolated from Prunus persica Which Are Dissimilar to Strains That Cause Bacterial Spot Disease on Prunus spp.

    PubMed Central

    Garita-Cambronero, Jerson; Palacio-Bielsa, Ana; López, María M.

    2016-01-01

    The draft genome sequences of two strains of Xanthomonas arboricola, isolated from asymptomatic peach trees in Spain, are reported here. These strains are avirulent and do not belong to the same phylogroup as X. arboricola pv. pruni, a causal agent of bacterial spot disease of stone fruits and almonds. PMID:27609931

  8. Genome-derived vaccines.

    PubMed

    De Groot, Anne S; Rappuoli, Rino

    2004-02-01

    Vaccine research entered a new era when the complete genome of a pathogenic bacterium was published in 1995. Since then, more than 97 bacterial pathogens have been sequenced and at least 110 additional projects are now in progress. Genome sequencing has also dramatically accelerated: high-throughput facilities can draft the sequence of an entire microbe (two to four megabases) in 1 to 2 days. Vaccine developers are using microarrays, immunoinformatics, proteomics and high-throughput immunology assays to reduce the truly unmanageable volume of information available in genome databases to a manageable size. Vaccines composed of novel antigens discovered from genome mining are already in clinical trials. Within 5 years we can expect to see a novel class of vaccines composed of genome-predicted, assembled and engineered T- and B-cell epitopes. This article addresses the convergence of three forces--microbial genome sequencing, computational immunology and new vaccine technologies--that are shifting genome mining for vaccines onto the forefront of immunology research.

  9. Exploiting Bacterial Whole-Genome Sequencing Data for Evaluation of Diagnostic Assays: Campylobacter Species Identification as a Case Study

    PubMed Central

    Jansen van Rensburg, Melissa J.; Swift, Craig; Cody, Alison J.; Jenkins, Claire

    2016-01-01

    The application of whole-genome sequencing (WGS) to problems in clinical microbiology has had a major impact on the field. Clinical laboratories are now using WGS for pathogen identification, antimicrobial susceptibility testing, and epidemiological typing. WGS data also represent a valuable resource for the development and evaluation of molecular diagnostic assays, which continue to play an important role in clinical microbiology. To demonstrate this application of WGS, this study used publicly available genomic data to evaluate a duplex real-time PCR (RT-PCR) assay that targets mapA and ceuE for the detection of Campylobacter jejuni and Campylobacter coli, leading global causes of bacterial gastroenteritis. In silico analyses of mapA and ceuE primer and probe sequences from 1,713 genetically diverse C. jejuni and C. coli genomes, supported by RT-PCR testing, indicated that the assay was robust, with 1,707 (99.7%) isolates correctly identified. The high specificity of the mapA-ceuE assay was the result of interspecies diversity and intraspecies conservation of the target genes in C. jejuni and C. coli. Rare instances of a lack of specificity among C. coli isolates were due to introgression in mapA or sequence diversity in ceuE. The results of this study illustrate how WGS can be exploited to evaluate molecular diagnostic assays by using publicly available data, online databases, and open-source software. PMID:27733632

  10. Precise, High-throughput Analysis of Bacterial Growth.

    PubMed

    Kurokawa, Masaomi; Ying, Bei-Wen

    2017-09-19

    Bacterial growth is a central concept in the development of modern microbial physiology, as well as in the investigation of cellular dynamics at the systems level. Recent studies have reported correlations between bacterial growth and genome-wide events, such as genome reduction and transcriptome reorganization. Correctly analyzing bacterial growth is crucial for understanding the growth-dependent coordination of gene functions and cellular components. Accordingly, the precise quantitative evaluation of bacterial growth in a high-throughput manner is required. Emerging technological developments offer new experimental tools that allow updates of the methods used for studying bacterial growth. The protocol introduced here employs a microplate reader with a highly optimized experimental procedure for the reproducible and precise evaluation of bacterial growth. This protocol was used to evaluate the growth of several previously described Escherichia coli strains. The main steps of the protocol are as follows: the preparation of a large number of cell stocks in small vials for repeated tests with reproducible results, the use of 96-well plates for high-throughput growth evaluation, and the manual calculation of two major parameters (i.e., maximal growth rate and population density) representing the growth dynamics. In comparison to the traditional colony-forming unit (CFU) assay, which counts the cells that are cultured in glass tubes over time on agar plates, the present method is more efficient and provides more detailed temporal records of growth changes, but has a stricter detection limit at low population densities. In summary, the described method is advantageous for the precise and reproducible high-throughput analysis of bacterial growth, which can be used to draw conceptual conclusions or to make theoretical observations.

  11. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.

  12. Delineating slowly and rapidly evolving fractions of the Drosophila genome.

    PubMed

    Keith, Jonathan M; Adams, Peter; Stephen, Stuart; Mattick, John S

    2008-05-01

    Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted approximately 62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including approximately 56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification. Software, data, and results available at www.maths.qut.edu.au/-keithj/. Genomic segments comprising the conservation classes available in BED format.

  13. Short- and Long-term Evolutionary Dynamics of Bacterial Insertion Sequences: Insights from Wolbachia Endosymbionts

    PubMed Central

    Cerveau, Nicolas; Leclercq, Sébastien; Leroy, Elodie; Bouchon, Didier; Cordaux, Richard

    2011-01-01

    Transposable elements (TE) are one of the major driving forces of genome evolution, raising the question of the long-term dynamics underlying their evolutionary success. Long-term TE evolution can readily be reconstructed in eukaryotes, thanks to many degraded copies constituting genomic fossil records of past TE proliferations. By contrast, bacterial genomes usually experience high sequence turnover and short TE retention times, thereby obscuring ancient TE evolutionary patterns. We found that Wolbachia bacterial genomes contain 52–171 insertion sequence (IS) TEs. IS account for 11% of Wolbachia wRi, which is one of the highest IS genomic coverage reported in prokaryotes to date. We show that many IS groups are currently expanding in various Wolbachia genomes and that IS horizontal transfers are frequent among strains, which can explain the apparent synchronicity of these IS proliferations. Remarkably, >70% of Wolbachia IS are nonfunctional. They constitute an unusual bacterial IS genomic fossil record providing direct empirical evidence for a long-term IS evolutionary dynamics following successive periods of intense transpositional activity. Our results show that comprehensive IS annotations have the potential to provide new insights into prokaryote TE evolution and, more generally, prokaryote genome evolution. Indeed, the identification of an important IS genomic fossil record in Wolbachia demonstrates that IS elements are not always of recent origin, contrary to the conventional view of TE evolution in prokaryote genomes. Our results also raise the question whether the abundance of IS fossils is specific to Wolbachia or it may be a general, albeit overlooked, feature of prokaryote genomes. PMID:21940637

  14. Short- and long-term evolutionary dynamics of bacterial insertion sequences: insights from Wolbachia endosymbionts.

    PubMed

    Cerveau, Nicolas; Leclercq, Sébastien; Leroy, Elodie; Bouchon, Didier; Cordaux, Richard

    2011-01-01

    Transposable elements (TE) are one of the major driving forces of genome evolution, raising the question of the long-term dynamics underlying their evolutionary success. Long-term TE evolution can readily be reconstructed in eukaryotes, thanks to many degraded copies constituting genomic fossil records of past TE proliferations. By contrast, bacterial genomes usually experience high sequence turnover and short TE retention times, thereby obscuring ancient TE evolutionary patterns. We found that Wolbachia bacterial genomes contain 52-171 insertion sequence (IS) TEs. IS account for 11% of Wolbachia wRi, which is one of the highest IS genomic coverage reported in prokaryotes to date. We show that many IS groups are currently expanding in various Wolbachia genomes and that IS horizontal transfers are frequent among strains, which can explain the apparent synchronicity of these IS proliferations. Remarkably, >70% of Wolbachia IS are nonfunctional. They constitute an unusual bacterial IS genomic fossil record providing direct empirical evidence for a long-term IS evolutionary dynamics following successive periods of intense transpositional activity. Our results show that comprehensive IS annotations have the potential to provide new insights into prokaryote TE evolution and, more generally, prokaryote genome evolution. Indeed, the identification of an important IS genomic fossil record in Wolbachia demonstrates that IS elements are not always of recent origin, contrary to the conventional view of TE evolution in prokaryote genomes. Our results also raise the question whether the abundance of IS fossils is specific to Wolbachia or it may be a general, albeit overlooked, feature of prokaryote genomes.

  15. Segmental allotetraploidy and allelic interactions in buffelgrass (Pennisetum ciliare (L.) Link syn. Cenchrus ciliaris L.) as revealed by genome mapping.

    PubMed

    Jessup, R W; Burson, B L; Burow, O; Wang, Y W; Chang, C; Li, Z; Paterson, A H; Hussey, M A

    2003-04-01

    Linkage analyses increasingly complement cytological and traditional plant breeding techniques by providing valuable information regarding genome organization and transmission genetics of complex polyploid species. This study reports a genome map of buffelgrass (Pennisetum ciliare (L.) Link syn. Cenchrus ciliaris L.). Maternal and paternal maps were constructed with restriction fragment length polymorphisms (RFLPs) segregating in 87 F1 progeny from an intraspecific cross between two heterozygous genotypes. A survey of 862 heterologous cDNAs and gDNAs from across the Poaceae, as well as 443 buffelgrass cDNAs, yielded 100 and 360 polymorphic probes, respectively. The maternal map included 322 RFLPs, 47 linkage groups, and 3464 cM, whereas the paternal map contained 245 RFLPs, 42 linkage groups, and 2757 cM. Approximately 70 to 80% of the buffelgrass genome was covered, and the average marker spacing was 10.8 and 11.3 cM on the respective maps. Preferential pairing was indicated between many linkage groups, which supports cytological reports that buffelgrass is a segmental allotetraploid. More preferential pairing (disomy) was found in the maternal than paternal parent across linkage groups (55 vs. 38%) and loci (48 vs. 15%). Comparison of interval lengths in 15 allelic bridges indicated significantly less meiotic recombination in paternal gametes. Allelic interactions were detected in four regions of the maternal map and were absent in the paternal map.

  16. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence.

    PubMed

    de la Fuente, José; Díez-Delgado, Iratxe; Contreras, Marinela; Vicente, Joaquín; Cabezas-Cruz, Alejandro; Tobes, Raquel; Manrique, Marina; López, Vladimir; Romero, Beatriz; Bezos, Javier; Dominguez, Lucas; Sevilla, Iker A; Garrido, Joseba M; Juste, Ramón; Madico, Guillermo; Jones-López, Edward; Gortazar, Christian

    2015-11-01

    Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. 
    Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant further study to

  17. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence

    PubMed Central

    de la Fuente, José; Díez-Delgado, Iratxe; Contreras, Marinela; Vicente, Joaquín; Cabezas-Cruz, Alejandro; Tobes, Raquel; Manrique, Marina; López, Vladimir; Romero, Beatriz; Bezos, Javier; Dominguez, Lucas; Sevilla, Iker A.; Garrido, Joseba M.; Juste, Ramón; Madico, Guillermo; Jones-López, Edward; Gortazar, Christian

    2015-01-01

    Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. 
    Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant further study to

  18. Family genome browser: visualizing genomes with pedigree information.

    PubMed

    Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong

    2015-07-15

    Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to advances in high-throughput sequencing technologies, family genome sequencing is becoming more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes effectively at both the individual level and the variant level by integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibriums, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  19. [Comparative analysis of variable regions in the genomes of variola virus].

    PubMed

    Babkin, I V; Nepomniashchikh, T S; Maksiutov, R A; Gutorov, V V; Babkina, I N; Shchelkunov, S N

    2008-01-01

    Nucleotide sequences of two extended segments of the terminal variable regions in the variola virus genome were determined. The size of the left segment was 13.5 kbp and that of the right, 10.5 kbp. In total, over 540 kbp were sequenced for 22 variola virus strains. The phylogenetic analysis conducted and the data published earlier allowed us to find the interrelations between 70 variola virus isolates, the character of their clustering, and the degree of intergroup and intragroup variations of the clusters of variola virus strains. The most polymorphic loci of the genome segments studied were determined. It was demonstrated that these loci are localized either to noncoding genome regions or to the regions of destroyed open reading frames, characteristic of the ancestor virus. These loci are promising for development of a strategy for genotyping variola virus strains. Analysis of recombination using various methods demonstrated that, with a single exception, no statistically significant recombinational events were detectable in the genomes of the variola virus strains studied.

  20. Determining the culturability of the rumen bacterial microbiome

    PubMed Central

    Creevey, Christopher J; Kelly, William J; Henderson, Gemma; Leahy, Sinead C

    2014-01-01

    The goal of the Hungate1000 project is to generate a reference set of rumen microbial genome sequences. Toward this goal we have carried out a meta-analysis using information from culture collections, scientific literature, and the NCBI and RDP databases and linked this with a comparative study of several rumen 16S rRNA gene-based surveys. In this way we have attempted to capture a snapshot of rumen bacterial diversity to examine the culturable fraction of the rumen bacterial microbiome. Our analyses have revealed that for cultured rumen bacteria, there are many genera without a reference genome sequence. Our examination of culture-independent studies highlights that there are few novel but many uncultured taxa within the rumen bacterial microbiome. Taken together these results have allowed us to compile a list of cultured rumen isolates that are representative of abundant, novel and core bacterial species in the rumen. In addition, we have identified taxa, particularly within the phylum Bacteroidetes, where further cultivation efforts are clearly required. This information is being used to guide the isolation efforts and selection of bacteria from the rumen microbiota for sequencing through the Hungate1000. PMID:24986151

  1. Comparison of different methods for isolation of bacterial DNA from retail oyster tissues

    USDA-ARS's Scientific Manuscript database

    Oysters are filter-feeders that bio-accumulate bacteria in water while feeding. To evaluate the bacterial genomic DNA extracted from retail oyster tissues, including the gills and digestive glands, four isolation methods were used. Genomic DNA extraction was performed using the Allmag™ Blood Genomic...

  2. Genomic Encyclopedia of Type Strains, Phase I: The one thousand microbial genomes (KMG-I) project

    DOE PAGES

    Kyrpides, Nikos C.; Woyke, Tanja; Eisen, Jonathan A.; ...

    2014-06-15

    The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project with the objective of sequencing 250 bacterial and archaeal genomes. The two major goals of that project were (a) to test the hypothesis that there are many benefits to the use of the phylogenetic diversity of organisms in the tree of life as a primary criterion for generating their genome sequence and (b) to develop the necessary framework, technology and organization for large-scale sequencing of microbial isolate genomes. While the GEBA pilot project has not yet been entirely completed, both of the original goals have already been successfully accomplished, leading the way for the next phase of the project. Here we propose taking the GEBA project to the next level, by generating high quality draft genomes for 1,000 bacterial and archaeal strains. This represents a combined 16-fold increase in both scale and speed as compared to the GEBA pilot project (250 isolate genomes in 4+ years). We will follow a similar approach for organism selection and sequencing prioritization as was done for the GEBA pilot project (i.e. phylogenetic novelty, availability and growth of cultures of type strains and DNA extraction capability), focusing on type strains as this ensures reproducibility of our results and provides the strongest linkage between genome sequences and other knowledge about each strain. In turn, this project will constitute a pilot phase of a larger effort that will target the genome sequences of all available type strains of the Bacteria and Archaea.

  3. Genomic Encyclopedia of Type Strains, Phase I: The one thousand microbial genomes (KMG-I) project

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kyrpides, Nikos C.; Woyke, Tanja; Eisen, Jonathan A.

    The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project with the objective of sequencing 250 bacterial and archaeal genomes. The two major goals of that project were (a) to test the hypothesis that there are many benefits to using the phylogenetic diversity of organisms in the tree of life as a primary criterion for generating their genome sequence and (b) to develop the necessary framework, technology and organization for large-scale sequencing of microbial isolate genomes. While the GEBA pilot project has not yet been entirely completed, both of the original goals have already been successfully accomplished, leading the way for the next phase of the project. Here we propose taking the GEBA project to the next level, by generating high quality draft genomes for 1,000 bacterial and archaeal strains. This represents a combined 16-fold increase in both scale and speed as compared to the GEBA pilot project (250 isolate genomes in 4+ years). We will follow a similar approach for organism selection and sequencing prioritization as was done for the GEBA pilot project (i.e. phylogenetic novelty, availability and growth of cultures of type strains and DNA extraction capability), focusing on type strains as this ensures reproducibility of our results and provides the strongest linkage between genome sequences and other knowledge about each strain. In turn, this project will constitute a pilot phase of a larger effort that will target the genome sequences of all available type strains of the Bacteria and Archaea.

  4. Detection of Low-Copy-Number Genomic DNA Sequences in Individual Bacterial Cells by Using Peptide Nucleic Acid-Assisted Rolling-Circle Amplification and Fluorescence In Situ Hybridization

    PubMed Central

    Smolina, Irina; Lee, Charles; Frank-Kamenetskii, Maxim

    2007-01-01

    An approach is proposed for in situ detection of short signature DNA sequences present in single copies per bacterial genome. The site is locally opened by peptide nucleic acids, and a circular oligonucleotide is assembled. The amplicon generated by rolling circle amplification is detected by hybridization with fluorescently labeled decorator probes. PMID:17293504

  5. Modeling Intraocular Bacterial Infections

    PubMed Central

    Astley, Roger A.; Coburn, Phillip S.; Parkunan, Salai Madhumathi; Callegan, Michelle C.

    2016-01-01

    Bacterial endophthalmitis is an infection and inflammation of the posterior segment of the eye which can result in significant loss of visual acuity. Even with prompt antibiotic, anti-inflammatory and surgical intervention, vision and even the eye itself may be lost. For the past century, experimental animal models have been used to examine various aspects of the pathogenesis and pathophysiology of bacterial endophthalmitis, to further the development of anti-inflammatory treatment strategies, and to evaluate the pharmacokinetics and efficacies of antibiotics. Experimental models allow independent control of many parameters of infection and facilitate systematic examination of infection outcomes. While no single animal model perfectly reproduces the human pathology of bacterial endophthalmitis, investigators have successfully used these models to understand the infectious process and the host response, and have provided new information regarding therapeutic options for the treatment of bacterial endophthalmitis. This review highlights experimental animal models of endophthalmitis and correlates this information with the clinical setting. The goal is to identify knowledge gaps that may be addressed in future experimental and clinical studies focused on improvements in the therapeutic preservation of vision during and after this disease. PMID:27154427

  6. Idiosyncratic Genome Degradation in a Bacterial Endosymbiont of Periodical Cicadas.

    PubMed

    Campbell, Matthew A; Łukasik, Piotr; Simon, Chris; McCutcheon, John P

    2017-11-20

    When a free-living bacterium transitions to a host-beneficial endosymbiotic lifestyle, it almost invariably loses a large fraction of its genome [1, 2]. The resulting small genomes often become stable in size, structure, and coding capacity [3-5], as exemplified by Sulcia muelleri, a nutritional endosymbiont of cicadas. Sulcia's partner endosymbiont, Hodgkinia cicadicola, similarly remains co-linear in some cicadas diverged by millions of years [6, 7]. But in the long-lived periodical cicada Magicicada tredecim, the Hodgkinia genome has split into dozens of tiny, gene-sparse circles that sometimes reside in distinct Hodgkinia cells [8]. Previous data suggested that all other Magicicada species harbor complex Hodgkinia populations, but the timing, number of origins, and outcomes of the splitting process were unknown. Here, by sequencing Hodgkinia metagenomes from the remaining six Magicicada and two sister species, we show that each Magicicada species harbors Hodgkinia populations of at least 20 genomic circles. We find little synteny among the 256 Hodgkinia circles analyzed except between the most closely related cicada species. Gene phylogenies show multiple Hodgkinia lineages in the common ancestor of Magicicada and its closest known relatives but that most splitting has occurred within Magicicada and has given rise to highly variable Hodgkinia gene dosages among species. These data show that Hodgkinia genome degradation has proceeded down different paths in different Magicicada species and support a model of genomic degradation that is stochastic in outcome and nonadaptive for the host. These patterns mirror the genomic instability seen in some mitochondria. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Expression of lysozymes from Erwinia amylovora phages and Erwinia genomes and inhibition by a bacterial protein.

    PubMed

    Müller, Ina; Gernold, Marina; Schneider, Bernd; Geider, Klaus

    2012-01-01

    Genes coding for lysozyme-inhibiting proteins (Ivy) were cloned from the chromosomes of the plant pathogens Erwinia amylovora and Erwinia pyrifoliae. The product interfered not only with activity of hen egg white lysozyme, but also with an enzyme from E. amylovora phage ΦEa1h. We have expressed lysozyme genes from the genomes of three Erwinia species in Escherichia coli. The lysozymes expressed from genes of the E. amylovora phages ΦEa104 and ΦEa116, Erwinia chromosomes and Arabidopsis thaliana were not affected by Ivy. The enzyme from bacteriophage ΦEa1h was fused at the N- or C-terminus to other peptides. Compared to the intact lysozyme, a His-tag reduced its lytic activity about 10-fold and larger fusion proteins abolished activity completely. Specific protease cleavage restored lysozyme activity of a GST-fusion. The bacteriophage-encoded lysozymes were more active than the enzymes from bacterial chromosomes. Viral lyz genes were inserted into a broad-host range vector, and transfer to E. amylovora inhibited cell growth. Inserted in the yeast Pichia pastoris, the ΦEa1h-lysozyme was secreted and also inhibited by Ivy. Here we describe expression of unrelated cloned 'silent' lyz genes from Erwinia chromosomes and a novel interference of bacterial Ivy proteins with a viral lysozyme. Copyright © 2012 S. Karger AG, Basel.

  8. Genomic reassortment of influenza A virus in North American swine, 1998–2011

    PubMed Central

    Detmer, Susan E.; Wentworth, David E.; Tan, Yi; Schwartzbard, Aaron; Halpin, Rebecca A.; Stockwell, Timothy B.; Lin, Xudong; Vincent, Amy L.; Gramer, Marie R.; Holmes, Edward C.

    2012-01-01

    Revealing the frequency and determinants of reassortment among RNA genome segments is fundamental to understanding basic aspects of the biology and evolution of the influenza virus. To estimate the extent of genomic reassortment in influenza viruses circulating in North American swine, we performed a phylogenetic analysis of 139 whole-genome viral sequences sampled during 1998–2011 and representing seven antigenically distinct viral lineages. The highest amounts of reassortment were detected between the H3 and the internal gene segments (PB2, PB1, PA, NP, M and NS), while the lowest reassortment frequencies were observed among the H1γ, H1pdm and neuraminidase segments, particularly N1. Less reassortment was observed among specific haemagglutinin–neuraminidase combinations that were more prevalent in swine, suggesting that some genome constellations may be evolutionarily more stable. PMID:22993190

  9. Toward functional genomics in bacteria: Analysis of gene expression in Escherichia coli from a bacterial artificial chromosome library of Bacillus cereus

    PubMed Central

    Rondon, Michelle R.; Raffel, Sandra J.; Goodman, Robert M.; Handelsman, Jo

    1999-01-01

    As the study of microbes moves into the era of functional genomics, there is an increasing need for molecular tools for analysis of a wide diversity of microorganisms. Currently, biological study of many prokaryotes of agricultural, medical, and fundamental scientific interest is limited by the lack of adequate genetic tools. We report the application of the bacterial artificial chromosome (BAC) vector to prokaryotic biology as a powerful approach to address this need. We constructed a BAC library in Escherichia coli from genomic DNA of the Gram-positive bacterium Bacillus cereus. This library provides 5.75-fold coverage of the B. cereus genome, with an average insert size of 98 kb. To determine the extent of heterologous expression of B. cereus genes in the library, we screened it for expression of several B. cereus activities in the E. coli host. Clones expressing 6 of 10 activities tested were identified in the library, namely, ampicillin resistance, zwittermicin A resistance, esculin hydrolysis, hemolysis, orange pigment production, and lecithinase activity. We analyzed selected BAC clones genetically to identify rapidly specific B. cereus loci. These results suggest that BAC libraries will provide a powerful approach for studying gene expression from diverse prokaryotes. PMID:10339608
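
    The 5.75-fold coverage reported above can be turned into a rough estimate of how likely any given locus is to be present in the library. The following is a minimal sketch, assuming clones are sampled uniformly at random across the genome (a Poisson approximation that the abstract does not state); the function name is hypothetical.

```python
import math


def locus_capture_probability(coverage: float) -> float:
    """Chance that a given locus lies in at least one clone.

    Assumes clone start sites are uniform and independent, so the number of
    clones covering a locus is approximately Poisson with mean `coverage`.
    """
    return 1.0 - math.exp(-coverage)


# Coverage value taken from the abstract; the model itself is an assumption.
p = locus_capture_probability(5.75)
print(f"Probability a locus is represented: {p:.4f}")
```

    Under this assumption, a 5.75-fold library would be expected to contain almost any single-copy locus, although cloning bias in a real library could lower that figure.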

  10. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world

    PubMed Central

    Koonin, Eugene V.; Wolf, Yuri I.

    2008-01-01

    The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often, distant organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution. PMID:18948295
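
    The doubling times quoted above imply very different growth over the same period. As a minimal sketch, assuming ideal exponential growth at a constant doubling time (the function name and the 10-year window are illustrative choices, not taken from the abstract):

```python
def fold_increase(months: float, doubling_time_months: float) -> float:
    """Fold increase after `months` for a quantity with a fixed doubling time."""
    return 2.0 ** (months / doubling_time_months)


# Doubling times from the abstract: ~20 months (bacteria), ~34 months (archaea).
decade = 120  # months; an arbitrary illustrative window
bacteria = fold_increase(decade, 20)
archaea = fold_increase(decade, 34)
print(f"bacteria: {bacteria:.1f}-fold, archaea: {archaea:.1f}-fold")
```

    Over a decade, the 20-month doubling time corresponds to six doublings, while the 34-month doubling time gives between three and four.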

  11. Genome-wide analysis of bacterial determinants of plant growth promotion and induced systemic resistance by Pseudomonas fluorescens.

    PubMed

    Cheng, Xu; Etalo, Desalegn W; van de Mortel, Judith E; Dekkers, Ester; Nguyen, Linh; Medema, Marnix H; Raaijmakers, Jos M

    2017-11-01

    Pseudomonas fluorescens strain SS101 (Pf.SS101) promotes growth of Arabidopsis thaliana, enhances greening and lateral root formation, and induces systemic resistance (ISR) against the bacterial pathogen Pseudomonas syringae pv. tomato (Pst). Here, targeted and untargeted approaches were adopted to identify bacterial determinants and underlying mechanisms involved in plant growth promotion and ISR by Pf.SS101. Based on targeted analyses, no evidence was found for volatiles, lipopeptides and siderophores in plant growth promotion by Pf.SS101. Untargeted, genome-wide analyses of 7488 random transposon mutants of Pf.SS101 led to the identification of 21 mutants defective in both plant growth promotion and ISR. Many of these mutants, however, were auxotrophic and impaired in root colonization. Genetic analysis of three mutants followed by site-directed mutagenesis, genetic complementation and plant bioassays revealed the involvement of the phosphogluconate dehydratase gene edd, the response regulator gene colR and the adenylsulfate reductase gene cysH in both plant growth promotion and ISR. Subsequent comparative plant transcriptomics analyses strongly suggest that modulation of sulfur assimilation, auxin biosynthesis and transport, steroid biosynthesis and carbohydrate metabolism in Arabidopsis are key mechanisms linked to growth promotion and ISR by Pf.SS101. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.

  12. Genomic anatomy of the Tyrp1 (brown) deletion complex

    PubMed Central

    Smyth, Ian M.; Wilming, Laurens; Lee, Angela W.; Taylor, Martin S.; Gautier, Phillipe; Barlow, Karen; Wallis, Justine; Martin, Sancha; Glithero, Rebecca; Phillimore, Ben; Pelan, Sarah; Andrew, Rob; Holt, Karen; Taylor, Ruth; McLaren, Stuart; Burton, John; Bailey, Jonathon; Sims, Sarah; Squares, Jan; Plumb, Bob; Joy, Ann; Gibson, Richard; Gilbert, James; Hart, Elizabeth; Laird, Gavin; Loveland, Jane; Mudge, Jonathan; Steward, Charlie; Swarbreck, David; Harrow, Jennifer; North, Philip; Leaves, Nicholas; Greystrong, John; Coppola, Maria; Manjunath, Shilpa; Campbell, Mark; Smith, Mark; Strachan, Gregory; Tofts, Calli; Boal, Esther; Cobley, Victoria; Hunter, Giselle; Kimberley, Christopher; Thomas, Daniel; Cave-Berry, Lee; Weston, Paul; Botcherby, Marc R. M.; White, Sharon; Edgar, Ruth; Cross, Sally H.; Irvani, Marjan; Hummerich, Holger; Simpson, Eleanor H.; Johnson, Dabney; Hunsicker, Patricia R.; Little, Peter F. R.; Hubbard, Tim; Campbell, R. Duncan; Rogers, Jane; Jackson, Ian J.

    2006-01-01

    Chromosome deletions in the mouse have proven invaluable in the dissection of gene function. The brown deletion complex comprises >28 independent genome rearrangements, which have been used to identify several functional loci on chromosome 4 required for normal embryonic and postnatal development. We have constructed a 172-bacterial artificial chromosome contig that spans this 22-megabase (Mb) interval and have produced a contiguous, finished, and manually annotated sequence from these clones. The deletion complex is strikingly gene-poor, containing only 52 protein-coding genes (of which only 39 are supported by human homologues) and has several further notable genomic features, including several segments of >1 Mb, apparently devoid of a coding sequence. We have used sequence polymorphisms to finely map the deletion breakpoints and identify strong candidate genes for the known phenotypes that map to this region, including three lethal loci (l4Rn1, l4Rn2, and l4Rn3) and the fitness mutant brown-associated fitness (baf). We have also characterized misexpression of the basonuclin homologue, Bnc2, associated with the inversion-mediated coat color mutant white-based brown (Bw). This study provides a molecular insight into the basis of several characterized mouse mutants, which will allow further dissection of this region by targeted or chemical mutagenesis. PMID:16505357

  13. Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia.

    PubMed

    Hou, Shaobin; Makarova, Kira S; Saw, Jimmy H W; Senin, Pavel; Ly, Benjamin V; Zhou, Zhemin; Ren, Yan; Wang, Jianmei; Galperin, Michael Y; Omelchenko, Marina V; Wolf, Yuri I; Yutin, Natalya; Koonin, Eugene V; Stott, Matthew B; Mountain, Bruce W; Crowe, Michelle A; Smirnova, Angela V; Dunfield, Peter F; Feng, Lu; Wang, Lei; Alam, Maqsudul

    2008-07-01

    The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and human gastrointestinal tract, only a few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum) is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely acidic conditions including a major upward shift

  14. [Bacterial biofilms on PVC tubing's inner surface of hemodialysis water treatment system].

    PubMed

    Yang, Sha; Jia, Ke; Peng, Youming; Liu, Hong; Liu, Yinghong; Chen, Xing; Liu, Fuyou

    2009-10-01

    To determine the morphology, bacterial composition, and endotoxin content of biofilms on the inner surface of PVC tubing in a hemodialysis water treatment system, we dissolved biofilms from tubing segments before and after the reverse osmosis machine for bacterial count and identification. We examined the biofilm structure of these segments by visual inspection and scanning electron microscopy. Biofilms from all 7 segments were dissolved for qualitative and quantitative endotoxin assays. The inner surface of the segment before the reverse osmosis machine was homogeneously covered with activated carbon powder deposits. The segment after the reverse osmosis machine appeared normal. Under the scanning electron microscope, a biofilm with a continuous surface and a sandwich structure was found on the inner surface of the segment before the reverse osmosis machine, formed by clustered bacilli, activated carbon powder, and some cocci. Bacteria of the same shape and length were found on the segment after the reverse osmosis machine, but they were fewer and more loosely arranged. Bacterial culture and identification showed that the former were mostly gram-negative bacilli, while the latter were only a few micrococci. The endotoxin content of the biofilms was between 2.0 EU/mL and 4.0 EU/mL. Quantitative assay showed: segment after softener, (2.821+/-0.807) EU/mL; segment after active charcoal canister, (3.635+/-0.427) EU/mL; segment before reverse osmosis machine, (3.687+/-0.271) EU/mL; segment after reverse osmosis machine, (2.041+/-0.295) EU/mL; exit of power pump, (1.983+/-0.390) EU/mL; the 1st dead space, (2.373+/-0.535) EU/mL; and the 2nd dead space, (2.858+/-0.690) EU/mL. Biofilms are found on the inner surface of the segments before and after the reverse osmosis machine. Endotoxin level from high to low is as follows: segment before reverse osmosis machine, segment after active charcoal canister, the 2nd dead space, segment after softener, the 1st dead space, segment after reverse osmosis machine, exit of power pump. The character of the bacteria and endotoxin of the biofilm can help us find

  15. Accuracy of genomic prediction for BCWD resistance in rainbow trout using different genotyping platforms and genomic selection models

    USDA-ARS's Scientific Manuscript database

    In this study, we aimed to (1) predict genomic estimated breeding value (GEBV) for bacterial cold water disease (BCWD) resistance by genotyping training (n=583) and validation samples (n=53) with two genotyping platforms (24K RAD-SNP and 49K SNP) and using different genomic selection (GS) models (Ba...

  16. Diversity of Hindgut Bacterial Population in Subterranean Termite, Reticulitermes flavipes

    Treesearch

    Olanrewaju Raji; Dragica Jeremic-Nikolic; Juliet D. Tang

    2017-01-01

    The termite hindgut contains a bacterial community that symbiotically aids in digestion of cellulosic materials. For this paper, a species survey of bacterial hindgut symbionts in termites collected from Saucier, Mississippi was examined. Two methods were tested for optimal genetic material isolation. Genomic DNA was isolated from the hindgut luminal contents of five...

  17. Implications of segment mismatch for influenza A virus evolution

    PubMed Central

    White, Maria C.; Lowen, Anice C.

    2018-01-01

    Influenza A virus (IAV) is an RNA virus with a segmented genome. These viral properties allow for the rapid evolution of IAV under selective pressure, due to mutation occurring from error-prone replication and the exchange of gene segments within a co-infected cell, termed reassortment. Both mutation and reassortment give rise to genetic diversity, but constraints shape their impact on viral evolution: just as most mutations are deleterious, most reassortment events result in genetic incompatibilities. The phenomenon of segment mismatch encompasses both RNA- and protein-based incompatibilities between co-infecting viruses and results in the production of progeny viruses with fitness defects. Segment mismatch is an important determining factor of the outcomes of mixed IAV infections and has been addressed in multiple risk assessment studies undertaken to date. However, due to the complexity of genetic interactions among the eight viral gene segments, our understanding of segment mismatch and its underlying mechanisms remain incomplete. Here, we summarize current knowledge regarding segment mismatch and discuss the implications of this phenomenon for IAV reassortment and diversity. PMID:29244017

  18. Exploiting Bacterial Whole-Genome Sequencing Data for Evaluation of Diagnostic Assays: Campylobacter Species Identification as a Case Study.

    PubMed

    Jansen van Rensburg, Melissa J; Swift, Craig; Cody, Alison J; Jenkins, Claire; Maiden, Martin C J

    2016-12-01

    The application of whole-genome sequencing (WGS) to problems in clinical microbiology has had a major impact on the field. Clinical laboratories are now using WGS for pathogen identification, antimicrobial susceptibility testing, and epidemiological typing. WGS data also represent a valuable resource for the development and evaluation of molecular diagnostic assays, which continue to play an important role in clinical microbiology. To demonstrate this application of WGS, this study used publicly available genomic data to evaluate a duplex real-time PCR (RT-PCR) assay that targets mapA and ceuE for the detection of Campylobacter jejuni and Campylobacter coli, leading global causes of bacterial gastroenteritis. In silico analyses of mapA and ceuE primer and probe sequences from 1,713 genetically diverse C. jejuni and C. coli genomes, supported by RT-PCR testing, indicated that the assay was robust, with 1,707 (99.7%) isolates correctly identified. The high specificity of the mapA-ceuE assay was the result of interspecies diversity and intraspecies conservation of the target genes in C. jejuni and C. coli. Rare instances of a lack of specificity among C. coli isolates were due to introgression in mapA or sequence diversity in ceuE. The results of this study illustrate how WGS can be exploited to evaluate molecular diagnostic assays by using publicly available data, online databases, and open-source software. Copyright © 2016 Jansen van Rensburg et al.

  19. Modeling heterogeneous (co)variances from adjacent-SNP groups improves genomic prediction for milk protein composition traits.

    PubMed

    Gebreyesus, Grum; Lund, Mogens S; Buitenhuis, Bart; Bovenhuis, Henk; Poulsen, Nina A; Janss, Luc G

    2017-12-05

    Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls. Single-nucleotide polymorphisms (SNPs), from 50K SNP arrays, were grouped into non-overlapping genome segments. A segment was defined as one SNP, or a group of 50, 100, or 200 adjacent SNPs, or one chromosome, or the whole genome. Traditional univariate and bivariate genomic best linear unbiased prediction (GBLUP) models were also run for comparison. Reliabilities were calculated through a resampling strategy and using deterministic formula. BayesAS models improved prediction reliability for most of the traits compared to GBLUP models and this gain depended on segment size and genetic architecture of the traits. The gain in prediction reliability was especially marked for the protein composition traits β-CN, κ-CN and β-LG, for which prediction reliabilities were improved by 49 percentage points on average using the MT-BayesAS model with a 100-SNP segment size compared to the bivariate GBLUP. Prediction reliabilities were highest with the BayesAS model that uses a 100-SNP segment size. The bivariate versions of our BayesAS models resulted in extra gains of up to 6% in
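
    The grouping of adjacent SNPs into non-overlapping genome segments described above can be sketched as follows. This is only an illustration: the function name is hypothetical, and the handling of a final, shorter segment and of chromosome boundaries is an assumption, since the abstract does not say how the authors treated either.

```python
from typing import List, Sequence


def group_into_segments(snp_ids: Sequence[str], segment_size: int) -> List[List[str]]:
    """Split an ordered list of SNP identifiers into non-overlapping segments.

    SNPs are assumed to be ordered by map position on a single chromosome.
    The last segment may be shorter than `segment_size` (an assumption).
    """
    if segment_size < 1:
        raise ValueError("segment_size must be a positive integer")
    return [
        list(snp_ids[start:start + segment_size])
        for start in range(0, len(snp_ids), segment_size)
    ]


# Toy data: 250 placeholder SNP names grouped with the 100-SNP segment size.
snps = [f"snp{i}" for i in range(250)]
segments = group_into_segments(snps, 100)
print([len(s) for s in segments])
```

    A segment size of 1 reproduces the one-SNP-per-segment case, and a size at least as large as the list yields a single segment covering everything passed in.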

  20. Analysis of five complete genome sequences for members of the class Peribacteria in the recently recognized Peregrinibacteria bacterial phylum

    DOE PAGES

    Anantharaman, Karthik; Brown, Christopher T.; Burstein, David; ...

    2016-01-28

    Five closely related populations of bacteria from the Candidate Phylum (CP) Peregrinibacteria, part of the bacterial Candidate Phyla Radiation (CPR), were sampled from filtered groundwater obtained from an aquifer adjacent to the Colorado River near the town of Rifle, CO, USA. Here, we present the first complete genome sequences for organisms from this phylum. These bacteria have small genomes and, unlike most organisms from other lineages in the CPR, have the capacity for nucleotide synthesis. They invest significantly in biosynthesis of cell wall and cell envelope components, including peptidoglycan, isoprenoids via the mevalonate pathway, and a variety of amino sugars including perosamine and rhamnose. The genomes encode an intriguing set of large extracellular proteins, some of which are very cysteine-rich and may function in attachment, possibly to other cells. Strain variation in these proteins is an important source of genotypic variety. Overall, the cell envelope features, combined with the lack of biosynthesis capacities for many required cofactors, fatty acids, and most amino acids point to a symbiotic lifestyle. Furthermore, phylogenetic analyses indicate that these bacteria likely represent a new class within the Peregrinibacteria phylum, although they ultimately may be recognized as members of a separate phylum. In conclusion, we propose the provisional taxonomic assignment as ‘Candidatus Peribacter riflensis’, Genus Peribacter, Family Peribacteraceae, Order Peribacterales, Class Peribacteria in the phylum Peregrinibacteria.

  1. Inverse Symmetry in Complete Genomes and Whole-Genome Inverse Duplication

    PubMed Central

    Kong, Sing-Guan; Fan, Wen-Lang; Chen, Hong-Da; Hsu, Zi-Ting; Zhou, Nengji; Zheng, Bo; Lee, Hoong-Chien

    2009-01-01

    The cause of symmetry is usually subtle, and its study often leads to a deeper understanding of the bearer of the symmetry. To gain insight into the dynamics driving the growth and evolution of genomes, we conducted a comprehensive study of textual symmetries in 786 complete chromosomes. We focused on symmetry based on our belief that, in spite of their extreme diversity, genomes must share common dynamical principles and mechanisms that drive their growth and evolution, and that the most robust footprints of such dynamics are symmetry related. We found that while complement and reverse symmetries are essentially absent in genomic sequences, inverse–complement plus reverse–symmetry is prevalent in complex patterns in most chromosomes, a vast majority of which have near maximum global inverse symmetry. We also discovered relations that can quantitatively account for the long observed but unexplained phenomenon of k-mer skews in genomes. Our results suggest segmental and whole-genome inverse duplications are important mechanisms in genome growth and evolution, probably because they are efficient means by which the genome can exploit its double-stranded structure to enrich its code-inventory. PMID:19898631
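
    Inverse symmetry, as the term is used above (complement plus reverse), can be checked on a sequence by comparing the count of each short word with the count of its reverse complement. The sketch below is a simplified illustration only: the asymmetry score, the function names, and the toy sequences are assumptions made here and are not the measures used by the authors.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def inverse(word: str) -> str:
    """Return the inverse (reverse complement) of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count all overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def inverse_asymmetry(seq: str, k: int) -> float:
    """Illustrative score: 0.0 means every word is exactly as frequent as its
    inverse; 1.0 means no word shares any occurrences with its inverse."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    # Include inverses of observed words so absent partners are also compared.
    words = set(counts) | {inverse(w) for w in counts}
    diff = sum(abs(counts[w] - counts[inverse(w)]) for w in words)
    return diff / (2 * total)


# A sequence joined to its own inverse is perfectly inverse-symmetric.
half = "ACGGTCA"
symmetric = half + inverse(half)
print(inverse_asymmetry(symmetric, 3))
print(inverse_asymmetry("AAAAAA", 2))
```

    The first toy sequence mimics an inverse duplication of the kind the abstract proposes as a growth mechanism, which is why its score is zero; the second has no inverse partner for any of its words.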

  2. Structural Genomics of Bacterial Virulence Factors

    DTIC Science & Technology

    2005-05-01

    is deficient to mammals and unique to bacteria, the enzymes involved in the pathway may be useful for antibiotic design. Recent genome sequence...the SARS S1 spike protein with a high affinity antibody (80R)" (Sui et al., 2004). Both the S1 protein and antibody have been expressed and purified in... Streptococcus group are now in preparation. Key Research Accomplishments * Development of the VirFact database of virulence

  3. Role of osmotic and hydrostatic pressures in bacteriophage genome ejection

    NASA Astrophysics Data System (ADS)

    Lemay, Serge G.; Panja, Debabrata; Molineux, Ian J.

    2013-02-01

    A critical step in the bacteriophage life cycle is genome ejection into host bacteria. The ejection process for double-stranded DNA phages has been studied thoroughly in vitro, where after triggering with the cellular receptor the genome ejects into a buffer. The experimental data have been interpreted in terms of the decrease in free energy of the densely packed DNA associated with genome ejection. Here we detail a simple model of genome ejection in terms of the hydrostatic and osmotic pressures inside the phage, a bacterium, and a buffer solution or culture medium. We argue that the hydrodynamic flow associated with the water movement from the buffer solution into the phage capsid and further drainage into the bacterial cytoplasm, driven by the osmotic gradient between the bacterial cytoplasm and culture medium, provides an alternative mechanism for phage genome ejection in vivo; the mechanism is perfectly consistent with phage genome ejection in vitro.

  4. Bacterial spoilers of food: behavior, fitness and functional properties.

    PubMed

    Remenant, Benoît; Jaffrès, Emmanuel; Dousset, Xavier; Pilet, Marie-France; Zagorec, Monique

    2015-02-01

    Most food products are highly perishable as they constitute a rich nutrient source for microbial development. Among the microorganisms contaminating food, some present metabolic activities leading to spoilage. In addition to hygienic rules to reduce contamination, various treatments are applied during production and storage to avoid the growth of unwanted microbes. The nature and appearance of spoilage therefore depend on the physiological state of spoilers and on their ability to resist the processing/storage conditions and flourish on the food matrix. Spoilage also relies on the interactions between the microorganisms composing the ecosystems encountered in food. The recent rapid increase in publicly available bacterial genome sequences, as well as the access to high-throughput methods, should lead to a better understanding of spoiler behavior and to the possibility of decreasing food spoilage. This review lists the main bacterial species identified as food spoilers, their ability to develop during storage and/or processing, and the functions potentially involved in spoilage. We have also compiled an inventory of the available genome sequences of species encompassing spoilage strains. Combining in silico analysis of genome sequences with experimental data is proposed in order to understand and thus control the bacterial spoilage of food better. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Complete genome sequence and phylogenetic analyses of an aquabirnavirus isolated from a diseased marbled eel culture in Taiwan.

    PubMed

    Wen, Chiu-Ming

    2017-08-01

    An aquabirnavirus was isolated from diseased marbled eels (Anguilla marmorata; MEIPNV1310) with gill haemorrhages and associated mortality. Its genome segment sequences were obtained through next-generation sequencing and compared with published aquabirnavirus sequences. The results indicated that the genome sequence of MEIPNV1310 contains segment A (3099 nucleotides) and segment B (2789 nucleotides). Phylogenetic analysis showed that MEIPNV1310 is closely related to the infectious pancreatic necrosis Ab strain within genogroup II. This genome sequence is beneficial for studying the geographic distribution and evolution of aquabirnaviruses.

  6. Structural Genomics: Correlation Blocks, Population Structure, and Genome Architecture

    PubMed Central

    Hu, Xin-Sheng; Yeh, Francis C.; Wang, Zhiquan

    2011-01-01

    An integration of the pattern of genome-wide inter-site associations with evolutionary forces is important for gaining insights into the genomic evolution in natural or artificial populations. Here, we assess the inter-site correlation blocks and their distributions along chromosomes. A correlation block is broadly defined as a DNA segment within which strong correlations exist between genetic diversities at any two sites. We bring together the population genetic structure and the genomic diversity structure that have been independently built on different scales and synthesize the existing theories and methods for characterizing genomic structure at the population level. We discuss how population structure could shape correlation blocks and their patterns within and between populations. Effects of evolutionary forces (selection, migration, genetic drift, and mutation) on the pattern of genome-wide correlation blocks are discussed. For eukaryotic organisms, we briefly discuss the associations between the pattern of correlation blocks and genome assembly features, including the impacts of multigene families, the perturbation of transposable elements, repetitive nongenic sequences, and GC-rich isochores. Our review suggests that the observable pattern of correlation blocks can refine our understanding of the ecological and evolutionary processes underlying the genomic evolution at the population level. PMID:21886455

  7. Genome-wide analysis of alternative splicing during dendritic cell response to a bacterial challenge.

    PubMed

    Rodrigues, Raquel; Grosso, Ana Rita; Moita, Luís

    2013-01-01

    The immune system relies on the plasticity of its components to produce appropriate responses to frequent environmental challenges. Dendritic cells (DCs) are critical initiators of innate immunity and orchestrate the later and more specific adaptive immunity. The generation of diversity in transcriptional programs is central for effective immune responses. Alternative splicing is widely considered a key generator of transcriptional and proteomic complexity, but its role has been rarely addressed systematically in immune cells. Here we used splicing-sensitive arrays to assess genome-wide gene- and exon-level expression profiles in human DCs in response to a bacterial challenge. We find widespread alternative splicing events and splicing factor transcriptional signatures induced by an E. coli challenge to human DCs. Alternative splicing acts in concert with transcriptional modulation, but these two mechanisms of gene regulation affect primarily distinct functional gene groups. Alternative splicing is likely to have an important role in DC immunobiology because it affects genes known to be involved in DC development, endocytosis, antigen presentation and cell cycle arrest.

  8. Microbial Culturomics Broadens Human Vaginal Flora Diversity: Genome Sequence and Description of Prevotella lascolaii sp. nov. Isolated from a Patient with Bacterial Vaginosis.

    PubMed

    Diop, Khoudia; Diop, Awa; Levasseur, Anthony; Mediannikov, Oleg; Robert, Catherine; Armstrong, Nicholas; Couderc, Carine; Bretelle, Florence; Raoult, Didier; Fournier, Pierre-Edouard; Fenollar, Florence

    2018-03-01

    Microbial culturomics is a new subfield of postgenomic medicine and omics biotechnology application that has broadened our awareness of the bacterial diversity of the human microbiome, including that of the human vaginal flora. Using culturomics, a new obligate anaerobic Gram-stain-negative rod-shaped bacterium designated strain khD1T was isolated from the vagina of a patient with bacterial vaginosis and characterized using taxonogenomics. The most abundant cellular fatty acids were C15:0 anteiso (36%), C16:0 (19%), and C15:0 iso (10%). Phylogenetic analysis of the full-length 16S rRNA gene sequences showed that strain khD1T exhibited 90% sequence similarity with Prevotella loescheii, the phylogenetically closest validated Prevotella species. The 3,763,057-bp genome of strain khD1T had a G+C content of 48.7 mol% and contained 3248 predicted genes, including 3194 protein-coding and 54 RNA genes. Given the phenotypic and biochemical characteristics as well as the genome sequencing results, strain khD1T is considered to represent a novel species within the genus Prevotella, for which the name Prevotella lascolaii sp. nov. is proposed. The type strain is khD1T (= CSUR P0109 = DSM 101754). These results show that microbial culturomics greatly improves the characterization of the human microbiome repertoire by isolating putative new species. Further studies will certainly clarify the microbial mechanisms of pathogenesis of these new microbes and their role in health and disease. Microbial culturomics is an important new addition to the diagnostic medicine toolbox and warrants attention in future medical, global health, and integrative biology postgraduate teaching curricula.

  9. Detection of Mixed Infection from Bacterial Whole Genome Sequence Data Allows Assessment of Its Role in Clostridium difficile Transmission

    PubMed Central

    Eyre, David W.; Cule, Madeleine L.; Griffiths, David; Crook, Derrick W.; Peto, Tim E. A.

    2013-01-01

    Bacterial whole genome sequencing offers the prospect of rapid and high precision investigation of infectious disease outbreaks. Close genetic relationships between microorganisms isolated from different infected cases suggest transmission is a strong possibility, whereas transmission between cases with genetically distinct bacterial isolates can be excluded. However, undetected mixed infections (infection with ≥2 unrelated strains of the same species where only one is sequenced) potentially impair exclusion of transmission with certainty, and may therefore limit the utility of this technique. We investigated the problem by developing a computationally efficient method for detecting mixed infection without the need for resource-intensive independent sequencing of multiple bacterial colonies. Given the relatively low density of single nucleotide polymorphisms within bacterial sequence data, direct reconstruction of mixed infection haplotypes from current short-read sequence data is not consistently possible. We therefore use a two-step maximum likelihood-based approach, assuming each sample contains up to two infecting strains. We jointly estimate the proportion of the infection arising from the dominant and minor strains, and the sequence divergence between these strains. In cases where mixed infection is confirmed, the dominant and minor haplotypes are then matched to a database of previously sequenced local isolates. We demonstrate the performance of our algorithm with in silico and in vitro mixed infection experiments, and apply it to transmission of an important healthcare-associated pathogen, Clostridium difficile. Using hospital ward movement data in a previously described stochastic transmission model, 15 pairs of cases enriched for likely transmission events associated with mixed infection were selected. Our method identified four previously undetected mixed infections, and a previously undetected transmission event, but no direct transmission between

  10. Extreme-Scale De Novo Genome Assembly

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob

    De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.

  11. Comparison of Marker-Based Genomic Estimated Breeding Values and Phenotypic Evaluation for Selection of Bacterial Spot Resistance in Tomato.

    PubMed

    Liabeuf, Debora; Sim, Sung-Chur; Francis, David M

    2018-03-01

    Bacterial spot affects tomato crops (Solanum lycopersicum) grown under humid conditions. Major genes and quantitative trait loci (QTL) for resistance have been described, and multiple loci from diverse sources need to be combined to improve disease control. We investigated genomic selection (GS) prediction models for resistance to Xanthomonas euvesicatoria and experimentally evaluated the accuracy of these models. The training population consisted of 109 families combining resistance from four sources and directionally selected from a population of 1,100 individuals. The families were evaluated on a plot basis in replicated inoculated trials and genotyped with single nucleotide polymorphisms (SNP). We compared the prediction ability of models developed with 14 to 387 SNP. Genomic estimated breeding values (GEBV) were derived using Bayesian least absolute shrinkage and selection operator regression (BL) and ridge regression (RR). Evaluations were based on leave-one-out cross validation and on empirical observations in replicated field trials using the next generation of inbred progeny and a hybrid population resulting from selections in the training population. Prediction ability was evaluated based on correlations between GEBV and phenotypes (r_g), percentage of coselection between genomic and phenotypic selection, and relative efficiency of selection (r_g/r_p). Results were similar with BL and RR models. Models using only markers previously identified as significantly associated with resistance but weighted based on GEBV and mixed models with markers associated with resistance treated as fixed effects and markers distributed in the genome treated as random effects offered greater accuracy and a high percentage of coselection. The accuracy of these models to predict the performance of progeny and hybrids exceeded the accuracy of phenotypic selection.

  12. Bacterial Group II Introns: Identification and Mobility Assay.

    PubMed

    Toro, Nicolás; Molina-Sánchez, María Dolores; Nisa-Martínez, Rafael; Martínez-Abarca, Francisco; García-Rodríguez, Fernando Manuel

    2016-01-01

    Group II introns are large catalytic RNAs and mobile retroelements that encode a reverse transcriptase. Here, we provide methods for their identification in bacterial genomes and further analysis of their splicing and mobility capacities.

  13. Comparative genomic survey, exon-intron annotation and phylogenetic analysis of NAT-homologous sequences in archaea, protists, fungi, viruses, and invertebrates

    USDA-ARS's Scientific Manuscript database

    We have previously published extensive genomic surveys [1-3], reporting NAT-homologous sequences in hundreds of sequenced bacterial, fungal and vertebrate genomes. We present here the results of our latest search of 2445 genomes, representing 1532 (70 archaeal, 1210 bacterial, 43 protist, 97 fungal,...

  14. Genome-scale rates of evolutionary change in bacteria

    PubMed Central

    Duchêne, Sebastian; Holt, Kathryn E.; Weill, François-Xavier; Le Hello, Simon; Hawkey, Jane; Edwards, David J.; Fourment, Mathieu

    2016-01-01

    Estimating the rates at which bacterial genomes evolve is critical to understanding major evolutionary and ecological processes such as disease emergence, long-term host–pathogen associations and short-term transmission patterns. The surge in bacterial genomic data sets provides a new opportunity to estimate these rates and reveal the factors that shape bacterial evolutionary dynamics. For many organisms estimates of evolutionary rate display an inverse association with the time-scale over which the data are sampled. However, this relationship remains unexplored in bacteria due to the difficulty in estimating genome-wide evolutionary rates, which are impacted by the extent of temporal structure in the data and the prevalence of recombination. We collected 36 whole genome sequence data sets from 16 species of bacterial pathogens to systematically estimate and compare their evolutionary rates and assess the extent of temporal structure in the absence of recombination. The majority (28/36) of data sets possessed sufficient clock-like structure to robustly estimate evolutionary rates. However, in some species reliable estimates were not possible even with ‘ancient DNA’ data sampled over many centuries, suggesting that they evolve very slowly or that they display extensive rate variation among lineages. The robustly estimated evolutionary rates spanned several orders of magnitude, from approximately 10^-5 to 10^-8 nucleotide substitutions per site per year. This variation was negatively associated with sampling time, with this relationship best described by an exponential decay curve. To avoid potential estimation biases, such time-dependency should be considered when inferring evolutionary time-scales in bacteria. PMID:28348834

  15. Comparative genome analysis and characterization of the Salmonella Typhimurium strain CCRJ_26 isolated from swine carcasses using whole-genome sequencing approach.

    PubMed

    Panzenhagen, P H N; Cabral, C C; Suffys, P N; Franco, R M; Rodrigues, D P; Conte-Junior, C A

    2018-04-01

    Salmonella pathogenicity relies on virulence factors, many of which are clustered within the Salmonella pathogenicity islands. Salmonella also harbours mobile genetic elements such as virulence plasmids, prophage-like elements and antimicrobial resistance genes, which can contribute to increasing its pathogenicity. Here, we have genetically characterized a selected S. Typhimurium strain (CCRJ_26) from our previous study, with a multidrug-resistant profile and a high-frequency PFGE clonal profile, which apparently persists in the pork production centre of Rio de Janeiro State, Brazil. By whole-genome sequencing, we described the virulence content of the strain's genome and characterized the repertoire of bacterial plasmids, antibiotic resistance genes and prophage-like elements. We show evidence that the genome of strain CCRJ_26 possibly represents a virulence-associated phenotype that may be virulent in human infection. Whole-genome sequencing technologies are still costly and remain underexplored for applied microbiology in Brazil. Hence, this genomic description of S. Typhimurium strain CCRJ_26 will help future molecular epidemiological studies. The analysis described here reveals a quick and useful pipeline for bacterial virulence characterization using a whole-genome sequencing approach. © 2018 The Society for Applied Microbiology.

  16. Ecology and genomics of Bacillus subtilis.

    PubMed

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2008-06-01

    Bacillus subtilis is a remarkably diverse bacterial species that is capable of growth within many environments. Recent microarray-based comparative genomic analyses have revealed that members of this species also exhibit considerable genomic diversity. The identification of strain-specific genes might explain how B. subtilis has become so broadly adapted. The goal of identifying ecologically adaptive genes could soon be realized with the imminent release of several new B. subtilis genome sequences. As we embark upon this exciting new era of B. subtilis comparative genomics we review what is currently known about the ecology and evolution of this species.

  17. The Encapsidated Genome of Microplitis demolitor Bracovirus Integrates into the Host Pseudoplusia includens

    PubMed Central

    Beck, Markus H.; Zhang, Shu; Bitra, Kavita; Burke, Gaelen R.; Strand, Michael R.

    2011-01-01

    Polydnaviruses (PDVs) are symbionts of parasitoid wasps that function as gene delivery vehicles in the insects (hosts) that the wasps parasitize. PDVs persist in wasps as integrated proviruses but are packaged as circularized and segmented double-stranded DNAs into the virions that wasps inject into hosts. In contrast, little is known about how PDV genomic DNAs persist in host cells. Microplitis demolitor carries Microplitis demolitor bracovirus (MdBV) and parasitizes the host Pseudoplusia includens. MdBV infects primarily host hemocytes and also infects a hemocyte-derived cell line from P. includens called CiE1 cells. Here we report that all 15 genomic segments of the MdBV encapsidated genome exhibited long-term persistence in CiE1 cells. Most MdBV genes expressed in hemocytes were persistently expressed in CiE1 cells, including members of the glc gene family whose products transformed CiE1 cells into a suspension culture. PCR-based integration assays combined with cloning and sequencing of host-virus junctions confirmed that genomic segments J and C persisted in CiE1 cells by integration. These genomic DNAs also rapidly integrated into parasitized P. includens. Sequence analysis of wasp-viral junction clones showed that the integration of proviral segments in M. demolitor was associated with a wasp excision/integration motif (WIM) known from other bracoviruses. However, integration into host cells occurred in association with a previously unknown domain that we named the host integration motif (HIM). The presence of HIMs in most MdBV genomic DNAs suggests that the integration of each genomic segment into host cells occurs through a shared mechanism. PMID:21880747

  18. Sequence and Analysis of the Tomato JOINTLESS Locus

    PubMed Central

    Mao, Long; Begum, Dilara; Goff, Stephen A.; Wing, Rod A.

    2001-01-01

    A 119-kb bacterial artificial chromosome from the JOINTLESS locus on the tomato (Lycopersicon esculentum) chromosome 11 contained 15 putative genes. Repetitive sequences in this region include one copia-like LTR retrotransposon, 13 simple sequence repeats, three copies of a novel type III foldback transposon, and four putative short DNA repeats. Database searches showed that the foldback transposon and the short DNA repeats seemed to be associated preferably with genes. The predicted tomato genes were compared with the complete Arabidopsis genome. Eleven out of 15 tomato open reading frames were found to be colinear with segments on five Arabidopsis bacterial artificial chromosome/P1-derived artificial chromosome clones. The synteny patterns, however, did not reveal duplicated segments in Arabidopsis, where over half of the genome is duplicated. Our analysis indicated that the microsynteny between the tomato and Arabidopsis genomes was still conserved at a very small scale but was complicated by the large number of gene families in the Arabidopsis genome. PMID:11457984

  19. The genome of the sea urchin Strongylocentrotus purpuratus.

    PubMed

    Sodergren, Erica; Weinstock, George M; Davidson, Eric H; Cameron, R Andrew; Gibbs, Richard A; Angerer, Robert C; Angerer, Lynne M; Arnone, Maria Ina; Burgess, David R; Burke, Robert D; Coffman, James A; Dean, Michael; Elphick, Maurice R; Ettensohn, Charles A; Foltz, Kathy R; Hamdoun, Amro; Hynes, Richard O; Klein, William H; Marzluff, William; McClay, David R; Morris, Robert L; Mushegian, Arcady; Rast, Jonathan P; Smith, L Courtney; Thorndyke, Michael C; Vacquier, Victor D; Wessel, Gary M; Wray, Greg; Zhang, Lan; Elsik, Christine G; Ermolaeva, Olga; Hlavina, Wratko; Hofmann, Gretchen; Kitts, Paul; Landrum, Melissa J; Mackey, Aaron J; Maglott, Donna; Panopoulou, Georgia; Poustka, Albert J; Pruitt, Kim; Sapojnikov, Victor; Song, Xingzhi; Souvorov, Alexandre; Solovyev, Victor; Wei, Zheng; Whittaker, Charles A; Worley, Kim; Durbin, K James; Shen, Yufeng; Fedrigo, Olivier; Garfield, David; Haygood, Ralph; Primus, Alexander; Satija, Rahul; Severson, Tonya; Gonzalez-Garay, Manuel L; Jackson, Andrew R; Milosavljevic, Aleksandar; Tong, Mark; Killian, Christopher E; Livingston, Brian T; Wilt, Fred H; Adams, Nikki; Bellé, Robert; Carbonneau, Seth; Cheung, Rocky; Cormier, Patrick; Cosson, Bertrand; Croce, Jenifer; Fernandez-Guerra, Antonio; Genevière, Anne-Marie; Goel, Manisha; Kelkar, Hemant; Morales, Julia; Mulner-Lorillon, Odile; Robertson, Anthony J; Goldstone, Jared V; Cole, Bryan; Epel, David; Gold, Bert; Hahn, Mark E; Howard-Ashby, Meredith; Scally, Mark; Stegeman, John J; Allgood, Erin L; Cool, Jonah; Judkins, Kyle M; McCafferty, Shawn S; Musante, Ashlan M; Obar, Robert A; Rawson, Amanda P; Rossetti, Blair J; Gibbons, Ian R; Hoffman, Matthew P; Leone, Andrew; Istrail, Sorin; Materna, Stefan C; Samanta, Manoj P; Stolc, Viktor; Tongprasit, Waraporn; Tu, Qiang; Bergeron, Karl-Frederik; Brandhorst, Bruce P; Whittle, James; Berney, Kevin; Bottjer, David J; Calestani, Cristina; Peterson, Kevin; Chow, Elly; Yuan, Qiu Autumn; Elhaik, Eran; Graur, Dan; Reese, Justin T; 
Bosdet, Ian; Heesun, Shin; Marra, Marco A; Schein, Jacqueline; Anderson, Michele K; Brockton, Virginia; Buckley, Katherine M; Cohen, Avis H; Fugmann, Sebastian D; Hibino, Taku; Loza-Coll, Mariano; Majeske, Audrey J; Messier, Cynthia; Nair, Sham V; Pancer, Zeev; Terwilliger, David P; Agca, Cavit; Arboleda, Enrique; Chen, Nansheng; Churcher, Allison M; Hallböök, F; Humphrey, Glen W; Idris, Mohammed M; Kiyama, Takae; Liang, Shuguang; Mellott, Dan; Mu, Xiuqian; Murray, Greg; Olinski, Robert P; Raible, Florian; Rowe, Matthew; Taylor, John S; Tessmar-Raible, Kristin; Wang, D; Wilson, Karen H; Yaguchi, Shunsuke; Gaasterland, Terry; Galindo, Blanca E; Gunaratne, Herath J; Juliano, Celina; Kinukawa, Masashi; Moy, Gary W; Neill, Anna T; Nomura, Mamoru; Raisch, Michael; Reade, Anna; Roux, Michelle M; Song, Jia L; Su, Yi-Hsien; Townley, Ian K; Voronina, Ekaterina; Wong, Julian L; Amore, Gabriele; Branno, Margherita; Brown, Euan R; Cavalieri, Vincenzo; Duboc, Véronique; Duloquin, Louise; Flytzanis, Constantin; Gache, Christian; Lapraz, François; Lepage, Thierry; Locascio, Annamaria; Martinez, Pedro; Matassi, Giorgio; Matranga, Valeria; Range, Ryan; Rizzo, Francesca; Röttinger, Eric; Beane, Wendy; Bradham, Cynthia; Byrum, Christine; Glenn, Tom; Hussain, Sofia; Manning, Gerard; Miranda, Esther; Thomason, Rebecca; Walton, Katherine; Wikramanayke, Athula; Wu, Shu-Yu; Xu, Ronghui; Brown, C Titus; Chen, Lili; Gray, Rachel F; Lee, Pei Yun; Nam, Jongmin; Oliveri, Paola; Smith, Joel; Muzny, Donna; Bell, Stephanie; Chacko, Joseph; Cree, Andrew; Curry, Stacey; Davis, Clay; Dinh, Huyen; Dugan-Rocha, Shannon; Fowler, Jerry; Gill, Rachel; Hamilton, Cerrissa; Hernandez, Judith; Hines, Sandra; Hume, Jennifer; Jackson, Laronda; Jolivet, Angela; Kovar, Christie; Lee, Sandra; Lewis, Lora; Miner, George; Morgan, Margaret; Nazareth, Lynne V; Okwuonu, Geoffrey; Parker, David; Pu, Ling-Ling; Thorn, Rachel; Wright, Rita

    2006-11-10

    We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes.

  20. Global analysis of bacterial transcription factors to predict cellular target processes.

    PubMed

    Doerks, Tobias; Andrade, Miguel A; Lathe, Warren; von Mering, Christian; Bork, Peer

    2004-03-01

    Whole-genome sequences are now available for >100 bacterial species, giving unprecedented power to comparative genomics approaches. We have applied genome-context methods to predict target processes that are regulated by transcription factors (TFs). Of 128 orthologous groups of proteins annotated as TFs, to date, 36 are functionally uncharacterized; in our analysis we predict a probable cellular target process or biochemical pathway for half of these functionally uncharacterized TFs.

  1. Human Contamination in Public Genome Assemblies.

    PubMed

    Kryukov, Kirill; Imanishi, Tadashi

    2016-01-01

    Contamination in a genome assembly can lead to wrong or confusing results when such a genome is used as a reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination has received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that with high confidence originate as contamination from human DNA. The majority of contaminating human sequences were present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes should be revised to remove contaminated sequence, and that new assemblies should be thoroughly checked for the presence of human DNA before submitting them to public databases.

  2. Genomics of Actinobacteria: Tracing the Evolutionary History of an Ancient Phylum

    PubMed Central

    Ventura, Marco; Canchaya, Carlos; Tauch, Andreas; Chandra, Govind; Fitzgerald, Gerald F.; Chater, Keith F.; van Sinderen, Douwe

    2007-01-01

    Summary: Actinobacteria constitute one of the largest phyla among Bacteria and represent gram-positive bacteria with a high G+C content in their DNA. This bacterial group includes microorganisms exhibiting a wide spectrum of morphologies, from coccoid to fragmenting hyphal forms, as well as possessing highly variable physiological and metabolic properties. Furthermore, Actinobacteria members have adopted different lifestyles, and can be pathogens (e.g., Corynebacterium, Mycobacterium, Nocardia, Tropheryma, and Propionibacterium), soil inhabitants (Streptomyces), plant commensals (Leifsonia), or gastrointestinal commensals (Bifidobacterium). The divergence of Actinobacteria from other bacteria is ancient, making it impossible to identify the phylogenetically closest bacterial group to Actinobacteria. Genome sequence analysis has revolutionized every aspect of bacterial biology by enhancing the understanding of the genetics, physiology, and evolutionary development of bacteria. Various actinobacterial genomes have been sequenced, revealing a wide genomic heterogeneity probably as a reflection of their biodiversity. This review provides an account of the recent explosion of actinobacterial genomics data and an attempt to place this in a biological and evolutionary context. PMID:17804669

  3. Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics

    PubMed Central

    2012-01-01

    Background: Detecting the borders between coding and non-coding regions is an essential step in genome annotation, and information entropy measures are useful for describing the signals in genome sequences. However, the accuracy of previous methods for finding borders based on entropic segmentation still needs to be improved. Methods: In this study, we first applied a new recursive entropic segmentation method to DNA sequences to get preliminary significant cuts. A 22-symbol alphabet is used to capture the differential composition of nucleotide doublets and stop codon patterns along three phases in both DNA strands. This process requires no prior training datasets. Results: Compared with previous segmentation methods, the experimental results on three bacterial genomes, Rickettsia prowazekii, Borrelia burgdorferi and E. coli, show that our approach improves the accuracy of finding the borders between coding and non-coding regions in DNA sequences. Conclusions: This paper presents a new segmentation method for prokaryotes based on Jensen-Rényi divergence with a 22-symbol alphabet. For three bacterial genomes, our method raised the accuracy of finding the borders between protein coding and non-coding regions in DNA sequences compared with the A12_JR method.
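
    The general idea of recursive entropic segmentation can be sketched as follows. This is a simplified illustration, not the method of the paper: it uses the plain four-letter nucleotide alphabet and the Jensen-Shannon divergence in place of the 22-symbol alphabet and the Jensen-Rényi divergence, and it accepts a cut when the divergence exceeds a fixed threshold rather than applying a significance test. The function names, `threshold`, and `min_len` are assumptions made for the example.

```python
import math
from collections import Counter


def shannon_entropy(counts: Counter, n: int) -> float:
    """Shannon entropy (bits) of the symbol frequencies in `counts`."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0)


def best_cut(seq: str, min_len: int = 1):
    """Return (position, divergence) of the cut maximising the
    Jensen-Shannon divergence between the left and right parts."""
    n = len(seq)
    total = Counter(seq)
    h_total = shannon_entropy(total, n)
    left = Counter()
    best_pos, best_div = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        if i < min_len or n - i < min_len:
            continue
        right = total - left  # Counter subtraction keeps positive counts only
        div = (h_total
               - (i / n) * shannon_entropy(left, i)
               - ((n - i) / n) * shannon_entropy(right, n - i))
        if div > best_div:
            best_pos, best_div = i, div
    return best_pos, best_div


def segment(seq: str, threshold: float = 0.1, min_len: int = 10, offset: int = 0):
    """Recursively cut `seq` while the best cut exceeds `threshold`;
    return the sorted list of cut positions in the original coordinates."""
    pos, div = best_cut(seq, min_len)
    if pos is None or div < threshold:
        return []
    return (segment(seq[:pos], threshold, min_len, offset)
            + [offset + pos]
            + segment(seq[pos:], threshold, min_len, offset + pos))


if __name__ == "__main__":
    toy = "A" * 20 + "C" * 20 + "G" * 20
    print(segment(toy, threshold=0.5, min_len=5))
```

    On the toy sequence the recursion recovers the two composition boundaries. Real genomic sequence is far noisier, which is why the published methods rely on richer alphabets and statistical stopping criteria.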

  4. Translocations of chromosome end-segments and facultative heterochromatin promote meiotic ring formation in evening primroses.

    PubMed

    Golczyk, Hieronim; Massouh, Amid; Greiner, Stephan

    2014-03-01

    Due to reciprocal chromosomal translocations, many species of Oenothera (evening primrose) form permanent multichromosomal meiotic rings. However, regular bivalent pairing is also observed. Chiasmata are restricted to chromosomal ends, which makes homologous recombination virtually undetectable. Genetic diversity is achieved by changing linkage relations of chromosomes in rings and bivalents via hybridization and reciprocal translocations. Although the structural prerequisite for this system is enigmatic, whole-arm translocations are widely assumed to be the mechanistic driving force. We demonstrate that this prerequisite is genome compartmentation into two epigenetically defined chromatin fractions. The first one facultatively condenses in cycling cells into chromocenters negative both for histone H3 dimethylated at lysine 4 and for C-banding, and forms huge condensed middle chromosome regions on prophase chromosomes. Remarkably, it decondenses in differentiating cells. The second fraction is euchromatin confined to distal chromosome segments, positive for histone H3 lysine 4 dimethylation and for histone H3 lysine 27 trimethylation. The end-segments are deprived of canonical telomeres but capped with constitutive heterochromatin. This genomic organization promotes translocation breakpoints between the two chromatin fractions, thus facilitating exchanges of end-segments. We challenge the whole-arm translocation hypothesis by demonstrating why reciprocal translocations of chromosomal end-segments should strongly promote meiotic rings and evolution toward permanent translocation heterozygosity. Reshuffled end-segments, each possessing a major crossover hot spot, can furthermore explain meiotic compatibility between genomes with different translocation histories.

  5. Conditions for the Evolution of Gene Clusters in Bacterial Genomes

    PubMed Central

    Ballouz, Sara; Francis, Andrew R.; Lan, Ruiting; Tanaka, Mark M.

    2010-01-01

    Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters. PMID:20168992
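
    The Moran process mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' model, which additionally tracks gene order and rearrangement by inversion; it is the textbook two-type Moran process, in which one mutant of relative fitness r arises in a population of fixed size n. The function names fixation_probability and simulate_fixation, and the parameter values in the usage lines, are hypothetical choices made for illustration only.

```python
import random


def fixation_probability(r: float, n: int) -> float:
    """Closed-form fixation probability of a single mutant with relative
    fitness r in a two-type Moran process with population size n."""
    if n < 1:
        raise ValueError("population size must be at least 1")
    if abs(r - 1.0) < 1e-12:
        # Neutral case: every individual is equally likely to take over.
        return 1.0 / n
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))


def simulate_fixation(r: float, n: int, trials: int, seed: int = 0) -> float:
    """Estimate the same probability by simulating birth-death steps."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        mutants = 1
        while 0 < mutants < n:
            # Birth: the parent is chosen in proportion to fitness.
            p_mutant_parent = r * mutants / (r * mutants + (n - mutants))
            birth_is_mutant = rng.random() < p_mutant_parent
            # Death: the replaced individual is chosen uniformly at random.
            death_is_mutant = rng.random() < mutants / n
            if birth_is_mutant and not death_is_mutant:
                mutants += 1
            elif death_is_mutant and not birth_is_mutant:
                mutants -= 1
        if mutants == n:
            fixed += 1
    return fixed / trials


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    print(fixation_probability(2.0, 10))
    print(simulate_fixation(2.0, 10, trials=2000, seed=1))
```

    In this simplified setting the simulated estimate should approach the closed-form value as the number of trials grows. Modelling selection for gene clustering itself would require a genome representation and a fitness function, which the abstract does not specify.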

  6. Comparative Genomics Analysis and Phenotypic Characterization of Shewanella putrefaciens W3-18-1: Anaerobic Respiration, Bacterial Microcompartments, and Lateral Flagella

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Qiu, D.; Tu, Q.; He, Zhili

    2010-05-17

    Respiratory versatility and psychrophily are the hallmarks of Shewanella. The ability to utilize a wide range of electron acceptors for respiration is due to the large number of c-type cytochrome genes present in the genome of Shewanella strains. More recently the dissimilatory metal reduction of Shewanella species has been extensively and intensively studied for potential applications in the bioremediation of radioactive wastes of groundwater and subsurface environments. Multiple Shewanella genome sequences are now available in the public databases (Fredrickson et al., 2008). Most of the sequenced Shewanella strains were isolated from marine environments and this genus was believed to be of marine origin (Hau and Gralnick, 2007). However, the well-characterized model strain, S. oneidensis MR-1, was isolated from the freshwater lake sediment of Lake Oneida, New York (Myers and Nealson, 1988) and similar bacteria have also been isolated from other freshwater environments (Venkateswaran et al., 1999). Here we comparatively analyzed the genome sequence and physiological characteristics of S. putrefaciens W3-18-1 and S. oneidensis MR-1, isolated from the marine and freshwater lake sediments, respectively. The anaerobic respirations, carbon source utilization, and cell motility have been experimentally investigated. Large scale horizontal gene transfers have been revealed and the genetic divergence between these two strains was considered to be critical to the bacterial adaptation to specific habitats, freshwater or marine sediments.

  7. Analysis of bacterial genomes from an evolution experiment with horizontal gene transfer shows that recombination can sometimes overwhelm selection

    PubMed Central

    2018-01-01

    Few experimental studies have examined the role that sexual recombination plays in bacterial evolution, including the effects of horizontal gene transfer on genome structure. To address this limitation, we analyzed genomes from an experiment in which Escherichia coli K-12 Hfr (high frequency recombination) donors were periodically introduced into 12 evolving populations of E. coli B and allowed to conjugate repeatedly over the course of 1000 generations. Previous analyses of the evolved strains from this experiment showed that recombination did not accelerate adaptation, despite increasing genetic variation relative to asexual controls. However, the resolution in that previous work was limited to only a few genetic markers. We sought to clarify and understand these puzzling results by sequencing complete genomes from each population. The effects of recombination were highly variable: one lineage was mostly derived from the donors, while another acquired almost no donor DNA. In most lineages, some regions showed repeated introgression and others almost none. Regions with high introgression tended to be near the donors’ origin of transfer sites. To determine whether introgressed alleles imposed a genetic load, we extended the experiment for 200 generations without recombination and sequenced whole-population samples. Beneficial alleles in the recipient populations were occasionally driven extinct by maladaptive donor-derived alleles. On balance, our analyses indicate that the plasmid-mediated recombination was sufficiently frequent to drive donor alleles to fixation without providing much, if any, selective advantage. PMID:29385126

  8. By their genes ye shall know them: genomic signatures of predatory bacteria

    PubMed Central

    Pasternak, Zohar; Pietrokovski, Shmuel; Rotem, Or; Gophna, Uri; Lurie-Weinberger, Mor N; Jurkevitch, Edouard

    2013-01-01

    Predatory bacteria are taxonomically disparate, exhibit diverse predatory strategies and are widely distributed in varied environments. To date, their predatory phenotypes cannot be discerned in genome sequence data, thereby limiting our understanding of bacterial predation and of its impact in nature. Here, we define the ‘predatome,’ that is, sets of protein families that reflect the phenotypes of predatory bacteria. The proteomes of all 11 sequenced predatory bacteria, including two de novo sequenced genomes, and 19 non-predatory bacteria from across the phylogenetic and ecological landscapes were compared. Protein families discriminating between the two groups were identified and quantified, demonstrating that differences in the proteomes of predatory and non-predatory bacteria are large and significant. This analysis allows predictions to be made, as we show by confirming from genome data an overlooked bacterial predator. The predatome exhibits deficiencies in riboflavin and amino acid biosynthesis, suggesting that predators obtain them from their prey. In contrast, these genomes are highly enriched in adhesins, proteases and particular metabolic proteins, used for binding to, processing and consuming prey, respectively. Strikingly, predators and non-predators differ in isoprenoid biosynthesis: predators use the mevalonate pathway, whereas non-predators, like almost all bacteria, use the DOXP pathway. By defining predatory signatures in bacterial genomes, the predatory potential they encode can be uncovered, filling an essential gap for measuring bacterial predation in nature. Moreover, we suggest that full-genome proteomic comparisons are applicable to other ecological interactions between microbes, and provide a convenient and rational tool for the functional classification of bacteria. PMID:23190728

  9. A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs.

    PubMed

    Swain, Martin T; Tsai, Isheng J; Assefa, Samual A; Newbold, Chris; Berriman, Matthew; Otto, Thomas D

    2012-06-07

    Genome projects now produce draft assemblies within weeks owing to advanced high-throughput sequencing technologies. For milestone projects such as Escherichia coli or Homo sapiens, teams of scientists were employed to manually curate and finish these genomes to a high standard. Nowadays, this is not feasible for most projects, and the quality of genomes is generally of a much lower standard. This protocol describes software (PAGIT) that is used to improve the quality of draft genomes. It offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence and exploit reference genomes (if available) in order to improve scaffolding and generate annotations. The protocol is most accessible for bacterial and small eukaryotic genomes (up to 300 Mb), such as pathogenic bacteria, malaria and parasitic worms. Applying PAGIT to an E. coli assembly takes ∼24 h: it doubles the average contig size and annotates over 4,300 gene models.

  10. Bacterial diversity of bacteriomes and organs of reproductive, digestive and excretory systems in two cicada species (Hemiptera: Cicadidae).

    PubMed

    Zheng, Zhou; Wang, Dandan; He, Hong; Wei, Cong

    2017-01-01

    Cicadas form intimate symbioses with bacteria to obtain nutrients that are scarce in the xylem fluid they feed on. The obligate symbionts in cicadas are purportedly confined to specialized bacteriomes, but knowledge of bacterial communities associated with cicadas is limited. Bacterial communities in the bacteriomes and organs of reproductive, digestive and excretory systems of two cicada species (Platypleura kaempferi and Meimuna mongolica) were investigated using different methods, and the bacterial diversity and distribution patterns of dominant bacteria in different tissues were compared. Within each species, the bacterial communities of testes are significantly different from those of bacteriomes and ovaries. The dominant endosymbiont Candidatus Sulcia muelleri is found not only in the bacteriomes and reproductive organs, but also in the "filter chamber + conical segment" of both species. The transmission mode of this endosymbiont in the alimentary canal and its effect on physiological processes merit further study. A novel bacterium of Rhizobiales, showing ~80% similarity to Candidatus Hodgkinia cicadicola, is dominant in the bacteriomes and ovaries of P. kaempferi. Given that the genome of H. cicadicola exhibits rapid sequence evolution, it is possible that this novel bacterium is a related endosymbiont with beneficial trophic functions similar to those of H. cicadicola in some other cicadas. Failure to detect H. cicadicola in M. mongolica suggests that it has been subsequently replaced by another bacterium, a yeast or gut microbiota which compensates for the loss of H. cicadicola. The distribution of this novel Rhizobiales species in other cicadas and its identification require further investigation to help establish the definition of the bacterial genus Candidatus Hodgkinia and to provide more information on sequence divergence of related endosymbionts of cicadas. Our results highlight the complex bacterial communities of cicadas, and are informative for...

  11. A segment of the apospory-specific genomic region is highly microsyntenic not only between the apomicts Pennisetum squamulatum and buffelgrass, but also with a rice chromosome 11 centromeric-proximal genomic region.

    PubMed

    Gualtieri, Gustavo; Conner, Joann A; Morishige, Daryl T; Moore, L David; Mullet, John E; Ozias-Akins, Peggy

    2006-03-01

    Bacterial artificial chromosome (BAC) clones from apomicts Pennisetum squamulatum and buffelgrass (Cenchrus ciliaris), isolated with the apospory-specific genomic region (ASGR) marker ugt197, were assembled into contigs that were extended by chromosome walking. Gene-like sequences from contigs were identified by shotgun sequencing and BLAST searches, and used to isolate orthologous rice contigs. Additional gene-like sequences in the apomicts' contigs were identified by bioinformatics using fully sequenced BACs from orthologous rice contigs as templates, as well as by interspecies, whole-contig cross-hybridizations. Hierarchical contig orthology was rapidly assessed by constructing detailed long-range contig molecular maps showing the distribution of gene-like sequences and markers, and searching for microsyntenic patterns of sequence identity and spatial distribution within and across species contigs. We found microsynteny between P. squamulatum and buffelgrass contigs. Importantly, this approach also enabled us to isolate from within the rice (Oryza sativa) genome contig Rice A, which shows the highest microsynteny and is most orthologous to the ugt197-containing C1C buffelgrass contig. Contig Rice A belongs to the rice genome database contig 77 (according to the current September 12, 2003, rice fingerprint contig build) that maps proximal to the chromosome 11 centromere, a feature that interestingly correlates with the mapping of ASGR-linked BACs proximal to the centromere or centromere-like sequences. Thus, relatedness between these two orthologous contigs is supported both by their molecular microstructure and by their centromeric-proximal location. Our discoveries promote the use of a microsynteny-based positional-cloning approach using the rice genome as a template to aid in constructing the ASGR toward the isolation of genes underlying apospory.

  12. From Genomes to Phenotypes: Traitar, the Microbial Trait Analyzer.

    PubMed

    Weimann, Aaron; Mooren, Kyra; Frank, Jeremy; Pope, Phillip B; Bremges, Andreas; McHardy, Alice C

    2016-01-01

    The number of sequenced genomes is growing exponentially, profoundly shifting the bottleneck from data generation to genome interpretation. Traits are often used to characterize and distinguish bacteria and are likely a driving factor in microbial community composition, yet little is known about the traits of most microbes. We describe Traitar, the microbial trait analyzer, which is a fully automated software package for deriving phenotypes from a genome sequence. Traitar provides phenotype classifiers to predict 67 traits related to the use of various substrates as carbon and energy sources, oxygen requirement, morphology, antibiotic susceptibility, proteolysis, and enzymatic activities. Furthermore, it suggests protein families associated with the presence of particular phenotypes. Our method uses L1-regularized L2-loss support vector machines for phenotype assignments based on phyletic patterns of protein families and their evolutionary histories across a diverse set of microbial species. We demonstrate reliable phenotype assignment for Traitar to bacterial genomes from 572 species of eight phyla, also based on incomplete single-cell genomes and simulated draft genomes. We also showcase its application in metagenomics by verifying and complementing a manual metabolic reconstruction of two novel Clostridiales species based on draft genomes recovered from commercial biogas reactors. Traitar is available at https://github.com/hzi-bifo/traitar. IMPORTANCE Bacteria are ubiquitous in our ecosystem and have a major impact on human health, e.g., by supporting digestion in the human gut. Bacterial communities can also aid in biotechnological processes such as wastewater treatment or decontamination of polluted soils. Diverse bacteria contribute with their unique capabilities to the functioning of such ecosystems, but lab experiments to investigate those capabilities are labor-intensive. Major advances in sequencing techniques open up the opportunity to study bacteria by...

  13. Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia

    PubMed Central

    Hou, Shaobin; Makarova, Kira S; Saw, Jimmy HW; Senin, Pavel; Ly, Benjamin V; Zhou, Zhemin; Ren, Yan; Wang, Jianmei; Galperin, Michael Y; Omelchenko, Marina V; Wolf, Yuri I; Yutin, Natalya; Koonin, Eugene V; Stott, Matthew B; Mountain, Bruce W; Crowe, Michelle A; Smirnova, Angela V; Dunfield, Peter F; Feng, Lu; Wang, Lei; Alam, Maqsudul

    2008-01-01

    Background: The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and the human gastrointestinal tract, only a few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. Results: We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum) is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely acidic conditions including a...

  14. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences.

    PubMed

    Medema, Marnix H; Blin, Kai; Cimermancic, Peter; de Jager, Victor; Zakrzewski, Piotr; Fischbach, Michael A; Weber, Tilmann; Takano, Eriko; Breitling, Rainer

    2011-07-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide variety of microbes. However, rapidly and reliably pinpointing all the potential gene clusters for secondary metabolites in dozens of newly sequenced genomes has been extremely challenging, due to their biochemical heterogeneity, the presence of unknown enzymes and the dispersed nature of the necessary specialized bioinformatics tools and resources. Here, we present antiSMASH (antibiotics & Secondary Metabolite Analysis Shell), the first comprehensive pipeline capable of identifying biosynthetic loci covering the whole range of known secondary metabolite compound classes (polyketides, non-ribosomal peptides, terpenes, aminoglycosides, aminocoumarins, indolocarbazoles, lantibiotics, bacteriocins, nucleosides, beta-lactams, butyrolactones, siderophores, melanins and others). It aligns the identified regions at the gene cluster level to their nearest relatives from a database containing all other known gene clusters, and integrates or cross-links all previously available secondary-metabolite specific gene analysis methods in one interactive view. antiSMASH is available at http://antismash.secondarymetabolites.org.

  15. Genomic organization of the 260 kb surrounding the waxy locus in a Japonica rice

    PubMed

    Nagano; Wu; Kawasaki; Kishima; Sano

    1999-12-01

    The present study was carried out to characterize the molecular organization in the vicinity of the waxy locus in rice. To determine the structural organization of the region surrounding waxy, contiguous clones covering a total of 260 kb were constructed using a bacterial artificial chromosome (BAC) library from the Shimokita variety of Japonica rice. This map also contains 200 overlapping subclones, which allowed construction of a fine physical map with a total of 64 HindIII sites. During the course of constructing the map, we noticed the presence of some repeated regions which might be related to transposable elements. We divided the 260-kb region into 60 segments (average size of 5.7 kb) to use as probes to determine their genomic organization. Hybridization patterns obtained by probing with these segments were classified into four types: class 1, a single or a few bands without a smeared background; class 2, a single or a few bands with a smeared background; class 3, multiple discrete bands without a smeared background; and class 4, only a smeared background. These classes constituted 6.5%, 20.9%, 3.7%, and 68.9% of the 260-kb region, respectively. The distribution of each class revealed that repetitive sequences are a major component in this region, as expected, and that unique sequence regions were mostly no longer than 6 kb due to interruption by repetitive sequences. We discuss how the map constructed here might be a powerful tool for characterization and comparison of the genome structures and the genes around the waxy locus in the Oryza species.
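
    As a worked check of the figures quoted in this abstract, the class percentages can be converted into approximate lengths within the 260-kb region. This is a small arithmetic sketch only; the names REGION_KB, CLASS_PERCENT and class_lengths_kb are hypothetical, and the only inputs are the four percentages and the region size stated above.

```python
# Region size and hybridization-class shares as quoted in the abstract.
REGION_KB = 260.0
CLASS_PERCENT = {1: 6.5, 2: 20.9, 3: 3.7, 4: 68.9}


def class_lengths_kb(region_kb: float, percents: dict) -> dict:
    """Convert each class's percentage share into kilobases of the region."""
    return {cls: region_kb * pct / 100.0 for cls, pct in percents.items()}


lengths = class_lengths_kb(REGION_KB, CLASS_PERCENT)

if __name__ == "__main__":
    for cls, kb in sorted(lengths.items()):
        print(f"class {cls}: {kb:.1f} kb")
    # Classes 2 and 4 are the two patterns described as having a smeared background.
    print(f"classes 2 + 4: {lengths[2] + lengths[4]:.1f} kb")
```

    The four percentages sum to 100%, so the lengths add up to the full 260 kb. Classes 2 and 4, the two patterns with a smeared background, together account for 89.8% of the region, or roughly 233.5 kb.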

  16. The construction of recombinant industrial yeasts free of bacterial sequences by directed gene replacement into a nonessential region of the genome.

    PubMed

    Xiao, W; Rank, G H

    1989-03-15

    The yeast SMR1 gene was used as a dominant resistance-selectable marker for industrial yeast transformation and for targeting integration of an economically important gene at the homologous ILV2 locus. A MEL1 gene, which codes for alpha-galactosidase, was inserted into a dispensable upstream region of SMR1 in vitro; different treatments of the plasmid (pWX813) prior to transformation resulted in 3' end, 5' end and replacement integrations that exhibited distinct integrant structures. One-step replacement within a nonessential region of the host genome generated a stable integration of MEL1 devoid of bacterial plasmid DNA. Using this method, we have constructed several alpha-galactosidase positive industrial Saccharomyces strains. Our study provides a general method for stable gene transfer in most industrial Saccharomyces yeasts, including those used in the baking, brewing (ale and lager), distilling, wine and sake industries, with solely nucleotide sequences of interest. The absence of bacterial DNA in the integrant structure facilitates the commercial application of recombinant DNA technology in the food and beverage industry.

  17. CTCF and Cohesin in Genome Folding and Transcriptional Gene Regulation.

    PubMed

    Merkenschlager, Matthias; Nora, Elphège P

    2016-08-31

    Genome function, replication, integrity, and propagation rely on the dynamic structural organization of chromosomes during the cell cycle. Genome folding in interphase provides regulatory segmentation for appropriate transcriptional control, facilitates ordered genome replication, and contributes to genome integrity by limiting illegitimate recombination. Here, we review recent high-resolution chromosome conformation capture and functional studies that have informed models of the spatial and regulatory compartmentalization of mammalian genomes, and discuss mechanistic models for how CTCF and cohesin control the functional architecture of mammalian chromosomes.

  18. Genomic diversity and evolution of the fish pathogen Flavobacterium psychrophilum

    USDA-ARS's Scientific Manuscript database

    Flavobacterium psychrophilum, the etiological agent of rainbow trout fry syndrome and bacterial cold-water disease in salmonid fish, is currently one of the main bacterial pathogens hampering the productivity of salmonid farming worldwide. In this study, the genomic diversity of the F. psychrophilum...

  19. The Genome of the Sea Urchin Strongylocentrotus purpuratus

    PubMed Central

    2011-01-01

    We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes. PMID:17095691

  20. Mobile Bacterial Group II Introns at the Crux of Eukaryotic Evolution

    PubMed Central

    Lambowitz, Alan M.; Belfort, Marlene

    2015-01-01

    SUMMARY This review focuses on recent developments in our understanding of group II intron function, the relationships of these introns to retrotransposons and spliceosomes, and how their common features have informed thinking about bacterial group II introns as key elements in eukaryotic evolution. Reverse transcriptase-mediated and host factor-aided intron retrohoming pathways are considered along with retrotransposition mechanisms to novel sites in bacteria, where group II introns are thought to have originated. DNA target recognition and movement by target-primed reverse transcription infer an evolutionary relationship among group II introns, non-LTR retrotransposons, such as LINE elements, and telomerase. Additionally, group II introns are almost certainly the progenitors of spliceosomal introns. Their profound similarities include splicing chemistry extending to RNA catalysis, reaction stereochemistry, and the position of two divalent metals that perform catalysis at the RNA active site. There are also sequence and structural similarities between group II introns and the spliceosome’s small nuclear RNAs (snRNAs) and between a highly conserved core spliceosomal protein Prp8 and a group II intron-like reverse transcriptase. It has been proposed that group II introns entered eukaryotes during bacterial endosymbiosis or bacterial-archaeal fusion, proliferated within the nuclear genome, necessitating evolution of the nuclear envelope, and fragmented giving rise to spliceosomal introns. Thus, these bacterial self-splicing mobile elements have fundamentally impacted the composition of extant eukaryotic genomes, including the human genome, most of which is derived from close relatives of mobile group II introns. PMID:25878921

  1. Genomic and phenotypic characterization of Xanthomonas cynarae sp. nov., a new species that causes bacterial bract spot of artichoke (Cynara scolymus L.).

    PubMed

    Trébaol, G; Gardan, L; Manceau, C; Tanguy, J L; Tirilly, Y; Boury, S

    2000-07-01

    A bacterial disease of artichoke (Cynara scolymus L.) was first observed in 1954 in Brittany and the Loire Valley, France. This disease causes water-soaked spots on bracts and depreciates marketability of the harvest. Ten strains of the pathogen causing bacterial spot of artichoke, previously identified as a member of the genus Xanthomonas, were characterized and compared with type and pathotype strains of the 20 Xanthomonas species using a polyphasic study including both phenotypic and genomic methods. The ten strains presented general morphological, biochemical and physiological traits and G+C content characteristic of the genus Xanthomonas. Sequencing of the 16S rRNA gene confirmed that this bacterium belongs to the genus Xanthomonas, and more precisely to the Xanthomonas campestris core. DNA-DNA hybridization results showed that the strains that cause bacterial spot of artichoke were 92-100% related to the proposed type strain CFBP 4188T and constituted a discrete DNA homology group that was distinct from the 20 previously described Xanthomonas species. The results of numerical analysis were in accordance with DNA-DNA hybridization data. Strains causing the bacterial bract spot of artichoke exhibited consistent determinative biochemical characteristics, which distinguished them from the 20 other Xanthomonas species previously described. Furthermore, pathogenicity tests allowed specific identification of this new phytopathogenic bacterium. Thus, it is concluded that this bacterium is a new species belonging to the genus Xanthomonas, for which the name Xanthomonas cynarae is proposed. The type strain, CFBP 4188T, has been deposited in the Collection Française des Bactéries Phytopathogènes (CFBP).

  2. Applications of CRISPR/Cas System to Bacterial Metabolic Engineering.

    PubMed

    Cho, Suhyung; Shin, Jongoh; Cho, Byung-Kwan

    2018-04-05

    The clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) adaptive immune system has been extensively used for gene editing, including gene deletion, insertion, and replacement in bacterial and eukaryotic cells owing to its simple, rapid, and efficient activities in unprecedented resolution. Furthermore, the CRISPR interference (CRISPRi) system including deactivated Cas9 (dCas9) with inactivated endonuclease activity has been further investigated for regulation of the target gene transiently or constitutively, avoiding cell death by disruption of genome. This review discusses the applications of CRISPR/Cas for genome editing in various bacterial systems and their applications. In particular, CRISPR technology has been used for the production of metabolites of high industrial significance, including biochemical, biofuel, and pharmaceutical products/precursors in bacteria. Here, we focus on methods to increase the productivity and yield/titer scan by controlling metabolic flux through individual or combinatorial use of CRISPR/Cas and CRISPRi systems with introduction of synthetic pathway in industrially common bacteria including Escherichia coli. Further, we discuss additional useful applications of the CRISPR/Cas system, including its use in functional genomics.

  3. Bacterial Genome Engineering and Synthetic Biology: Combating Pathogens

    DTIC Science & Technology

    2016-11-04

    engineering and SB methods such as recombineering, clustered regularly interspaced short palindromic repeats (CRISPR), and bacterial cell-cell... Cholera# Yersinia pseudotuberculosis# Staphylococcus aureus* Phage Engineering CRISPR/Cas9 Delivery of CRISPR genes and RNA guides for sequence... bear very close sequence alignment to the harmless strains via the use of the CRISPR/Cas9 system. The CRISPR system specifically targets a DNA sequence

  4. Chromosomal targeting by CRISPR-Cas systems can contribute to genome plasticity in bacteria

    PubMed Central

    Dy, Ron L; Pitman, Andrew R; Fineran, Peter C

    2013-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins form adaptive immune systems in bacteria to combat phage and other foreign genetic elements. Typically, short spacer sequences are acquired from the invader DNA and incorporated into CRISPR arrays in the bacterial genome. Small RNAs are generated that contain these spacer sequences and enable sequence-specific destruction of the foreign nucleic acids. Occasionally, spacers are acquired from the chromosome, which instead leads to targeting of the host genome. Chromosomal targeting is highly toxic to the bacterium, providing a strong selective pressure for a variety of evolutionary routes that enable host cell survival. Mutations that inactivate the CRISPR-Cas functionality, such as within the cas genes, CRISPR repeat, protospacer adjacent motifs (PAM), and target sequence, mediate escape from toxicity. This self-targeting might provide some explanation for the incomplete distribution of CRISPR-Cas systems in less than half of sequenced bacterial genomes. More importantly, self-genome targeting can cause large-scale genomic alterations, including remodeling or deletion of pathogenicity islands and other non-mobile chromosomal regions. While control of horizontal gene transfer is perceived as their main function, our recent work illuminates an alternative role of CRISPR-Cas systems in causing host genomic changes and influencing bacterial evolution. PMID:24251073

  5. Rearrangement of Influenza Virus Spliced Segments for the Development of Live-Attenuated Vaccines

    PubMed Central

    Nogales, Aitor; DeDiego, Marta L.; Topham, David J.

    2016-01-01

    ABSTRACT Influenza viral infections represent a serious public health problem, with influenza virus causing a contagious respiratory disease which is most effectively prevented through vaccination. Segments 7 (M) and 8 (NS) of the influenza virus genome encode mRNA transcripts that are alternatively spliced to express two different viral proteins. This study describes the generation, using reverse genetics, of three different recombinant influenza A/Puerto Rico/8/1934 (PR8) H1N1 viruses containing M or NS viral segments individually or modified M or NS viral segments combined in which the overlapping open reading frames of matrix 1 (M1)/M2 for the modified M segment and the open reading frames of nonstructural protein 1 (NS1)/nuclear export protein (NEP) for the modified NS segment were split by using the porcine teschovirus 1 (PTV-1) 2A autoproteolytic cleavage site. Viruses with an M split segment were impaired in replication at nonpermissive high temperatures, whereas high viral titers could be obtained at permissive low temperatures (33°C). Furthermore, viruses containing the M split segment were highly attenuated in vivo, while they retained their immunogenicity and provided protection against a lethal challenge with wild-type PR8. These results indicate that influenza viruses can be effectively attenuated by the rearrangement of spliced segments and that such attenuated viruses represent an excellent option as safe, immunogenic, and protective live-attenuated vaccines. Moreover, this is the first time in which an influenza virus containing a restructured M segment has been described. Reorganization of the M segment to encode M1 and M2 from two separate, nonoverlapping, independent open reading frames represents a useful tool to independently study mutations in the M1 and M2 viral proteins without affecting the other viral M product. IMPORTANCE Vaccination represents our best therapeutic option against influenza viral infections. However, the efficacy of

  6. Studies on cattle genomic structural variation provide insights into ruminant speciation and adaptation

    USDA-ARS's Scientific Manuscript database

    Genomic structural variations, including segmental duplications (SD) and copy number variations (CNV), contribute significantly to individual health and disease in primates and rodents. As a part of the bovine genome annotation effort, we performed the first genome-wide analysis of SD in cattle usin...

  7. Development of microbial genome-probing microarrays using digital multiple displacement amplification of uncultivated microbial single cells.

    PubMed

    Chang, Ho-Won; Sung, Youlboong; Kim, Kyoung-Ho; Nam, Young-Do; Roh, Seong Woon; Kim, Min-Soo; Jeon, Che Ok; Bae, Jin-Woo

    2008-08-15

    A crucial problem in the use of previously developed genome-probing microarrays (GPM) has been the inability to use uncultivated bacterial genomes to take advantage of the high sensitivity and specificity of GPM in microbial detection and monitoring. We show here a method, digital multiple displacement amplification (MDA), to amplify and analyze various genomes obtained from single uncultivated bacterial cells. We used 15 genomes from key microbes involved in dichloromethane (DCM)-dechlorinating enrichment as microarray probes to uncover the bacterial population dynamics of samples without PCR amplification. Genomic DNA amplified from single cells originated from uncultured bacteria with 80.3-99.4% similarity to 16S rRNA genes of cultivated bacteria. The digital MDA-GPM method successfully monitored the dynamics of DCM-dechlorinating communities from different phases of enrichment status. Without a priori knowledge of microbial diversity, the digital MDA-GPM method could be designed to monitor most microbial populations in a given environmental sample.

  8. Exploiting rRNA operon copy number to investigate bacterial reproductive strategies.

    PubMed

    Roller, Benjamin R K; Stoddard, Steven F; Schmidt, Thomas M

    2016-09-12

    The potential for rapid reproduction is a hallmark of microbial life, but microbes in nature must also survive and compete when growth is constrained by resource availability. Successful reproduction requires different strategies when resources are scarce and when they are abundant 1,2, but a systematic framework for predicting these reproductive strategies in bacteria has not been available. Here, we show that the number of ribosomal RNA operons (rrn) in bacterial genomes predicts two important components of reproduction (growth rate and growth efficiency), which are favoured under contrasting regimes of resource availability 3,4. We find that the maximum reproductive rate of bacteria doubles with a doubling of rrn copy number, and the efficiency of carbon use is inversely related to maximal growth rate and rrn copy number. We also identify a feasible explanation for these patterns: the rate and yield of protein synthesis mirror the overall pattern in maximum growth rate and growth efficiency. Furthermore, comparative analysis of genomes from 1,167 bacterial species reveals that rrn copy number predicts traits associated with resource availability, including chemotaxis and genome streamlining. Genome-wide patterns of orthologous gene content covary with rrn copy number, suggesting convergent evolution in response to resource availability. Our findings imply that basic cellular processes adapt in contrasting ways to long-term differences in resource availability. They also establish a basis for predicting changes in bacterial community composition in response to resource perturbations using rrn copy number measurements 5 or inferences 6,7.

  9. Segmentation of time series with long-range fractal correlations

    PubMed Central

    Bernaola-Galván, P.; Oliver, J.L.; Hackenberg, M.; Coronado, A.V.; Ivanov, P.Ch.; Carpena, P.

    2012-01-01

    Segmentation is a standard method of data analysis to identify change-points dividing a nonstationary time series into homogeneous segments. However, for long-range fractal correlated series, most of the segmentation techniques detect spurious change-points which are simply due to the heterogeneities induced by the correlations and not to real nonstationarities. To avoid this oversegmentation, we present a segmentation algorithm which takes as a reference for homogeneity, instead of a random i.i.d. series, a correlated series modeled by a fractional noise with the same degree of correlations as the series to be segmented. We apply our algorithm to artificial series with long-range correlations and show that it systematically detects only the change-points produced by real nonstationarities and not those created by the correlations of the signal. Further, we apply the method to the sequence of the long arm of human chromosome 21, which is known to have long-range fractal correlations. We obtain only three segments that clearly correspond to the three regions of different G + C composition revealed by means of a multi-scale wavelet plot. Similar results have been obtained when segmenting all human chromosome sequences, showing the existence of previously unknown huge compositional superstructures in the human genome. PMID:23645997
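
    The change-point idea in the abstract above can be illustrated with a small sketch. This is not the authors' algorithm: it is the conventional recursive segmentation that compares segment means against an i.i.d. reference, which is exactly the variant the abstract says over-segments long-range correlated series. The function names, the threshold of 6.0 and the minimum segment length are assumptions chosen for illustration only.

```python
import math

def best_split(series, min_len=5):
    """Return (index, t) of the split that maximizes the two-sample
    t statistic between the left and right means (None if no split fits)."""
    n = len(series)
    best_i, best_t = None, 0.0
    for i in range(min_len, n - min_len + 1):
        left, right = series[:i], series[i:]
        m1 = sum(left) / len(left)
        m2 = sum(right) / len(right)
        # Pooled within-segment variance, assuming i.i.d. noise (hypothetical reference).
        ss = sum((x - m1) ** 2 for x in left) + sum((x - m2) ** 2 for x in right)
        pooled = ss / (n - 2)
        diff = abs(m1 - m2)
        if pooled == 0:
            t = math.inf if diff > 0 else 0.0
        else:
            t = diff / math.sqrt(pooled * (1 / len(left) + 1 / len(right)))
        if t > best_t:
            best_i, best_t = i, t
    return best_i, best_t

def segment(series, threshold=6.0, min_len=5, offset=0):
    """Recursively split the series; return the sorted change-point indices."""
    if len(series) < 2 * min_len:
        return []
    i, t = best_split(series, min_len)
    if i is None or t < threshold:
        return []
    return (segment(series[:i], threshold, min_len, offset)
            + [offset + i]
            + segment(series[i:], threshold, min_len, offset + i))

# Toy G+C percentages per window: one compositional shift at window 50.
gc_percent = [40.0] * 50 + [55.0] * 50
print(segment(gc_percent))
```

    Replacing the i.i.d. reference with a correlated one, as the abstract describes, would change only how the significance of the best split is judged; the recursive splitting itself stays the same.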

  10. Segmentation of time series with long-range fractal correlations.

    PubMed

    Bernaola-Galván, P; Oliver, J L; Hackenberg, M; Coronado, A V; Ivanov, P Ch; Carpena, P

    2012-06-01

    Segmentation is a standard method of data analysis to identify change-points dividing a nonstationary time series into homogeneous segments. However, for long-range fractal correlated series, most of the segmentation techniques detect spurious change-points which are simply due to the heterogeneities induced by the correlations and not to real nonstationarities. To avoid this oversegmentation, we present a segmentation algorithm which takes as a reference for homogeneity, instead of a random i.i.d. series, a correlated series modeled by a fractional noise with the same degree of correlations as the series to be segmented. We apply our algorithm to artificial series with long-range correlations and show that it systematically detects only the change-points produced by real nonstationarities and not those created by the correlations of the signal. Further, we apply the method to the sequence of the long arm of human chromosome 21, which is known to have long-range fractal correlations. We obtain only three segments that clearly correspond to the three regions of different G + C composition revealed by means of a multi-scale wavelet plot. Similar results have been obtained when segmenting all human chromosome sequences, showing the existence of previously unknown huge compositional superstructures in the human genome.

  11. Best practices for evaluating single nucleotide variant calling methods for microbial genomics

    PubMed Central

    Olson, Nathan D.; Lund, Steven P.; Colman, Rebecca E.; Foster, Jeffrey T.; Sahl, Jason W.; Schupp, James M.; Keim, Paul; Morrow, Jayne B.; Salit, Marc L.; Zook, Justin M.

    2015-01-01

    Innovations in sequencing technologies have allowed biologists to make incredible advances in understanding biological systems. As experience grows, researchers increasingly recognize that analyzing the wealth of data provided by these new sequencing platforms requires careful attention to detail for robust results. Thus far, much of the scientific community's focus for use in bacterial genomics has been on evaluating genome assembly algorithms and rigorously validating assembly program performance. Missing, however, is a focus on critical evaluation of variant callers for these genomes. Variant calling is essential for comparative genomics as it yields insights into nucleotide-level organismal differences. Variant calling is a multistep process with a host of potential error sources that may lead to incorrect variant calls. Identifying and resolving these incorrect calls is critical for bacterial genomics to advance. The goal of this review is to provide guidance on validating algorithms and pipelines used in variant calling for bacterial genomics. First, we will provide an overview of the variant calling procedures and the potential sources of error associated with the methods. We will then identify appropriate datasets for use in evaluating algorithms and describe statistical methods for evaluating algorithm performance. As variant calling moves from basic research to the applied setting, standardized methods for performance evaluation and reporting are required; it is our hope that this review provides the groundwork for the development of these standards. PMID:26217378

  12. Use of bacterial artificial chromosomes in generating targeted mutations in human and mouse cytomegaloviruses.

    PubMed

    Borst, Eva Maria; Benkartek, Corinna; Messerle, Martin

    2007-05-01

    Cloning of cytomegalovirus (CMV) genomes as bacterial artificial chromosomes (BAC) in E. coli and their manipulation using the techniques of bacterial genetics has greatly facilitated the construction of CMV mutants. This unit describes easily applicable procedures that allow rapid introduction of any kind of targeted mutation into BAC-cloned CMV genomes. Protocols for the reconstitution of virus from isolated BAC DNA, preparation of a virus stock, and isolation and characterization of viral DNA are also included. Special emphasis is laid on description of critical steps and thorough characterization of the altered BACs.

  13. Relaxation dynamics of internal segments of DNA chains in nanochannels

    NASA Astrophysics Data System (ADS)

    Jain, Aashish; Muralidhar, Abhiram; Dorfman, Kevin; Dorfman Group Team

    We will present relaxation dynamics of internal segments of a DNA chain confined in a nanochannel. The results have direct application in genome mapping technology, where long DNA molecules containing sequence-specific fluorescent probes are passed through an array of nanochannels to linearize them, and then the distances between these probes (the so-called "DNA barcode") are measured. The relaxation dynamics of internal segments set the experimental error due to dynamic fluctuations. We developed a multi-scale simulation algorithm, combining a Pruned-Enriched Rosenbluth Method (PERM) simulation of a discrete wormlike chain model with hard spheres with Brownian dynamics (BD) simulations of a bead-spring chain. Realistic parameters such as the bead friction coefficient and spring force law parameters are obtained from PERM simulations and then mapped onto the bead-spring model. The BD simulations are carried out to obtain the extension autocorrelation functions of various segments, which furnish their relaxation times. Interestingly, we find that (i) corner segments relax faster than the center segments and (ii) relaxation times of corner segments do not depend on the contour length of the DNA chain, whereas the relaxation times of center segments increase linearly with DNA chain size.

  14. Animals in a bacterial world, a new imperative for the life sciences

    PubMed Central

    McFall-Ngai, Margaret; Hadfield, Michael G.; Bosch, Thomas C. G.; Carey, Hannah V.; Domazet-Lošo, Tomislav; Douglas, Angela E.; Dubilier, Nicole; Eberl, Gerard; Fukami, Tadashi; Gilbert, Scott F.; Hentschel, Ute; King, Nicole; Kjelleberg, Staffan; Knoll, Andrew H.; Kremer, Natacha; Mazmanian, Sarkis K.; Metcalf, Jessica L.; Nealson, Kenneth; Pierce, Naomi E.; Rawls, John F.; Reid, Ann; Ruby, Edward G.; Rumpho, Mary; Sanders, Jon G.; Tautz, Diethard; Wernegreen, Jennifer J.

    2013-01-01

    In the last two decades, the widespread application of genetic and genomic approaches has revealed a bacterial world astonishing in its ubiquity and diversity. This review examines how a growing knowledge of the vast range of animal–bacterial interactions, whether in shared ecosystems or intimate symbioses, is fundamentally altering our understanding of animal biology. Specifically, we highlight recent technological and intellectual advances that have changed our thinking about five questions: how have bacteria facilitated the origin and evolution of animals; how do animals and bacteria affect each other’s genomes; how does normal animal development depend on bacterial partners; how is homeostasis maintained between animals and their symbionts; and how can ecological approaches deepen our understanding of the multiple levels of animal–bacterial interaction. As answers to these fundamental questions emerge, all biologists will be challenged to broaden their appreciation of these interactions and to include investigations of the relationships between and among bacteria and their animal partners as we seek a better understanding of the natural world. PMID:23391737

  15. BactoGeNIE: A large-scale comparative genome visualization for big displays

    DOE PAGES

    Aurisano, Jillian; Reda, Khairi; Johnson, Andrew; ...

    2015-08-13

    The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. In conclusion, BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics.

  16. BactoGeNIE: a large-scale comparative genome visualization for big displays

    PubMed Central

    2015-01-01

    Background: The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. Results: In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. Conclusions: BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics. PMID:26329021

  17. BactoGeNIE: A large-scale comparative genome visualization for big displays

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aurisano, Jillian; Reda, Khairi; Johnson, Andrew

    The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. In conclusion, BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics.

  18. Genome Duplication and Gene Loss Affect the Evolution of Heat Shock Transcription Factor Genes in Legumes

    PubMed Central

    Jin, Jing; Jin, Xiaolei; Jiang, Haiyang; Yan, Hanwei; Cheng, Beijiu

    2014-01-01

    Whole-genome duplication events (polyploidy events) and gene loss events have played important roles in the evolution of legumes. Here we show that the vast majority of Hsf gene duplications resulted from whole genome duplication events rather than tandem duplication, and significant differences in gene retention exist between species. By searching for intraspecies gene colinearity (microsynteny) and dating the age distributions of duplicated genes, we found that genome duplications accounted for 42 of 46 Hsf-containing segments in Glycine max, while paired segments were rarely identified in Lotus japonicus, Medicago truncatula and Cajanus cajan. However, by comparing interspecies microsynteny, we determined that the great majority of Hsf-containing segments in Lotus japonicus, Medicago truncatula and Cajanus cajan show extensive conservation with the duplicated regions of Glycine max. These segments formed 17 groups of orthologous segments. These results suggest that these regions shared ancient genome duplication with Hsf genes in Glycine max, but more than half of the copies of these genes were lost. On the other hand, the Glycine max Hsf gene family retained approximately 75% and 84% of duplicated genes produced from the ancient genome duplication and recent Glycine-specific genome duplication, respectively. Continuous purifying selection has played a key role in the maintenance of Hsf genes in Glycine max. Expression analysis of the Hsf genes in Lotus japonicus revealed their putative involvement in multiple tissue-/developmental stages and responses to various abiotic stimuli. This study traces the evolution of Hsf genes in legume species and demonstrates that the rates of gene gain and loss are far from equilibrium in different species. PMID:25047803

  19. Translocations of Chromosome End-Segments and Facultative Heterochromatin Promote Meiotic Ring Formation in Evening Primroses

    PubMed Central

    Golczyk, Hieronim; Massouh, Amid; Greiner, Stephan

    2014-01-01

    Due to reciprocal chromosomal translocations, many species of Oenothera (evening primrose) form permanent multichromosomal meiotic rings. However, regular bivalent pairing is also observed. Chiasmata are restricted to chromosomal ends, which makes homologous recombination virtually undetectable. Genetic diversity is achieved by changing linkage relations of chromosomes in rings and bivalents via hybridization and reciprocal translocations. Although the structural prerequisite for this system is enigmatic, whole-arm translocations are widely assumed to be the mechanistic driving force. We demonstrate that this prerequisite is genome compartmentation into two epigenetically defined chromatin fractions. The first one facultatively condenses in cycling cells into chromocenters negative both for histone H3 dimethylated at lysine 4 and for C-banding, and forms huge condensed middle chromosome regions on prophase chromosomes. Remarkably, it decondenses in differentiating cells. The second fraction is euchromatin confined to distal chromosome segments, positive for histone H3 lysine 4 dimethylation and for histone H3 lysine 27 trimethylation. The end-segments are deprived of canonical telomeres but capped with constitutive heterochromatin. This genomic organization promotes translocation breakpoints between the two chromatin fractions, thus facilitating exchanges of end-segments. We challenge the whole-arm translocation hypothesis by demonstrating why reciprocal translocations of chromosomal end-segments should strongly promote meiotic rings and evolution toward permanent translocation heterozygosity. Reshuffled end-segments, each possessing a major crossover hot spot, can furthermore explain meiotic compatibility between genomes with different translocation histories. PMID:24681616

  20. Joint analysis of bacterial DNA methylation, predicted promoter and regulation motifs for biological significance

    USDA-ARS's Scientific Manuscript database

    Advances in long-read, single molecule real-time sequencing technology and analysis software over the last two years has enabled the efficient production of closed bacterial genome sequences. However, consistent annotation of these genomes has lagged behind the ability to create them, while the avai...

  1. Termination and read-through proteins encoded by genome segment 9 of Colorado tick fever virus.

    PubMed

    Mohd Jaafar, Fauziah; Attoui, Houssam; De Micco, Philippe; De Lamballerie, Xavier

    2004-08-01

    Genome segment 9 (Seg-9) of Colorado tick fever virus (CTFV) is 1884 bp long and contains a large open reading frame (ORF; 1845 nt in length overall), although a single in-frame stop codon (at nt 1052-1054) reduces the ORF coding capacity by approximately 40%. However, analyses of highly conserved RNA sequences in the vicinity of the stop codon indicate that it belongs to a class of 'leaky terminators'. The third nucleotide positions in codons situated both before and after the stop codon show the highest variability, suggesting that both regions are translated during virus replication. This also suggests that the stop signal is functionally leaky, allowing read-through translation to occur. Indeed, both the truncated 'termination' protein and the full-length 'read-through' protein (VP9 and VP9', respectively) were detected in CTFV-infected cells, in cells transfected with a plasmid expressing only Seg-9 protein products, and in the in vitro translation products from undenatured Seg-9 ssRNA. The ratios of full-length and truncated proteins generated suggest that read-through may be down-regulated by other viral proteins. Western blot analysis of infected cells and purified CTFV showed that VP9 is a structural component of the virion, while VP9' is a non-structural protein.

  2. A Segment of the Apospory-Specific Genomic Region Is Highly Microsyntenic Not Only between the Apomicts Pennisetum squamulatum and Buffelgrass, But Also with a Rice Chromosome 11 Centromeric-Proximal Genomic Region

    PubMed Central

    Gualtieri, Gustavo; Conner, Joann A.; Morishige, Daryl T.; Moore, L. David; Mullet, John E.; Ozias-Akins, Peggy

    2006-01-01

    Bacterial artificial chromosome (BAC) clones from apomicts Pennisetum squamulatum and buffelgrass (Cenchrus ciliaris), isolated with the apospory-specific genomic region (ASGR) marker ugt197, were assembled into contigs that were extended by chromosome walking. Gene-like sequences from contigs were identified by shotgun sequencing and BLAST searches, and used to isolate orthologous rice contigs. Additional gene-like sequences in the apomicts' contigs were identified by bioinformatics using fully sequenced BACs from orthologous rice contigs as templates, as well as by interspecies, whole-contig cross-hybridizations. Hierarchical contig orthology was rapidly assessed by constructing detailed long-range contig molecular maps showing the distribution of gene-like sequences and markers, and searching for microsyntenic patterns of sequence identity and spatial distribution within and across species contigs. We found microsynteny between P. squamulatum and buffelgrass contigs. Importantly, this approach also enabled us to isolate from within the rice (Oryza sativa) genome contig Rice A, which shows the highest microsynteny and is most orthologous to the ugt197-containing C1C buffelgrass contig. Contig Rice A belongs to the rice genome database contig 77 (according to the current September 12, 2003, rice fingerprint contig build) that maps proximal to the chromosome 11 centromere, a feature that interestingly correlates with the mapping of ASGR-linked BACs proximal to the centromere or centromere-like sequences. Thus, relatedness between these two orthologous contigs is supported both by their molecular microstructure and by their centromeric-proximal location. Our discoveries promote the use of a microsynteny-based positional-cloning approach using the rice genome as a template to aid in constructing the ASGR toward the isolation of genes underlying apospory. PMID:16415213

  3. Raw Cow Milk Bacterial Population Shifts Attributable to Refrigeration

    PubMed Central

    Lafarge, Véronique; Ogier, Jean-Claude; Girard, Victoria; Maladen, Véronique; Leveau, Jean-Yves; Gruss, Alexandra; Delacroix-Buchet, Agnès

    2004-01-01

    We monitored the dynamic changes in the bacterial population in milk associated with refrigeration. Direct analyses of DNA by using temporal temperature gel electrophoresis (TTGE) and denaturing gradient gel electrophoresis (DGGE) allowed us to make accurate species assignments for bacteria with low-GC-content (low-GC%) (<55%) and medium- or high-GC% (>55%) genomes, respectively. We examined raw milk samples before and after 24-h conservation at 4°C. Bacterial identification was facilitated by comparison with an extensive bacterial reference database (∼150 species) that we established with DNA fragments of pure bacterial strains. Cloning and sequencing of fragments missing from the database were used to achieve complete species identification. Considerable evolution of bacterial populations occurred during conservation at 4°C. TTGE and DGGE are shown to be powerful tools for identifying the main bacterial species of the raw milk samples and for monitoring changes in bacterial populations during conservation at 4°C. The emergence of psychrotrophic bacteria such as Listeria spp. or Aeromonas hydrophila is demonstrated. PMID:15345453

  4. Genomic cloning and chromosomal localization of HRY, the human homolog to the Drosophila segmentation gene, hairy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Feder, J.N.; Jan, L.Y.; Jan, Y.N.

    The Drosophila hairy gene encodes a basic helix-loop-helix protein that functions in at least two steps during Drosophila development: (1) during embryogenesis, when it partakes in the establishment of segments, and (2) during the larval stage, when it functions negatively in determining the pattern of sensory bristles on the adult fly. In the rat, a structurally homologous gene (RHL) behaves as an immediate-early gene in its response to growth factors and can, like that in Drosophila, suppress neuronal differentiation events. Here, the authors report the genomic cloning of the human hairy gene homolog (HRY). The coding region of the gene is contained within four exons. The predicted amino acid sequence reveals only four amino acid differences between the human and rat genes. Analysis of the DNA sequence 5′ to the coding region reveals a putative untranslated exon. To increase the value of the HRY gene as a genetic marker and to assess its potential involvement in genetic disorders, they sublocalized the locus to chromosome 3q28-q29 by fluorescence in situ hybridization. 34 refs., 4 figs., 1 tab.

  5. CRISPR Genome Engineering for Human Pluripotent Stem Cell Research

    PubMed Central

    Chaterji, Somali; Ahn, Eun Hyun; Kim, Deok-Ho

    2017-01-01

    The emergence of targeted and efficient genome editing technologies, such as repurposed bacterial programmable nucleases (e.g., CRISPR-Cas systems), has abetted the development of cell engineering approaches. Lessons learned from the development of RNA-interference (RNA-i) therapies can spur the translation of genome editing, such as those enabling the translation of human pluripotent stem cell engineering. In this review, we discuss the opportunities and the challenges of repurposing bacterial nucleases for genome editing, while appreciating their roles, primarily at the epigenomic granularity. First, we discuss the evolution of high-precision, genome editing technologies, highlighting CRISPR-Cas9. They exist in the form of programmable nucleases, engineered with sequence-specific localizing domains, and with the ability to revolutionize human stem cell technologies through precision targeting with greater on-target activities. Next, we highlight the major challenges that need to be met prior to bench-to-bedside translation, often learning from the path-to-clinic of complementary technologies, such as RNA-i. Finally, we suggest potential bioinformatics developments and CRISPR delivery vehicles that can be deployed to circumvent some of the challenges confronting genome editing technologies en route to the clinic. PMID:29158838

  6. Genomics of Escherichia and Shigella

    NASA Astrophysics Data System (ADS)

    Perna, Nicole T.

    The laboratory workhorse Escherichia coli K-12 is among the most intensively studied living organisms on earth, and this single strain serves as the model system behind much of our understanding of prokaryotic molecular biology. Dense genome sequencing and recent insightful comparative analyses are making the species E. coli, as a whole, an emerging system for studying prokaryotic population genetics and the relationship between system-scale, or genome-scale, molecular evolution and complex traits like host range and pathogenic potential. Genomic perspective has revealed a coherent but dynamic species united by intraspecific gene flow via homologous lateral or horizontal transfer and differentiated by content flux mediated by acquisition of DNA segments from interspecies transfers.

  7. Errant processing and structural alterations of genomes present in a varicella-zoster virus vaccine.

    PubMed Central

    Vlazny, D A; Hyman, R W

    1985-01-01

    Five minority populations of aberrant, varicella-zoster virus (VZV)-derived genomes were identified among the encapsidated DNAs obtained from the nuclear and cytoplasmic fractions of an in vitro infection initiated with a lyophilized sample of the BIKEN VZV vaccine (strain Oka). These were (i) VZV genomes, present within nuclear but not cytoplasmic viral capsids, which had been cleaved at a specific site within the short segment and which were, therefore, 3.15 megadaltons (approximately 4% of the VZV genome length) short of full length; (ii) highly deleted, repetitive VZV genomes which contained the errant cleavage site but not the usual VZV genome terminal sequences; (iii) VZV genomes into which multiples of 1 through 5 defective genome repeat units had been inserted into a homologous site; (iv) VZV genomes with additions of 0.1 or 0.18 megadaltons of DNA at both the terminal and internal ends of the short segment; and (v) VZV DNA which had lost the HindIII restriction site at map position 0.11. PMID:2993670

  8. NGSPanPipe: A Pipeline for Pan-genome Identification in Microbial Strains from Experimental Reads.

    PubMed

    Kulsum, Umay; Kapil, Arti; Singh, Harpreet; Kaur, Punit

    2018-01-01

    Recent advancements in sequencing technologies have decreased both time span and cost for sequencing the whole bacterial genome. High-throughput Next-Generation Sequencing (NGS) technology has led to the generation of enormous data concerning microbial populations publicly available across various repositories. As a consequence, it has become possible to study and compare the genomes of different bacterial strains within a species or genus in terms of evolution, ecology and diversity. Studying the pan-genome provides insights into deciphering microevolution, global composition and diversity in virulence and pathogenesis of a species. It can also assist in identifying drug targets and proposing vaccine candidates. The effective analysis of these large genome datasets necessitates the development of robust tools. Current methods for developing a pan-genome do not support direct input of raw reads from the sequencer machine but require preprocessing of reads as an assembled protein/gene sequence file or the binary matrix of orthologous genes/proteins. We have designed an easy-to-use integrated pipeline, NGSPanPipe, which can directly identify the pan-genome from short reads. The output from the pipeline is compatible with other pan-genome analysis tools. We evaluated our pipeline against other methods for developing a pan-genome, i.e. reference-based assembly and de novo assembly using simulated reads of Mycobacterium tuberculosis. The single script pipeline (pipeline.pl) is applicable for all bacterial strains. It integrates multiple in-house Perl scripts and is freely accessible from https://github.com/Biomedinformatics/NGSPanPipe .

  9. Genomic Epidemiology of Tuberculosis.

    PubMed

    Comas, Iñaki

    2017-01-01

    The application of next generation sequencing technologies has opened the door to a new molecular epidemiology of tuberculosis, in which we can now look at transmission at a resolution not possible before. At the same time, new technical and analytical challenges have appeared, and we are still exploring the wider potential of this new technology. Whole genome sequencing in tuberculosis still requires bacterial cultures. Thus, although whole genome sequencing has revolutionized the interpretation of transmission patterns, it is not yet ready to be applied at the point-of-care. In this chapter, I will review the promises and challenges of genomic epidemiology, as well as some of the new questions that have arisen from the use of this new technology. In addition, I will examine the role of molecular epidemiology within the general frame of global tuberculosis control and how genomic epidemiology can contribute towards the elimination of the disease.

  10. Insights from the Genome Sequence of Acidovorax citrulli M6, a Group I Strain of the Causal Agent of Bacterial Fruit Blotch of Cucurbits.

    PubMed

    Eckshtain-Levi, Noam; Shkedy, Dafna; Gershovits, Michael; Da Silva, Gustavo M; Tamir-Ariel, Dafna; Walcott, Ron; Pupko, Tal; Burdman, Saul

    2016-01-01

    Acidovorax citrulli is a seedborne bacterium that causes bacterial fruit blotch of cucurbit plants including watermelon and melon. A. citrulli strains can be divided into two major groups based on DNA fingerprint analyses and biochemical properties. Group I strains have been generally isolated from non-watermelon cucurbits, while group II strains are closely associated with watermelon. In the present study, we report the genome sequence of M6, a group I model A. citrulli strain, isolated from melon. We used comparative genome analysis to investigate differences between the genome of strain M6 and the genome of the group II model strain AAC00-1. The draft genome sequence of A. citrulli M6 harbors 139 contigs, with an overall approximate size of 4.85 Mb. The genome of M6 is ∼500 Kb shorter than that of strain AAC00-1. Comparative analysis revealed that this size difference is mainly explained by eight fragments, ranging from ∼35-120 Kb and distributed throughout the AAC00-1 genome, which are absent in the M6 genome. In agreement with this finding, while AAC00-1 was found to possess 532 open reading frames (ORFs) that are absent in strain M6, only 123 ORFs in M6 were absent in AAC00-1. Most of these M6 ORFs are hypothetical proteins and most of them were also detected in two group I strains that were recently sequenced, tw6 and pslb65. Further analyses by PCR assays and coverage analyses with other A. citrulli strains support the notion that some of these fragments or significant portions of them are discriminative between groups I and II strains of A. citrulli. Moreover, GC content, effective number of codon values and cluster of orthologs' analyses indicate that these fragments were introduced into group II strains by horizontal gene transfer events. Our study reports the genome sequence of a model group I strain of A. citrulli, one of the most important pathogens of cucurbits. It also provides the first comprehensive comparison at the genomic level between the

  11. Insights from the Genome Sequence of Acidovorax citrulli M6, a Group I Strain of the Causal Agent of Bacterial Fruit Blotch of Cucurbits

    PubMed Central

    Eckshtain-Levi, Noam; Shkedy, Dafna; Gershovits, Michael; Da Silva, Gustavo M.; Tamir-Ariel, Dafna; Walcott, Ron; Pupko, Tal; Burdman, Saul

    2016-01-01

    Acidovorax citrulli is a seedborne bacterium that causes bacterial fruit blotch of cucurbit plants including watermelon and melon. A. citrulli strains can be divided into two major groups based on DNA fingerprint analyses and biochemical properties. Group I strains have been generally isolated from non-watermelon cucurbits, while group II strains are closely associated with watermelon. In the present study, we report the genome sequence of M6, a group I model A. citrulli strain, isolated from melon. We used comparative genome analysis to investigate differences between the genome of strain M6 and the genome of the group II model strain AAC00-1. The draft genome sequence of A. citrulli M6 harbors 139 contigs, with an overall approximate size of 4.85 Mb. The genome of M6 is ∼500 Kb shorter than that of strain AAC00-1. Comparative analysis revealed that this size difference is mainly explained by eight fragments, ranging from ∼35–120 Kb and distributed throughout the AAC00-1 genome, which are absent in the M6 genome. In agreement with this finding, while AAC00-1 was found to possess 532 open reading frames (ORFs) that are absent in strain M6, only 123 ORFs in M6 were absent in AAC00-1. Most of these M6 ORFs are hypothetical proteins and most of them were also detected in two group I strains that were recently sequenced, tw6 and pslb65. Further analyses by PCR assays and coverage analyses with other A. citrulli strains support the notion that some of these fragments or significant portions of them are discriminative between groups I and II strains of A. citrulli. Moreover, GC content, effective number of codon values and cluster of orthologs’ analyses indicate that these fragments were introduced into group II strains by horizontal gene transfer events. Our study reports the genome sequence of a model group I strain of A. citrulli, one of the most important pathogens of cucurbits. It also provides the first comprehensive comparison at the genomic level between

  12. Comparative genomics identifies candidate genes for infectious salmon anemia (ISA) resistance in Atlantic salmon (Salmo salar).

    PubMed

    Li, Jieying; Boroevich, Keith A; Koop, Ben F; Davidson, William S

    2011-04-01

    Infectious salmon anemia (ISA) has been described as the hoof and mouth disease of salmon farming. ISA is caused by a lethal and highly communicable virus, which can have a major impact on salmon aquaculture, as demonstrated by an outbreak in Chile in 2007. A quantitative trait locus (QTL) for ISA resistance has been mapped to three microsatellite markers on linkage group (LG) 8 (Chr 15) on the Atlantic salmon genetic map. We identified bacterial artificial chromosome (BAC) clones and three fingerprint contigs from the Atlantic salmon physical map that contains these markers. We made use of the extensive BAC end sequence database to extend these contigs by chromosome walking and identified additional two markers in this region. The BAC end sequences were used to search for conserved synteny between this segment of LG8 and the fish genomes that have been sequenced. An examination of the genes in the syntenic segments of the tetraodon and medaka genomes identified candidates for association with ISA resistance in Atlantic salmon based on differential expression profiles from ISA challenges or on the putative biological functions of the proteins they encode. One gene in particular, HIV-EP2/MBP-2, caught our attention as it may influence the expression of several genes that have been implicated in the response to infection by infectious salmon anemia virus (ISAV). Therefore, we suggest that HIV-EP2/MBP-2 is a very strong candidate for the gene associated with the ISAV resistance QTL in Atlantic salmon and is worthy of further study.

  13. Full Genome Sequence of Giant Panda Rotavirus Strain CH-1

    PubMed Central

    Guo, Ling; Yang, Shaolin; Wang, Chengdong; Chen, Shijie; Yang, Xiaonong; Hou, Rong; Quan, Zifang; Hao, Zhongxiang

    2013-01-01

    We report here the complete genomic sequence of the giant panda rotavirus strain CH-1. This work is the first to document the complete genomic sequence (segments 1 to 11) of the CH-1 strain, which offers an effective platform for providing authentic research experiences to novice scientists. PMID:23469354

  14. Complete Genome Sequences of 38 Gordonia sp. Bacteriophages

    PubMed Central

    Montgomery, Matthew T.; Bonilla, J. Alfred; Dejong, Randall; Garlena, Rebecca A.; Guerrero Bustamante, Carlos; Klyczek, Karen K.; Russell, Daniel A.; Wertz, John T.; Jacobs-Sera, Deborah; Hatfull, Graham F.

    2017-01-01

    ABSTRACT We report here the genome sequences of 38 newly isolated bacteriophages using Gordonia terrae 3612 (ATCC 25594) and Gordonia neofelifaecis NRRL59395 as bacterial hosts. All of the phages are double-stranded DNA (dsDNA) tail phages with siphoviral morphologies, with genome sizes ranging from 17,118 bp to 93,843 bp and spanning considerable nucleotide sequence diversity. PMID:28057748

  15. Long-read whole genome sequencing and comparative analysis of six strains of the human pathogen Orientia tsutsugamushi.

    PubMed

    Batty, Elizabeth M; Chaemchuen, Suwittra; Blacksell, Stuart; Richards, Allen L; Paris, Daniel; Bowden, Rory; Chan, Caroline; Lachumanan, Ramkumar; Day, Nicholas; Donnelly, Peter; Chen, Swaine; Salje, Jeanne

    2018-06-01

    Orientia tsutsugamushi is a clinically important but neglected obligate intracellular bacterial pathogen of the Rickettsiaceae family that causes the potentially life-threatening human disease scrub typhus. In contrast to the genome reduction seen in many obligate intracellular bacteria, early genetic studies of Orientia have revealed one of the most repetitive bacterial genomes sequenced to date. The dramatic expansion of mobile elements has hampered efforts to generate complete genome sequences using short read sequencing methodologies, and consequently there have been few studies of the comparative genomics of this neglected species. We report new high-quality genomes of O. tsutsugamushi, generated using PacBio single molecule long read sequencing, for six strains: Karp, Kato, Gilliam, TA686, UT76 and UT176. In comparative genomics analyses of these strains together with existing reference genomes from Ikeda and Boryong strains, we identify a relatively small core genome of 657 genes, grouped into core gene islands and separated by repeat regions, and use the core genes to infer the first whole-genome phylogeny of Orientia. Complete assemblies of multiple Orientia genomes verify initial suggestions that these are remarkable organisms. They have larger genomes compared with most other Rickettsiaceae, with widespread amplification of repeat elements and massive chromosomal rearrangements between strains. At the gene level, Orientia has a relatively small set of universally conserved genes, similar to other obligate intracellular bacteria, and the relative expansion in genome size can be accounted for by gene duplication and repeat amplification. Our study demonstrates the utility of long read sequencing to investigate complex bacterial genomes and characterise genomic variation.

  16. Comparative Genomic and Phenotypic Characterization of Pathogenic and Non-Pathogenic Strains of Xanthomonas arboricola Reveals Insights into the Infection Process of Bacterial Spot Disease of Stone Fruits

    PubMed Central

    Garita-Cambronero, Jerson; Palacio-Bielsa, Ana; López, María M.

    2016-01-01

    Xanthomonas arboricola pv. pruni is the causal agent of bacterial spot disease of stone fruits, a quarantinable pathogen in several areas worldwide, including the European Union. In order to develop efficient control methods for this disease, it is necessary to improve the understanding of the key determinants associated with host restriction, colonization and the development of pathogenesis. After an initial characterization, by multilocus sequence analysis, of 15 strains of X. arboricola isolated from Prunus, one strain did not group into the pathovar pruni or into other pathovars of this species and therefore it was identified and defined as a X. arboricola pv. pruni look-a-like. This non-pathogenic strain and two typical strains of X. arboricola pv. pruni were selected for a whole genome and phenotype comparative analysis in features associated with the pathogenesis process in Xanthomonas. Comparative analysis among these bacterial strains isolated from Prunus spp. and the inclusion of 15 publicly available genome sequences from other pathogenic and non-pathogenic strains of X. arboricola revealed variations in the phenotype associated with variations in the profiles of TonB-dependent transporters, sensors of the two-component regulatory system, methyl accepting chemotaxis proteins, components of the flagella and the type IV pilus, as well as in the repertoire of cell-wall degrading enzymes and the components of the type III secretion system and related effectors. These variations provide a global overview of those mechanisms that could be associated with the development of bacterial spot disease. Additionally, it pointed out some features that might influence the host specificity and the variable virulence observed in X. arboricola. PMID:27571391

  17. Whole-genome sequencing of Staphylococcus haemolyticus uncovers the extreme plasticity of its genome and the evolution of human-colonizing staphylococcal species.

    PubMed

    Takeuchi, Fumihiko; Watanabe, Shinya; Baba, Tadashi; Yuzawa, Harumi; Ito, Teruyo; Morimoto, Yuh; Kuroda, Makoto; Cui, Longzhu; Takahashi, Mikio; Ankai, Akiho; Baba, Shin-ichi; Fukui, Shigehiro; Lee, Jean C; Hiramatsu, Keiichi

    2005-11-01

    Staphylococcus haemolyticus is an opportunistic bacterial pathogen that colonizes human skin and is remarkable for its highly antibiotic-resistant phenotype. We determined the complete genome sequence of S. haemolyticus to better understand its pathogenicity and evolutionary relatedness to the other staphylococcal species. A large proportion of the open reading frames in the genomes of S. haemolyticus, Staphylococcus aureus, and Staphylococcus epidermidis were conserved in their sequence and order on the chromosome. We identified a region of the bacterial chromosome just downstream of the origin of replication that showed little homology among the species but was conserved among strains within a species. This novel region, designated the "oriC environ," likely contributes to the evolution and differentiation of the staphylococcal species, since it was enriched for species-specific nonessential genes that contribute to the biological features of each staphylococcal species. A comparative analysis of the genomes of S. haemolyticus, S. aureus, and S. epidermidis elucidated differences in their biological and genetic characteristics and pathogenic potentials. We identified as many as 82 insertion sequences in the S. haemolyticus chromosome that probably mediated frequent genomic rearrangements, resulting in phenotypic diversification of the strain. Such rearrangements could have brought genomic plasticity to this species and contributed to its acquisition of antibiotic resistance.

  18. Whole-Genome Sequencing of Staphylococcus haemolyticus Uncovers the Extreme Plasticity of Its Genome and the Evolution of Human-Colonizing Staphylococcal Species

    PubMed Central

    Takeuchi, Fumihiko; Watanabe, Shinya; Baba, Tadashi; Yuzawa, Harumi; Ito, Teruyo; Morimoto, Yuh; Kuroda, Makoto; Cui, Longzhu; Takahashi, Mikio; Ankai, Akiho; Baba, Shin-ichi; Fukui, Shigehiro; Lee, Jean C.; Hiramatsu, Keiichi

    2005-01-01

    Staphylococcus haemolyticus is an opportunistic bacterial pathogen that colonizes human skin and is remarkable for its highly antibiotic-resistant phenotype. We determined the complete genome sequence of S. haemolyticus to better understand its pathogenicity and evolutionary relatedness to the other staphylococcal species. A large proportion of the open reading frames in the genomes of S. haemolyticus, Staphylococcus aureus, and Staphylococcus epidermidis were conserved in their sequence and order on the chromosome. We identified a region of the bacterial chromosome just downstream of the origin of replication that showed little homology among the species but was conserved among strains within a species. This novel region, designated the “oriC environ,” likely contributes to the evolution and differentiation of the staphylococcal species, since it was enriched for species-specific nonessential genes that contribute to the biological features of each staphylococcal species. A comparative analysis of the genomes of S. haemolyticus, S. aureus, and S. epidermidis elucidated differences in their biological and genetic characteristics and pathogenic potentials. We identified as many as 82 insertion sequences in the S. haemolyticus chromosome that probably mediated frequent genomic rearrangements, resulting in phenotypic diversification of the strain. Such rearrangements could have brought genomic plasticity to this species and contributed to its acquisition of antibiotic resistance. PMID:16237012

  19. Superstatistical model of bacterial DNA architecture

    NASA Astrophysics Data System (ADS)

    Bogachev, Mikhail I.; Markelov, Oleg A.; Kayumov, Airat R.; Bunde, Armin

    2017-02-01

    Understanding the physical principles that govern the complex DNA structural organization as well as its mechanical and thermodynamical properties is essential for the advancement in both life sciences and genetic engineering. Recently we have discovered that the complex DNA organization is explicitly reflected in the arrangement of nucleotides depicted by the universal power law tailed internucleotide interval distribution that is valid for complete genomes of various prokaryotic and eukaryotic organisms. Here we suggest a superstatistical model that represents a long DNA molecule by a series of consecutive ~150 bp DNA segments with the alternation of the local nucleotide composition between segments exhibiting long-range correlations. We show that the superstatistical model and the corresponding DNA generation algorithm explicitly reproduce the laws governing the empirical nucleotide arrangement properties of the DNA sequences for various global GC contents and optimal living temperatures. Finally, we discuss the relevance of our model in terms of the DNA mechanical properties. As an outlook, we focus on finding the DNA sequences that encode a given protein while simultaneously reproducing the nucleotide arrangement laws observed from empirical genomes, that may be of interest in the optimization of genetic engineering of long DNA molecules.
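    The abstract above refers to an internucleotide interval distribution without defining it. As a rough illustration only, the sketch below assumes the interval is the distance between consecutive occurrences of the same nucleotide along a sequence; the function name and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

def internucleotide_intervals(seq, base):
    """Distances between consecutive occurrences of `base` in `seq`.

    Assumption: "internucleotide interval" means the gap, in positions,
    between successive occurrences of the same nucleotide.
    """
    positions = [i for i, ch in enumerate(seq.upper()) if ch == base]
    return [b - a for a, b in zip(positions, positions[1:])]

# Toy sequence, made up for illustration.
toy_seq = "ATGCAATTA"
intervals = internucleotide_intervals(toy_seq, "A")

# Empirical distribution of interval lengths for this base.
histogram = Counter(intervals)
```

    On a complete genome, the histogram of such intervals is the quantity whose tail one would inspect for power-law behaviour; the toy sequence here is far too short to show any such trend.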

  20. Draft Genome Sequence of “Cohnella kolymensis” B-2846

    PubMed Central

    Kudryashova, Ekaterina B.; Ariskina, Elena V.

    2016-01-01

    A draft genome sequence of “Cohnella kolymensis” strain B-2846 was derived using IonTorrent sequencing technology. The size of the assembly and G+C content were in agreement with those of other species of this genus. Characterization of the genome of a novel species of Cohnella will assist in bacterial systematics. PMID:26769947

  1. Identical bacterial populations colonize premature infant gut, skin, and oral microbiomes and exhibit different in situ growth rates

    PubMed Central

    Olm, Matthew R.; Brown, Christopher T.; Brooks, Brandon; Firek, Brian; Baker, Robyn; Burstein, David; Soenjoyo, Karina; Thomas, Brian C.; Morowitz, Michael; Banfield, Jillian F.

    2017-01-01

    The initial microbiome impacts the health and future development of premature infants. Methodological limitations have led to gaps in our understanding of the habitat range and subpopulation complexity of founding strains, as well as how different body sites support microbial growth. Here, we used metagenomics to reconstruct genomes of strains that colonized the skin, mouth, and gut of two hospitalized premature infants during the first month of life. Seven bacterial populations, considered to be identical given whole-genome average nucleotide identity of >99.9%, colonized multiple body sites, yet none were shared between infants. Gut-associated Citrobacter koseri genomes harbored 47 polymorphic sites that we used to define 10 subpopulations, one of which appeared in the gut after 1 wk but did not spread to other body sites. Differential genome coverage was used to measure bacterial population replication rates in situ. In all cases where the same bacterial population was detected in multiple body sites, replication rates were faster in mouth and skin compared to the gut. The ability of identical strains to colonize multiple body sites underscores the habit flexibility of initial colonists, whereas differences in microbial replication rates between body sites suggest differences in host control and/or resource availability. Population genomic analyses revealed microdiversity within bacterial populations, implying initial inoculation by multiple individual cells with distinct genotypes. Overall, however, the overlap of strains across body sites implies that the premature infant microbiome can exhibit very low microbial diversity. PMID:28073918

  2. A Hybrid Approach for the Automated Finishing of Bacterial Genomes

    PubMed Central

    Robins, William P.; Chin, Chen-Shan; Webster, Dale; Paxinos, Ellen; Hsu, David; Ashby, Meredith; Wang, Susana; Peluso, Paul; Sebra, Robert; Sorenson, Jon; Bullard, James; Yen, Jackie; Valdovino, Marie; Mollova, Emilia; Luong, Khai; Lin, Steven; LaMay, Brianna; Joshi, Amruta; Rowe, Lori; Frace, Michael; Tarr, Cheryl L.; Turnsek, Maryann; Davis, Brigid M; Kasarskis, Andrew; Mekalanos, John J.; Waldor, Matthew K.; Schadt, Eric E.

    2013-01-01

    Dramatic improvements in DNA sequencing technology have revolutionized our ability to characterize most genomic diversity. However, accurate resolution of large structural events has remained challenging due to the comparatively shorter read lengths of second-generation technologies. Emerging third-generation sequencing technologies, which yield markedly increased read length on rapid time scales and for low cost, have the potential to address assembly limitations. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at > 99.9% accuracy. Complex regions with clinically significant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 reference we obtain 14 and 8 scaffolds greater than 1kb, respectively, correcting several errors in the underlying source data. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly. PMID:22750883

  3. Brucella abortus Strain 2308 Wisconsin Genome: Importance of the Definition of Reference Strains

    PubMed Central

    Suárez-Esquivel, Marcela; Ruiz-Villalobos, Nazareth; Castillo-Zeledón, Amanda; Jiménez-Rojas, César; Roop II, R. Martin; Comerci, Diego J.; Barquero-Calvo, Elías; Chacón-Díaz, Carlos; Caswell, Clayton C.; Baker, Kate S.; Chaves-Olarte, Esteban; Thomson, Nicholas R.; Moreno, Edgardo; Letesson, Jean J.; De Bolle, Xavier; Guzmán-Verri, Caterina

    2016-01-01

    Brucellosis is a bacterial infectious disease affecting a wide range of mammals and a neglected zoonosis caused by species of the genetically homogenous genus Brucella. As in most studies on bacterial diseases, research in brucellosis is carried out by using reference strains as canonical models to understand the mechanisms underlying host-pathogen interactions. We performed whole genome sequencing analysis of the reference strain B. abortus 2308 routinely used in our laboratory, including manually curated annotation accessible as an editable version through a link at https://en.wikipedia.org/wiki/Brucella#Genomics. Comparison of this genome with two publicly available 2308 genomes showed significant differences, particularly indels related to insertional elements, suggesting variability related to the transposition of these elements within the same strain. Considering the outcome of high resolution genomic techniques in the bacteriology field, the conventional concept of strain definition needs to be revised. PMID:27746773

  4. Bacterial signatures in thrombus aspirates of patients with lower limb arterial and venous thrombosis.

    PubMed

    Vakhitov, Damir; Tuomisto, Sari; Martiskainen, Mika; Korhonen, Janne; Pessi, Tanja; Salenius, Juha-Pekka; Suominen, Velipekka; Lehtimäki, Terho; Karhunen, Pekka J; Oksala, Niku

    2018-06-01

    Increasing data supports the role of bacterial inflammation in adverse events of cardiovascular and cerebrovascular diseases. In our previous research, DNA of bacterial species found in coronary artery thrombus aspirates and ruptured cerebral aneurysms were mostly of endodontic and periodontal origin, where Streptococcus mitis group DNA was the most common. We hypothesized that the genomes of S mitis group could be identified in thrombus aspirates of patients with lower limb arterial and deep venous thrombosis. Thrombus aspirates and control blood samples taken from 42 patients with acute or acute-on-chronic lower limb ischemia (Rutherford I-IIb) owing to arterial or graft thrombosis (n = 31) or lower limb deep venous thrombosis (n = 11) were examined using a quantitative real-time polymerase chain reaction to detect all possible bacterial DNA and DNA of S mitis group in particular. The samples were considered positive, if the amount of bacterial DNA in the thrombus aspirates was 2-fold or greater in comparison with control blood samples. In the positive samples the mean difference for the total bacterial DNA was 12.1-fold (median, 7.1), whereas the differences for S mitis group DNA were a mean of 29.1 and a median of 5.2-fold. Of the arterial thrombus aspirates, 57.9% were positive for bacterial DNA, whereas bacterial genomes were found in 75% of bypass graft thrombosis with 77.8% of the prosthetic grafts being positive. Of the deep vein thrombus aspirates, 45.5% contained bacterial genomes. Most (80%) of bacterial DNA-positive cases contained DNA from the S mitis group. Previous arterial interventions were significantly associated with the occurrence of S mitis group DNA (P = .049, Fisher's exact test). This is the first study to report the presence of bacterial DNA, predominantly of S mitis group origin, in the thrombus aspirates of surgical patients with lower limb arterial and deep venous thrombosis, suggesting their possible role in the pathogenesis of

  5. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome”

    PubMed Central

    Tettelin, Hervé; Masignani, Vega; Cieslewicz, Michael J.; Donati, Claudio; Medini, Duccio; Ward, Naomi L.; Angiuoli, Samuel V.; Crabtree, Jonathan; Jones, Amanda L.; Durkin, A. Scott; DeBoy, Robert T.; Davidsen, Tanja M.; Mora, Marirosa; Scarselli, Maria; Margarit y Ros, Immaculada; Peterson, Jeremy D.; Hauser, Christopher R.; Sundaram, Jaideep P.; Nelson, William C.; Madupu, Ramana; Brinkac, Lauren M.; Dodson, Robert J.; Rosovitz, Mary J.; Sullivan, Steven A.; Daugherty, Sean C.; Haft, Daniel H.; Selengut, Jeremy; Gwinn, Michelle L.; Zhou, Liwei; Zafar, Nikhat; Khouri, Hoda; Radune, Diana; Dimitrov, George; Watkins, Kisha; O'Connor, Kevin J. B.; Smith, Shannon; Utterback, Teresa R.; White, Owen; Rubens, Craig E.; Grandi, Guido; Madoff, Lawrence C.; Kasper, Dennis L.; Telford, John L.; Wessels, Michael R.; Rappuoli, Rino; Fraser, Claire M.

    2005-01-01

    The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for ≈80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes. PMID:16172379
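    The pan-genome described above splits a species' genes into a core genome shared by all isolates and a dispensable genome of partially shared and strain-specific genes. A minimal sketch of that partition follows, assuming each strain has already been reduced to a set of gene-family labels; the function name, strain names, and gene labels are hypothetical and do not reproduce the authors' method or data.

```python
from collections import Counter

def partition_pan_genome(strain_genes):
    """Split gene families into core, partially shared, and strain-specific sets.

    `strain_genes` maps a strain name to the set of gene-family labels it carries.
    """
    n_strains = len(strain_genes)
    # Count in how many strains each gene family occurs.
    counts = Counter(g for genes in strain_genes.values() for g in set(genes))
    pan = set(counts)
    core = {g for g, c in counts.items() if c == n_strains}
    strain_specific = {g for g, c in counts.items() if c == 1}
    partially_shared = pan - core - strain_specific
    return {
        "pan": pan,
        "core": core,
        "partially_shared": partially_shared,
        "strain_specific": strain_specific,
    }

# Toy input with made-up gene-family labels.
toy = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g1", "g2", "g6"},
}
result = partition_pan_genome(toy)
```

    The sketch assumes at least two strains; with a single strain every gene would count as both core and strain-specific. Grouping genes into families (ortholog clustering) is the hard part in practice and is taken as given here.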

  6. Plagiarized bacterial genes in the human book of life.

    PubMed

    Ponting, C P

    2001-05-01

    The initial analysis of the human genome draft sequence reveals that our 'book of life' is multi-authored. A small but significant proportion of our genes owes their heritage not to antecedent eukaryotes but instead to bacteria. The publicly funded Human Genome Project study indicates that about 0.5% of all human genes were copied into the genome from bacterial sources. Detailed sequence analyses point to these 'horizontal gene transfer' events having occurred relatively recently. So how did the human 'book of life' evolve to be a chimaera, part animal and part bacterium? And what was the probable evolutionary impact of such gene plagiarism?

  7. Accuracy of genomic selection for BCWD resistance in rainbow trout

    USDA-ARS's Scientific Manuscript database

    Bacterial cold water disease (BCWD) causes significant economic losses in salmonids. In this study, we aimed to (1) predict genomic breeding values (GEBV) by genotyping training (n=583) and validation samples (n=53) with a SNP50K chip; and (2) assess the accuracy of genomic selection (GS) for BCWD r...

  8. Bacterial genome sequencing in clinical microbiology: a pathogen-oriented review.

    PubMed

    Tagini, F; Greub, G

    2017-11-01

    In recent years, whole-genome sequencing (WGS) has been perceived as a technology with the potential to revolutionise clinical microbiology. Herein, we reviewed the literature on the use of WGS for the most commonly encountered pathogens in clinical microbiology laboratories: Escherichia coli and other Enterobacteriaceae, Staphylococcus aureus and coagulase-negative staphylococci, streptococci and enterococci, mycobacteria and Chlamydia trachomatis. For each pathogen group, we focused on five different aspects: the genome characteristics, the most common genomic approaches and the clinical uses of WGS for (i) typing and outbreak analysis, (ii) virulence investigation and (iii) in silico antimicrobial susceptibility testing. Of all the clinical usages, the most frequent and straightforward usage was to type bacteria and to trace outbreaks back. A next step toward standardisation was made thanks to the development of several new genome-wide multi-locus sequence typing systems based on WGS data. Although virulence characterisation could help in various particular clinical settings, it was done mainly to describe outbreak strains. An increasing number of studies compared genotypic to phenotypic antibiotic susceptibility testing, with mostly promising results. However, routine implementation will preferentially be done in the workflow of particular pathogens, such as mycobacteria, rather than as a broadly applicable generic tool. Overall, concrete uses of WGS in routine clinical microbiology or infection control laboratories were done, but the next big challenges will be the standardisation and validation of the procedures and bioinformatics pipelines in order to reach clinical standards.

  9. Organizational heterogeneity of vertebrate genomes.

    PubMed

    Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

    2012-01-01

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
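
    As a rough illustration of scoring k-mer occurrences, the sketch below computes normalized exact-match k-mer frequencies over a sliding window. This is a simplification: the function name is hypothetical, and the published Compositional Spectra method may differ in details such as mismatch tolerance or the choice of word set.

```python
from collections import Counter

def kmer_spectrum(seq, k):
    """Return relative frequencies of all k-mers observed in seq.

    Windows containing characters other than A, C, G, T are skipped
    (an assumption made for this sketch).
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize so that the frequencies sum to 1.
    return {kmer: n / total for kmer, n in counts.items()}

spectrum = kmer_spectrum("ACGTACGT", 2)
```

    Spectra computed this way for two sequences could then be compared with any distance measure on frequency vectors; which measure the authors used is not stated in the abstract.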

  10. The nucleotide composition of microbial genomes indicates differential patterns of selection on core and accessory genomes.

    PubMed

    Bohlin, Jon; Eldholm, Vegard; Pettersson, John H O; Brynildsrud, Ola; Snipen, Lars

    2017-02-10

    The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. We found that GC content in coding regions is significantly higher in core genomes than accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes is most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes signatures of selection or selective neutral processes.
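
    One way to picture the relative-entropy measure mentioned above is as a Kullback-Leibler divergence between observed trinucleotide frequencies and those expected from mononucleotide frequencies alone. The sketch below makes that assumption explicit; the function name, the use of overlapping windows, and the log base are choices made for illustration and may not match the authors' exact procedure.

```python
import math
from collections import Counter

def relative_entropy(seq):
    """KL divergence (in bits) of observed trinucleotide frequencies from
    the product of mononucleotide frequencies.

    Assumptions of this sketch: overlapping windows, log base 2, and
    windows with non-ACGT characters ignored.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    mono = Counter(b for b in seq if b in alphabet)
    n_mono = sum(mono.values())
    tri = Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= alphabet
    )
    n_tri = sum(tri.values())
    if n_tri == 0:
        return 0.0
    h = 0.0
    for t, c in tri.items():
        p_obs = c / n_tri
        # Expected frequency if positions were independent.
        p_exp = math.prod(mono[b] / n_mono for b in t)
        h += p_obs * math.log2(p_obs / p_exp)
    return h
```

    A sequence whose trinucleotides are exactly what base composition predicts scores near zero, while strongly structured sequences score higher.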

  11. An archaeal genomic signature

    NASA Technical Reports Server (NTRS)

    Graham, D. E.; Overbeek, R.; Olsen, G. J.; Woese, C. R.

    2000-01-01

    Comparisons of complete genome sequences allow the most objective and comprehensive descriptions possible of a lineage's evolution. This communication uses the completed genomes from four major euryarchaeal taxa to define a genomic signature for the Euryarchaeota and, by extension, the Archaea as a whole. The signature is defined in terms of the set of protein-encoding genes found in at least two diverse members of the euryarchaeal taxa that function uniquely within the Archaea; most signature proteins have no recognizable bacterial or eukaryal homologs. By this definition, 351 clusters of signature proteins have been identified. Functions of most proteins in this signature set are currently unknown. At least 70% of the clusters that contain proteins from all the euryarchaeal genomes also have crenarchaeal homologs. This conservative set, which appears refractory to horizontal gene transfer to the Bacteria or the Eukarya, would seem to reflect the significant innovations that were unique and fundamental to the archaeal "design fabric." Genomic protein signature analysis methods may be extended to characterize the evolution of any phylogenetically defined lineage. The complete set of protein clusters for the archaeal genomic signature is presented as supplementary material (see the PNAS web site, www.pnas.org).

  12. Hamiltonella defensa, genome evolution of protective bacterial endosymbiont from pathogenic ancestors.

    PubMed

    Degnan, Patrick H; Yu, Yeisoo; Sisneros, Nicholas; Wing, Rod A; Moran, Nancy A

    2009-06-02

    Eukaryotes engage in a multitude of beneficial and deleterious interactions with bacteria. Hamiltonella defensa, an endosymbiont of aphids and other sap-feeding insects, protects its aphid host from attack by parasitoid wasps. Thus H. defensa is only conditionally beneficial to hosts, unlike ancient nutritional symbionts, such as Buchnera, that are obligate. Similar to pathogenic bacteria, H. defensa is able to invade naive hosts and circumvent host immune responses. We have sequenced the genome of H. defensa to identify possible mechanisms that underlie its persistence in healthy aphids and protection from parasitoids. The 2.1-Mb genome has undergone significant reduction in size relative to its closest free-living relatives, which include Yersinia and Serratia species (4.6-5.4 Mb). Auxotrophic for 8 of the 10 essential amino acids, H. defensa is reliant upon the essential amino acids produced by Buchnera. Despite these losses, the H. defensa genome retains more genes and pathways for a variety of cell structures and processes than do obligate symbionts, such as Buchnera. Furthermore, putative pathogenicity loci, encoding type-3 secretion systems, and toxin homologs, which are absent in obligate symbionts, are abundant in the H. defensa genome, as are regulatory genes that likely control the timing of their expression. The genome is also littered with mobile DNA, including phage-derived genes, plasmids, and insertion-sequence elements, highlighting its dynamic nature and the continued role horizontal gene transfer plays in shaping it.

  13. Bacterial α2-macroglobulins: colonization factors acquired by horizontal gene transfer from the metazoan genome?

    PubMed Central

    Budd, Aidan; Blandin, Stephanie; Levashina, Elena A; Gibson, Toby J

    2004-01-01

    Background. Invasive bacteria are known to have captured and adapted eukaryotic host genes. They also readily acquire colonizing genes from other bacteria by horizontal gene transfer. Closely related species such as Helicobacter pylori and Helicobacter hepaticus, which exploit different host tissues, share almost none of their colonization genes. The protease inhibitor α2-macroglobulin provides a major metazoan defense against invasive bacteria, trapping attacking proteases required by parasites for successful invasion. Results. Database searches with metazoan α2-macroglobulin sequences revealed homologous sequences in bacterial proteomes. The bacterial α2-macroglobulin phylogenetic distribution is patchy and violates the vertical descent model. Bacterial α2-macroglobulin genes are found in diverse clades, including purple bacteria (proteobacteria), fusobacteria, spirochetes, bacteroidetes, deinococcids, cyanobacteria, planctomycetes and thermotogae. Most bacterial species with bacterial α2-macroglobulin genes exploit higher eukaryotes (multicellular plants and animals) as hosts. Both pathogenically invasive and saprophytically colonizing species possess bacterial α2-macroglobulins, indicating that bacterial α2-macroglobulin is a colonization rather than a virulence factor. Conclusions. Metazoan α2-macroglobulins inhibit proteases of pathogens. The bacterial homologs may function in reverse to block host antimicrobial defenses. α2-macroglobulin was probably acquired one or more times from metazoan hosts and has then spread widely through other colonizing bacterial species by more than 10 independent horizontal gene transfers. yfhM-like bacterial α2-macroglobulin genes are often found tightly linked with pbpC, encoding an atypical peptidoglycan transglycosylase, PBP1C, that does not function in vegetative peptidoglycan synthesis. We suggest that YfhM and PBP1C are coupled together as a periplasmic defense and repair system. Bacterial α2-macroglobulins might

  14. Deppdb--DNA electrostatic potential properties database: electrostatic properties of genome DNA.

    PubMed

    Osypov, Alexander A; Krutinin, Gleb G; Kamzolova, Svetlana G

    2010-06-01

    The electrostatic properties of genome DNA influence its interactions with different proteins, in particular, the regulation of transcription by RNA-polymerases. DEPPDB--DNA Electrostatic Potential Properties Database--was developed to hold and provide all available information on the electrostatic properties of genome DNA combined with its sequence and annotation of biological and structural properties of genome elements and whole genomes. Genomes in DEPPDB are organized on a taxonomical basis. Currently, the database contains all the completely sequenced bacterial and viral genomes according to NCBI RefSeq. General properties of the genome DNA electrostatic potential profile and principles of its formation are revealed. This potential correlates with the GC content but does not correspond to it exactly and strongly depends on both the sequence arrangement and its context (flanking regions). Analysis of the promoter regions for bacterial and viral RNA polymerases revealed a correspondence between the scale of these proteins' physical properties and electrostatic profile patterns. We also discovered a direct correlation between the potential value and the binding frequency of RNA polymerase to DNA, supporting the idea of the role of electrostatics in these interactions. This matches a pronounced tendency of the promoter regions to possess higher values of the electrostatic potential.

  15. Bacterial Sepsis in Patients with Visceral Leishmaniasis in Northwest Ethiopia

    PubMed Central

    Takele, Yegnasew; Woldeyohannes, Desalegn; Tiruneh, Moges; Mohammed, Rezika; Lynen, Lutgarde; van Griensven, Johan

    2014-01-01

    Background and Objectives. Visceral leishmaniasis (VL) is one of the neglected diseases affecting the poorest segment of world populations. Sepsis is one of the predictors for death of patients with VL. This study aimed to assess the prevalence and factors associated with bacterial sepsis, causative agents, and their antimicrobial susceptibility patterns among patients with VL. Methods. A cross-sectional study was conducted among parasitologically confirmed VL patients suspected of sepsis admitted to the University of Gondar Hospital, Northwest Ethiopia, from February 2012 to May 2012. Blood cultures and other clinical samples were collected and cultured following the standard procedures. Results. Among 83 sepsis suspected VL patients 16 (19.3%) had culture confirmed bacterial sepsis. The most frequently isolated organism was Staphylococcus aureus (68.8%; 11/16), including two methicillin-resistant isolates (MRSA). Patients with focal bacterial infection were more likely to have bacterial sepsis (P < 0.001). Conclusions. The prevalence of culture confirmed bacterial sepsis was high, predominantly due to S. aureus. Concurrent focal bacterial infection was associated with bacterial sepsis, suggesting that focal infections could serve as sources for bacterial sepsis among VL patients. Careful clinical evaluation for focal infections and prompt initiation of empiric antibiotic treatment appears warranted in VL patients. PMID:24895569

  16. PATRIC: the Comprehensive Bacterial Bioinformatics Resource with a Focus on Human Pathogenic Species

    PubMed Central

    Gillespie, Joseph J.; Wattam, Alice R.; Cammer, Stephen A.; Gabbard, Joseph L.; Shukla, Maulik P.; Dalay, Oral; Driscoll, Timothy; Hix, Deborah; Mane, Shrinivasrao P.; Mao, Chunhong; Nordberg, Eric K.; Scott, Mark; Schulman, Julie R.; Snyder, Eric E.; Sullivan, Daniel E.; Wang, Chunxia; Warren, Andrew; Williams, Kelly P.; Xue, Tian; Seung Yoo, Hyun; Zhang, Chengdong; Zhang, Yan; Will, Rebecca; Kenyon, Ronald W.; Sobral, Bruno W.

    2011-01-01

    Funded by the National Institute of Allergy and Infectious Diseases, the Pathosystems Resource Integration Center (PATRIC) is a genomics-centric relational database and bioinformatics resource designed to assist scientists in infectious-disease research. Specifically, PATRIC provides scientists with (i) a comprehensive bacterial genomics database, (ii) a plethora of associated data relevant to genomic analysis, and (iii) an extensive suite of computational tools and platforms for bioinformatics analysis. While the primary aim of PATRIC is to advance the knowledge underlying the biology of human pathogens, all publicly available genome-scale data for bacteria are compiled and continually updated, thereby enabling comparative analyses to reveal the basis for differences between infectious free-living and commensal species. Herein we summarize the major features available at PATRIC, dividing the resources into two major categories: (i) organisms, genomes, and comparative genomics and (ii) recurrent integration of community-derived associated data. Additionally, we present two experimental designs typical of bacterial genomics research and report on the execution of both projects using only PATRIC data and tools. These applications encompass a broad range of the data and analysis tools available, illustrating practical uses of PATRIC for the biologist. Finally, a summary of PATRIC's outreach activities, collaborative endeavors, and future research directions is provided. PMID:21896772

  17. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes.

    PubMed

    Zhu, Huaiqiu; Hu, Gang-Qing; Yang, Yi-Fan; Wang, Jin; She, Zhen-Su

    2007-03-16

    Despite remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. This paper describes a new prokaryotic gene-finding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Results of extensive tests show that MED 2.0 achieves competitively high performance in gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  18. Automated multiplex genome-scale engineering in yeast

    PubMed Central

    Si, Tong; Chao, Ran; Min, Yuhao; Wu, Yuying; Ren, Wen; Zhao, Huimin

    2017-01-01

    Genome-scale engineering is indispensable in understanding and engineering microorganisms, but the current tools are mainly limited to bacterial systems. Here we report an automated platform for multiplex genome-scale engineering in Saccharomyces cerevisiae, an important eukaryotic model and widely used microbial cell factory. Standardized genetic parts encoding overexpression and knockdown mutations of >90% yeast genes are created in a single step from a full-length cDNA library. With the aid of CRISPR-Cas, these genetic parts are iteratively integrated into the repetitive genomic sequences in a modular manner using robotic automation. This system allows functional mapping and multiplex optimization on a genome scale for diverse phenotypes including cellulase expression, isobutanol production, glycerol utilization and acetic acid tolerance, and may greatly accelerate future genome-scale engineering endeavours in yeast. PMID:28469255

  19. Viral dark matter and virus–host interactions resolved from publicly available microbial genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roux, Simon; Hallam, Steven J.; Woyke, Tanja

    The ecological importance of viruses is now widely recognized, yet our limited knowledge of viral sequence space and virus–host interactions precludes accurate prediction of their roles and impacts. In this study, we mined publicly available bacterial and archaeal genomic data sets to identify 12,498 high-confidence viral genomes linked to their microbial hosts. These data augment public data sets 10-fold, provide first viral sequences for 13 new bacterial phyla including ecologically abundant phyla, and help taxonomically identify 7–38% of ‘unknown’ sequence space in viromes. Genome- and network-based classification was largely consistent with accepted viral taxonomy and suggested that (i) 264 new viral genera were identified (doubling known genera) and (ii) cross-taxon genomic recombination is limited. Further analyses provided empirical data on extrachromosomal prophages and coinfection prevalences, as well as evaluation of in silico virus–host linkage predictions. Together these findings illustrate the value of mining viral signal from microbial genomes.

  20. Viral dark matter and virus-host interactions resolved from publicly available microbial genomes.

    PubMed

    Roux, Simon; Hallam, Steven J; Woyke, Tanja; Sullivan, Matthew B

    2015-07-22

    The ecological importance of viruses is now widely recognized, yet our limited knowledge of viral sequence space and virus-host interactions precludes accurate prediction of their roles and impacts. In this study, we mined publicly available bacterial and archaeal genomic data sets to identify 12,498 high-confidence viral genomes linked to their microbial hosts. These data augment public data sets 10-fold, provide first viral sequences for 13 new bacterial phyla including ecologically abundant phyla, and help taxonomically identify 7-38% of 'unknown' sequence space in viromes. Genome- and network-based classification was largely consistent with accepted viral taxonomy and suggested that (i) 264 new viral genera were identified (doubling known genera) and (ii) cross-taxon genomic recombination is limited. Further analyses provided empirical data on extrachromosomal prophages and coinfection prevalences, as well as evaluation of in silico virus-host linkage predictions. Together these findings illustrate the value of mining viral signal from microbial genomes.