Science.gov

Sample records for bacterial genome scale

  1. Genome-scale models of bacterial metabolism: reconstruction and applications

    PubMed Central

    Durot, Maxime; Bourguignon, Pierre-Yves; Schachter, Vincent

    2009-01-01

    Genome-scale metabolic models bridge the gap between genome-derived biochemical information and metabolic phenotypes in a principled manner, providing a solid interpretative framework for experimental data related to metabolic states, and enabling simple in silico experiments with whole-cell metabolism. Models have been reconstructed for almost 20 bacterial species, so far mainly through expert curation efforts integrating information from the literature with genome annotation. A wide variety of computational methods exploiting metabolic models have been developed and applied to bacteria, yielding valuable insights into bacterial metabolism and evolution, and providing a sound basis for computer-assisted design in metabolic engineering. Recent advances in computational systems biology and high-throughput experimental technologies pave the way for the systematic reconstruction of metabolic models from genomes of new species, and a corresponding expansion of the scope of their applications. In this review, we provide an introduction to the key ideas of metabolic modeling, survey the methods and resources that enable model reconstruction and refinement, and chart applications to the investigation of global properties of metabolic systems, the interpretation of experimental results, and the re-engineering of their biochemical capabilities. PMID:19067749

  2. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes.

    PubMed

    Iranzo, Jaime; Gómez, Manuel J; López de Saro, Francisco J; Manrubia, Susanna

    2014-06-01

    Insertion sequences (IS) are the simplest and most abundant form of transposable DNA found in bacterial genomes. When present in multiple copies, they are thought to promote genomic plasticity and genetic exchange, and thus to be a major force of evolutionary change. The main processes that determine IS content in genomes are, however, a matter of debate. In this work, we take advantage of the large amount of genomic data currently available and study the abundance distributions of 33 IS families in 1811 bacterial chromosomes. This allows us to test simple models of IS dynamics and estimate their key parameters by means of a maximum likelihood approach. We evaluate the roles played by duplication, lateral gene transfer, deletion and purifying selection. We find that the observed IS abundances are compatible with a neutral scenario where IS proliferation is controlled by deletions instead of purifying selection. Even if there may be some cases driven by selection, neutral behavior dominates over large evolutionary scales. According to this view, IS and hosts tend to coexist in a dynamic equilibrium state for most of the time. Our approach also allows for the detection of recent IS expansions, and supports the hypothesis that rapid expansions constitute transient events (punctuations) during which the state of coexistence of IS and host becomes perturbed. PMID:24967627

  3. Bacterial Genome Instability

    PubMed Central

    Darmon, Elise

    2014-01-01

    Bacterial genomes are remarkably stable from one generation to the next but are plastic on an evolutionary time scale, substantially shaped by horizontal gene transfer, genome rearrangement, and the activities of mobile DNA elements. This implies the existence of a delicate balance between the maintenance of genome stability and the tolerance of genome instability. In this review, we describe the specialized genetic elements and the endogenous processes that contribute to genome instability. We then discuss the consequences of genome instability at the physiological level, where cells have harnessed instability to mediate phase and antigenic variation, and at the evolutionary level, where horizontal gene transfer has played an important role. Indeed, this ability to share DNA sequences has played a major part in the evolution of life on Earth. The evolutionary plasticity of bacterial genomes, coupled with the vast numbers of bacteria on the planet, substantially limits our ability to control disease. PMID:24600039

  4. Genome scale patterns of supercoiling in a bacterial chromosome

    PubMed Central

    Lal, Avantika; Dhar, Amlanjyoti; Trostel, Andrei; Kouzine, Fedor; Seshasayee, Aswin S. N.; Adhya, Sankar

    2016-01-01

    DNA in bacterial cells primarily exists in a negatively supercoiled state. The extent of supercoiling differs between regions of the chromosome, changes in response to external conditions and regulates gene expression. Here we report the use of trimethylpsoralen intercalation to map the extent of supercoiling across the Escherichia coli chromosome during exponential and stationary growth phases. We find that stationary phase E. coli cells display a gradient of negative supercoiling, with the terminus being more negatively supercoiled than the origin of replication, and that such a gradient is absent in exponentially growing cells. This stationary phase pattern is correlated with the binding of the nucleoid-associated protein HU, and we show that it is lost in an HU deletion strain. We suggest that HU establishes higher supercoiling near the terminus of the chromosome during stationary phase, whereas during exponential growth DNA gyrase and/or transcription equalizes supercoiling across the chromosome. PMID:27025941

  5. Targeted Large-Scale Deletion of Bacterial Genomes Using CRISPR-Nickases

    PubMed Central

    2015-01-01

    Programmable CRISPR-Cas systems have augmented our ability to produce precise genome manipulations. Here we demonstrate and characterize the ability of CRISPR-Cas derived nickases to direct targeted recombination of both small and large genomic regions flanked by repetitive elements in Escherichia coli. While CRISPR directed double-stranded DNA breaks are highly lethal in many bacteria, we show that CRISPR-guided nickase systems can be programmed to make precise, nonlethal, single-stranded incisions in targeted genomic regions. This induces recombination events and leads to targeted deletion. We demonstrate that dual-targeted nicking enables deletion of 36 and 97 Kb of the genome. Furthermore, multiplex targeting enables deletion of 133 Kb, accounting for approximately 3% of the entire E. coli genome. This technology provides a framework for methods to manipulate bacterial genomes using CRISPR-nickase systems. We envision this system working synergistically with preexisting bacterial genome engineering methods. PMID:26451892
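
    The "approximately 3%" figure in this abstract can be checked with a short calculation. The sketch below is only illustrative: the abstract does not state which strain or genome length was used, so the reference length of E. coli K-12 MG1655 (4,641,652 bp) is an assumption, and `deleted_fraction` is a hypothetical helper name.

```python
# Assumption: genome length of E. coli K-12 MG1655 in base pairs.
# The abstract does not say which strain or genome size the authors used.
GENOME_BP = 4_641_652


def deleted_fraction(deleted_kb: float, genome_bp: int = GENOME_BP) -> float:
    """Return the deleted share of the genome as a fraction (1 kb = 1000 bp)."""
    return deleted_kb * 1000 / genome_bp


# Deletion sizes reported in the abstract, in kilobases.
for kb in (36, 97, 133):
    print(f"{kb} kb is {deleted_fraction(kb):.1%} of the assumed genome")
```

    Under that assumed genome length, the 133 kb multiplex deletion comes to roughly 2.9% of the chromosome, which is consistent with the rounded figure quoted above.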

  6. Genome-scale quantitative characterization of bacterial protein localization dynamics throughout the cell cycle

    PubMed Central

    Kuwada, Nathan J; Traxler, Beth; Wiggins, Paul A

    2015-01-01

    Bacterial cells display both spatial and temporal organization, and this complex structure is known to play a central role in cellular function. Although nearly one-fifth of all proteins in Escherichia coli localize to specific subcellular locations, fundamental questions remain about how cellular-scale structure is encoded at the level of molecular-scale interactions. One significant limitation to our understanding is that the localization behavior of only a small subset of proteins has been characterized in detail. As an essential step toward a global model of protein localization in bacteria, we capture and quantitatively analyze spatial and temporal protein localization patterns throughout the cell cycle for nearly every protein in E. coli that exhibits nondiffuse localization. This genome-scale analysis reveals significant complexity in patterning, notably in the behavior of DNA-binding proteins. Complete cell-cycle imaging also facilitates analysis of protein partitioning to daughter cells at division, revealing a broad and robust assortment of asymmetric partitioning behaviors. PMID:25353361

  7. Biofilm Formation Mechanisms of Pseudomonas aeruginosa Predicted via Genome-Scale Kinetic Models of Bacterial Metabolism

    PubMed Central

    Vital-Lopez, Francisco G.; Reifman, Jaques; Wallqvist, Anders

    2015-01-01

    A hallmark of Pseudomonas aeruginosa is its ability to establish biofilm-based infections that are difficult to eradicate. Biofilms are less susceptible to host inflammatory and immune responses and have higher antibiotic tolerance than free-living planktonic cells. Developing treatments against biofilms requires an understanding of bacterial biofilm-specific physiological traits. Research efforts have started to elucidate the intricate mechanisms underlying biofilm development. However, many aspects of these mechanisms are still poorly understood. Here, we addressed questions regarding biofilm metabolism using a genome-scale kinetic model of the P. aeruginosa metabolic network and gene expression profiles. Specifically, we computed metabolite concentration differences between known mutants with altered biofilm formation and the wild-type strain to predict drug targets against P. aeruginosa biofilms. We also simulated the altered metabolism driven by gene expression changes between biofilm and stationary growth-phase planktonic cultures. Our analysis suggests that the synthesis of important biofilm-related molecules, such as the quorum-sensing molecule Pseudomonas quinolone signal and the exopolysaccharide Psl, is regulated not only through the expression of genes in their own synthesis pathway, but also through the biofilm-specific expression of genes in pathways competing for precursors to these molecules. Finally, we investigated why mutants defective in anthranilate degradation have an impaired ability to form biofilms. Alternative to a previous hypothesis that this biofilm reduction is caused by a decrease in energy production, we proposed that the dysregulation of the synthesis of secondary metabolites derived from anthranilate and chorismate is what impaired the biofilms of these mutants. Notably, these insights generated through our kinetic model-based approach are not accessible from previous constraint-based model analyses of P. aeruginosa biofilm

  8. Consistency Analysis of Genome-Scale Models of Bacterial Metabolism: A Metamodel Approach

    PubMed Central

    Ponce-de-Leon, Miguel; Calle-Espinosa, Jorge; Peretó, Juli; Montero, Francisco

    2015-01-01

    Genome-scale metabolic models usually contain inconsistencies that manifest as blocked reactions and gap metabolites. To detect recurrent inconsistencies in metabolic models, a large-scale analysis was performed using a previously published dataset of 130 genome-scale models. The results showed that a large number of reactions (~22%) are blocked in all the models where they are present. To unravel the nature of such inconsistencies, a metamodel was constructed by joining the 130 models in a single network. This metamodel was manually curated using the unconnected modules approach, and then used as a reference network to perform gap-filling on each individual genome-scale model. Finally, a set of 36 models that had not been considered during the construction of the metamodel was used, as a proof of concept, to extend the metamodel with new biochemical information and to assess its impact on gap-filling results. The analysis performed on the metamodel allowed us to conclude that: 1) the recurrent inconsistencies found in the models were already present in the metabolic database used during the reconstruction process; 2) the presence of inconsistencies in a metabolic database can be propagated to the reconstructed models; 3) there are reactions not manifested as blocked which are active as a consequence of some classes of artifacts; and 4) the results of an automatic gap-filling are highly dependent on the consistency and completeness of the metamodel or metabolic database used as the reference network. In conclusion, consistency analysis should be applied to metabolic databases in order to detect and fill gaps as well as to detect and remove artifacts and redundant information. PMID:26629901
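
    To make the terms "gap metabolite" and "blocked reaction" concrete, here is a minimal sketch of a purely topological check. It is not the unconnected modules approach or the gap-filling procedure used in the paper, and it is weaker than flux-based consistency checks: it assumes every reaction is irreversible, treats a metabolite as a gap when it is only produced or only consumed and is not exchanged with the environment, and removes reactions touching a gap until nothing changes. The function name, the toy network and the metabolite names are all hypothetical.

```python
def find_gaps(reactions, exchanged=frozenset()):
    """Topological heuristic for gap metabolites and blocked reactions.

    reactions: {name: (substrates, products)}, all assumed irreversible.
    exchanged: metabolites that may enter or leave the system freely.
    Returns (gap_metabolites, blocked_reactions) as two sets.
    """
    active = dict(reactions)
    gaps, blocked = set(), set()
    while True:
        produced, consumed = set(), set()
        for substrates, products in active.values():
            consumed.update(substrates)
            produced.update(products)
        # Only-produced or only-consumed metabolites cannot carry steady-state flux.
        dead = {m for m in produced ^ consumed if m not in exchanged}
        if not dead:
            return gaps, blocked
        gaps |= dead
        # Any reaction touching a dead-end metabolite is blocked; drop it and repeat.
        for name, (substrates, products) in list(active.items()):
            if dead & (set(substrates) | set(products)):
                blocked.add(name)
                del active[name]


# Toy network (hypothetical): R3 makes "x", which nothing consumes,
# and R4 needs "y", which nothing produces.
reactions = {
    "R1": (["glc"], ["g6p"]),
    "R2": (["g6p"], ["pyr"]),
    "R3": (["g6p"], ["x"]),
    "R4": (["y"], ["pyr"]),
}
exchanged = {"glc", "pyr"}
gaps, blocked = find_gaps(reactions, exchanged)
print(sorted(gaps), sorted(blocked))
```

    In the toy network, "x" and "y" are reported as gap metabolites and R3 and R4 as blocked, while R1 and R2 remain connected from uptake to secretion.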

  9. Analysis of Genome-scale Expression Network in Four Major Bacterial Residents of Cystic Fibrosis Lung

    PubMed Central

    Hosseinkhan, Nazanin; Zarrineh, Peyman; Masoudi-Nejad, Ali

    2014-01-01

    In polymicrobial communities where several species co-exist in a certain niche and consequently the possibility of interactions among species is very high, gene expression data sources can give better insights into the underlying adaptation mechanisms adopted by bacteria. Furthermore, several possible synergistic or antagonistic interactions among species can be investigated through gene expression comparisons. The lung is one of the habitats harboring several distinct pathogens during severe pulmonary disorders such as chronic obstructive pulmonary disease (COPD) and cystic fibrosis (CF). Expression data analysis of these lung residents can help to gain a better understanding of how these species interact with each other within the host cells. The first part of this paper introduces the available data sources for the major bacteria responsible for causing lung diseases and their genomic relations. The second part focuses on studies concerning gene expression analyses of these species. PMID:25435803

  10. Insights from twenty years of bacterial genome sequencing

    SciTech Connect

    Land, Miriam L; Hauser, Loren John; Jun, Se Ran; Nookaew, Intawat; Leuze, Michael Rex; Ahn, Tae-Hyuk; Karpinets, Tatiana V; Lund, Ole; Kora, Guruprasad H; Wassenaar, Trudy; Poudel, Suresh; Ussery, David W

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome
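
    The pan- and core genome calculation mentioned in this abstract reduces, once genes have been grouped into families, to a set union and a set intersection. The sketch below assumes that grouping has already been done (in practice it requires sequence-similarity clustering, which is not shown); the function name, genome labels and family identifiers are hypothetical toy values, not data from the review.

```python
def pan_and_core(genomes):
    """Compute pan- and core genomes from per-genome gene-family sets.

    genomes: {genome_id: set of gene-family identifiers}.
    Returns (pan, core): families found in any genome, and in every genome.
    """
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)          # present in at least one genome
    core = set.intersection(*family_sets)    # present in all genomes
    return pan, core


# Toy input (hypothetical): three genomes sharing two gene families.
toy = {
    "genome_A": {"f1", "f2", "f3"},
    "genome_B": {"f1", "f2", "f4"},
    "genome_C": {"f1", "f2", "f5", "f6"},
}
pan, core = pan_and_core(toy)
print(len(pan), "pan families;", len(core), "core families")
```

    As more genomes are added, the core can only shrink or stay the same and the pan genome can only grow or stay the same, which matches the pattern of a small core alongside a very large total number of gene families reported for E. coli above.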

  11. Insights from 20 years of bacterial genome sequencing.

    PubMed

    Land, Miriam; Hauser, Loren; Jun, Se-Ran; Nookaew, Intawat; Leuze, Michael R; Ahn, Tae-Hyuk; Karpinets, Tatiana; Lund, Ole; Kora, Guruprased; Wassenaar, Trudy; Poudel, Suresh; Ussery, David W

    2015-03-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome

  12. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries.

    PubMed

    Binnewies, Tim T; Motro, Yair; Hallin, Peter F; Lund, Ole; Dunn, David; La, Tom; Hampson, David J; Bellgard, Matthew; Wassenaar, Trudy M; Ussery, David W

    2006-07-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. This review addresses a relatively straightforward question: "What have we learned from this vast amount of new genomic data?" Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity, even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism is important for interpretation of the genomic data.
The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this

  13. Robustness assessment of whole bacterial genome segmentations.

    PubMed

    Devillers, Hugo; Chiapello, Hélène; Schbath, Sophie; Karoui, Meriem El

    2011-09-01

    Comparison of closely related bacterial genomes has revealed the presence of highly conserved sequences forming a "backbone" that is interrupted by numerous, less conserved, DNA fragments. Segmentation of bacterial genomes into backbone and variable regions is particularly useful to investigate, among other things, bacterial genome evolution. Several software tools have been designed to compare complete bacterial chromosomes and a few online databases store pre-computed genome comparisons. However, very few statistical methods are available to evaluate the reliability of these software tools and to compare the results obtained with them. To fill this gap, we have developed two local scores to measure the robustness of bacterial genome segmentations. Our method uses a simulation procedure based on random perturbations of the compared genomes. The two scores described in this article provide useful information and are easy to implement, and their interpretation is intuitive. We show that they are suited to discriminate between robust and non-robust segmentations when genome aligners such as MAUVE and MGA are used. PMID:21899422

  14. Bacterial Communities: Interactions to Scale.

    PubMed

    Stubbendieck, Reed M; Vargas-Bautista, Carol; Straight, Paul D

    2016-01-01

    In the environment, bacteria live in complex multispecies communities. These communities span in scale from small, multicellular aggregates to billions or trillions of cells within the gastrointestinal tract of animals. The dynamics of bacterial communities are determined by pairwise interactions that occur between different species in the community. Though interactions occur between a few cells at a time, the outcomes of these interchanges have ramifications that ripple through many orders of magnitude, and ultimately affect the macroscopic world including the health of host organisms. In this review we cover how bacterial competition influences the structures of bacterial communities. We also emphasize methods and insights garnered from culture-dependent pairwise interaction studies, metagenomic analyses, and modeling experiments. Finally, we argue that the integration of multiple approaches will be instrumental to future understanding of the underlying dynamics of bacterial communities. PMID:27551280

  15. Bacterial Communities: Interactions to Scale

    PubMed Central

    Stubbendieck, Reed M.; Vargas-Bautista, Carol; Straight, Paul D.

    2016-01-01

    In the environment, bacteria live in complex multispecies communities. These communities span in scale from small, multicellular aggregates to billions or trillions of cells within the gastrointestinal tract of animals. The dynamics of bacterial communities are determined by pairwise interactions that occur between different species in the community. Though interactions occur between a few cells at a time, the outcomes of these interchanges have ramifications that ripple through many orders of magnitude, and ultimately affect the macroscopic world including the health of host organisms. In this review we cover how bacterial competition influences the structures of bacterial communities. We also emphasize methods and insights garnered from culture-dependent pairwise interaction studies, metagenomic analyses, and modeling experiments. Finally, we argue that the integration of multiple approaches will be instrumental to future understanding of the underlying dynamics of bacterial communities. PMID:27551280

  16. Microbial minimalism: genome reduction in bacterial pathogens.

    PubMed

    Moran, Nancy A

    2002-03-01

    When bacterial lineages make the transition from free-living or facultatively parasitic life cycles to permanent associations with hosts, they undergo a major loss of genes and DNA. Complete genome sequences are providing an understanding of how extreme genome reduction affects evolutionary directions and metabolic capabilities of obligate pathogens and symbionts. PMID:11893328

  17. Value of a newly sequenced bacterial genome.

    PubMed

    Barbosa, Eudes Gv; Aburjaile, Flavia F; Ramos, Rommel Tj; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

    2014-05-26

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

  18. Bacterial chromatin: converging views at different scales.

    PubMed

    Dame, Remus T; Tark-Dame, Mariliis

    2016-06-01

    Bacterial genomes are functionally organized and compactly folded into a structure referred to as bacterial chromatin or the nucleoid. An important role in genome folding is attributed to Nucleoid-Associated Proteins, also referred to as bacterial chromatin proteins. Although a lot of molecular insight in the mechanisms of operation of these proteins has been generated in the test tube, knowledge on genome organization in the cellular context is still lagging behind severely. Here, we discuss important advances in the understanding of three-dimensional genome organization due to the application of Chromosome Conformation Capture and super-resolution microscopy techniques. We focus on bacterial chromatin proteins whose proposed role in genome organization is supported by these approaches. Moreover, we discuss recent insights into the interrelationship between genome organization and genome activity/stability in bacteria. PMID:26942688

  19. Dynamics of Genome Rearrangement in Bacterial Populations

    PubMed Central

    Darling, Aaron E.; Miklós, István; Ragan, Mark A.

    2008-01-01

    first characterization of genome arrangement evolution in a bacterial population evolving outside laboratory conditions. Insight into the process of genomic rearrangement may further the understanding of pathogen population dynamics and selection on the architecture of circular bacterial chromosomes. PMID:18650965

  20. One Bacterial Cell, One Complete Genome

    SciTech Connect

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos; Clum, Alicia; Copeland, Alex; Schackwitz, Wendy; Lapidus, Alla; Wu, Dongying; McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A.; Bristow, James; Cheng, Jan-Fang

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to determine that this bacterium is polyploid, with genome copies ranging from approximately 200 to 900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  1. Integrons: natural tools for bacterial genome evolution.

    PubMed

    Rowe-Magnus, D A; Mazel, D

    2001-10-01

    Integrons were first identified as the primary mechanism for antibiotic resistance gene capture and dissemination among Gram-negative bacteria. More recently, their role in genome evolution has been extended with the discovery of larger integron structures, the super-integrons, as genuine components of the genomes of many species throughout the gamma-proteobacterial radiation. The functional platforms of these integrons appear to be sedentary, whereas their gene cassette contents are highly variable. Nevertheless, the gene cassettes for which an activity has been experimentally demonstrated encode proteins related to simple adaptive functions and their recruitment is seen as providing the bacterial host with a selective advantage. The widespread occurrence of the integron system among Gram-negative bacteria is discussed, with special focus on the super-integrons. Some of the adaptive functions encoded by these genes are also reviewed, and implications of integron-mediated genome evolution in the emergence of novel bacterial species are highlighted. PMID:11587934

  2. Persistence drives gene clustering in bacterial genomes

    PubMed Central

    Fang, Gang; Rocha, Eduardo PC; Danchin, Antoine

    2008-01-01

    Background: Gene clustering plays an important role in the organization of the bacterial chromosome and several mechanisms have been proposed to explain its extent. However, the controversies raised about the validity of each of these mechanisms remind us that the cause of this gene organization remains an open question. Models proposed to explain clustering did not take into account the function of the gene products nor the likely presence or absence of a given gene in a genome. However, genomes harbor two very different categories of genes: those genes present in a majority of organisms (persistent genes) and those present in very few organisms (rare genes). Results: We show that two classes of genes are significantly clustered in bacterial genomes: the highly persistent and the rare genes. The clustering of rare genes is readily explained by the selfish operon theory. Yet, genes persistently present in bacterial genomes are also clustered and we try to understand why. We propose a model accounting specifically for such clustering, and show that indispensability in a genome with frequent gene deletion and insertion leads to the transient clustering of these genes. The model describes how clusters are created via the gene flux that continuously introduces new genes while deleting others. We then test if known selective processes, such as co-transcription, physical interaction or functional neighborhood, account for the stabilization of these clusters. Conclusion: We show that it is the strong selective pressure acting on the function of persistent genes, in a permanent state of gene flux that keeps the size of bacterial genomes fairly constant, that drives the clustering of persistent genes. A further selective stabilization process might contribute to maintaining the clustering. PMID:18179692

  3. Insights from Genomics into Bacterial Pathogen Populations

    PubMed Central

    Wilson, Daniel J.

    2012-01-01

    Bacterial pathogens impose a heavy burden of disease on human populations worldwide. The gravest threats are posed by highly virulent respiratory pathogens, enteric pathogens, and HIV-associated infections. Tuberculosis alone is responsible for the deaths of 1.5 million people annually. Treatment options for bacterial pathogens are being steadily eroded by the evolution and spread of drug resistance. However, population-level whole genome sequencing offers new hope in the fight against pathogenic bacteria. By providing insights into bacterial evolution and disease etiology, these approaches pave the way for novel interventions and therapeutic targets. Sequencing populations of bacteria across the whole genome provides unprecedented resolution to investigate (i) within-host evolution, (ii) transmission history, and (iii) population structure. Moreover, advances in rapid benchtop sequencing herald a new era of real-time genomics in which sequencing and analysis can be deployed within hours in response to rapidly changing public health emergencies. The purpose of this review is to highlight the transformative effect of population genomics on bacteriology, and to consider the prospects for answering abiding questions such as why bacteria cause disease. PMID:22969423

  4. Bacterial genome remodeling through bacteriophage recombination.

    PubMed

    Menouni, Rachid; Hutinet, Geoffrey; Petit, Marie-Agnès; Ansaldi, Mireille

    2015-01-01

    Bacteriophages co-exist and co-evolve with their hosts in natural environments. Virulent phages lyse infected cells through lytic cycles, whereas temperate phages often remain dormant and can undergo lysogenic or lytic cycles. In their lysogenic state, prophages are actually part of the host genome and replicate passively in rhythm with host division. However, prophages are far from being passive residents: they can modify or bring new properties to their host. In this review, we focus on two important phage-encoded recombination mechanisms, i.e. site-specific recombination and homologous recombination, and how they remodel bacterial genomes. PMID:25790500

  5. New perspectives from genomic analyses of bacterial infectious agents.

    PubMed

    Goldstone, R J; Smith, D G E

    2016-04-01

    Recent advances in the technologies for genomic sequencing and systems for handling and processing sequencing data have transformed bacterial genomics into a near-routine approach for both small- and large-scale investigations of infectious agents. Nonetheless, the application of genomics - especially larger-scale studies - to animal infectious agents lags behind its application to human pathogens, despite the growing importance of many animal species as food sources. Assiduously conducted genomic studies offer major benefits, not merely by providing a detailed understanding of infectious agents but also through the exploitation of such findings to enable more accurate diagnosis, high-resolution typing and the development of improved interventions. The use of genomics for these and other purposes is likely to grow in future years and it must be anticipated that investigation and characterisation of important animal infectious agents will also gain considerable benefits. Using mainly animal pathogens as examples - including several infectious agents listed by the World Organisation for Animal Health - this paper provides a concise summary of some recent purposes and developments in bacterial genomics analysis. PMID:27217179

  6. Genomic perspectives on the evolution and spread of bacterial pathogens

    PubMed Central

    Bentley, Stephen D.

    2015-01-01

    Since the first complete sequencing of a free-living organism, Haemophilus influenzae, genomics has been used to probe both the biology of bacterial pathogens and their evolution. Single-genome approaches provided information on the repertoire of virulence determinants and host-interaction factors, and, along with comparative analyses, allowed the proposal of hypotheses to explain the evolution of many of these traits. These analyses suggested many bacterial pathogens to be of relatively recent origin and identified genome degradation as a key aspect of host adaptation. The advent of very-high-throughput sequencing has allowed for detailed phylogenetic analysis of many important pathogens, revealing patterns of global and local spread, and recent evolution in response to pressure from therapeutics and the human immune system. Such analyses have shown that bacteria can evolve and transmit very rapidly, with emerging clones showing adaptation and global spread over years or decades. The resolution achieved with whole-genome sequencing has shown considerable benefits in clinical microbiology, enabling accurate outbreak tracking within hospitals and across continents. Continued large-scale sequencing promises many further insights into genetic determinants of drug resistance, virulence and transmission in bacterial pathogens. PMID:26702036

  7. Assessing the Robustness of Complete Bacterial Genome Segmentations

    NASA Astrophysics Data System (ADS)

    Devillers, Hugo; Chiapello, Hélène; Schbath, Sophie; El Karoui, Meriem

    Comparison of closely related bacterial genomes has revealed the presence of highly conserved sequences forming a "backbone" that is interrupted by numerous, less conserved, DNA fragments. Segmentation of bacterial genomes into backbone and variable regions is particularly useful to investigate bacterial genome evolution. Several software tools have been designed to compare complete bacterial chromosomes and a few online databases store pre-computed genome comparisons. However, very few statistical methods are available to evaluate the reliability of these software tools and to compare the results obtained with them. To fill this gap, we have developed two local scores to measure the robustness of bacterial genome segmentations. Our method uses a simulation procedure based on random perturbations of the compared genomes. The scores presented in this paper are simple to implement and our results show that they make it easy to discriminate between robust and non-robust bacterial genome segmentations when using aligners such as MAUVE and MGA.

  8. Finishing bacterial genome assemblies with Mix

    PubMed Central

    2013-01-01

    Motivation Among challenges that hamper reaping the benefits of genome assembly are both unfinished assemblies and the ensuing experimental costs. First, numerous software solutions for genome de novo assembly are available, each having its advantages and drawbacks, without clear guidelines as to how to choose among them. Second, these solutions produce draft assemblies that often require a resource intensive finishing phase. Methods In this paper we address these two aspects by developing Mix, a tool that mixes two or more draft assemblies without relying on a reference genome, with the goal of reducing contig fragmentation and thus speeding up genome finishing. The proposed algorithm builds an extension graph where vertices represent extremities of contigs and edges represent existing alignments between these extremities. These alignment edges are used for contig extension. The resulting output assembly corresponds to a set of paths in the extension graph that maximizes the cumulative contig length. Results We evaluate the performance of Mix on bacterial NGS data from the GAGE-B study and apply it to newly sequenced Mycoplasma genomes. Resulting final assemblies demonstrate a significant improvement in the overall assembly quality. In particular, Mix is consistent by providing better overall quality results even when the choice is guided solely by standard assembly statistics, as is the case for de novo projects. Availability Mix is implemented in Python and is available at https://github.com/cbib/MIX, novel data for our Mycoplasma study is available at http://services.cbib.u-bordeaux2.fr/mix/. PMID:24564706

  9. Bacterial bioinformatics: pathogenesis and the genome.

    PubMed

    Paine, Kelly; Flower, Darren R

    2002-07-01

    As the number of completed microbial genome sequences continues to grow, there is a pressing need for the exploitation of this wealth of data through a synergistic interaction between the well-established science of bacteriology and the emergent discipline of bioinformatics. Antibiotic resistance and pathogenicity in virulent bacteria has become an increasing problem, with even the strongest drugs useless against some species, such as multi-drug resistant Enterococcus faecium and Mycobacterium tuberculosis. The global spread of Human Immunodeficiency Virus (HIV) and Acquired Immune Deficiency Syndrome (AIDS) has contributed to the re-emergence of tuberculosis and the threat from new and emergent diseases. To address these problems, bacterial pathogenicity requires redefinition as Koch's postulates become obsolete. This review discusses how the use of bacterial genomic information, and the in silico tools available at present, may aid in determining the definition of a current pathogen. The combination of both fields should provide a rapid and efficient way of assisting in the future development of antimicrobial therapies. PMID:12125816

  10. Reconstruction of Bacterial and Viral Genomes from Multiple Metagenomes.

    PubMed

    Gupta, Ankit; Kumar, Sanjiv; Prasoodanan, Vishnu P K; Harish, K; Sharma, Ashok K; Sharma, Vineet K

    2016-01-01

    Several metagenomic projects have been accomplished or are in progress. However, in most cases, it is not feasible to generate complete genomic assemblies of species from the metagenomic sequencing of a complex environment. Only a few studies have reported the reconstruction of bacterial genomes from complex metagenomes. In this work, a Binning-Assembly approach is proposed and demonstrated for the reconstruction of bacterial and viral genomes from 72 human gut metagenomic datasets. A total of 1156 bacterial genomes belonging to 219 bacterial families and 279 viral genomes belonging to 84 viral families could be identified. More than 80% complete draft genome sequences could be reconstructed for a total of 126 bacterial and 11 viral genomes. Selected draft assembled genomes could be validated with 99.8% accuracy using their ORFs. The study provides useful information on the assembly expected for a species given its number of reads and abundance. This approach along with spiking was also demonstrated to be useful in improving the draft assembly of a bacterial genome. The Binning-Assembly approach can be successfully used to reconstruct bacterial and viral genomes from multiple metagenomic datasets obtained from similar environments. PMID:27148174

  11. Reconstruction of Bacterial and Viral Genomes from Multiple Metagenomes

    PubMed Central

    Gupta, Ankit; Kumar, Sanjiv; Prasoodanan, Vishnu P. K.; Harish, K.; Sharma, Ashok K.; Sharma, Vineet K.

    2016-01-01

    Several metagenomic projects have been accomplished or are in progress. However, in most cases, it is not feasible to generate complete genomic assemblies of species from the metagenomic sequencing of a complex environment. Only a few studies have reported the reconstruction of bacterial genomes from complex metagenomes. In this work, a Binning-Assembly approach is proposed and demonstrated for the reconstruction of bacterial and viral genomes from 72 human gut metagenomic datasets. A total of 1156 bacterial genomes belonging to 219 bacterial families and 279 viral genomes belonging to 84 viral families could be identified. More than 80% complete draft genome sequences could be reconstructed for a total of 126 bacterial and 11 viral genomes. Selected draft assembled genomes could be validated with 99.8% accuracy using their ORFs. The study provides useful information on the assembly expected for a species given its number of reads and abundance. This approach along with spiking was also demonstrated to be useful in improving the draft assembly of a bacterial genome. The Binning-Assembly approach can be successfully used to reconstruct bacterial and viral genomes from multiple metagenomic datasets obtained from similar environments. PMID:27148174

  12. From bacterial genome to functionality; case bifidobacteria.

    PubMed

    Ventura, Marco; O'Connell-Motherway, Mary; Leahy, Sinead; Moreno-Munoz, Jose Antonio; Fitzgerald, Gerald F; van Sinderen, Douwe

    2007-11-30

    The availability of complete bacterial genome sequences has significantly furthered our understanding of the genetics, physiology and biochemistry of the microorganisms in question, particularly those that have commercially important applications. Bifidobacteria are among such microorganisms, as they constitute mammalian commensals of biotechnological significance due to their perceived role in maintaining a balanced gastrointestinal (GIT) microflora. Bifidobacteria are therefore frequently used as health-promoting or probiotic components in functional food products. A fundamental understanding of the metabolic activities employed by these commensal bacteria, in particular their capability to utilize a wide range of complex oligosaccharides, can reveal ways to provide in vivo growth advantages relative to other competing gut bacteria or pathogens. Furthermore, an in depth analysis of adaptive responses to nutritional or environmental stresses may provide methodologies to retain viability and improve functionality during commercial preparation, storage and delivery of the probiotic organism. PMID:17629975

  13. Mining bacterial genomes for novel arylesterase activity

    PubMed Central

    Wang, Lijun; Mavisakalyan, Valentina; Tillier, Elisabeth R. M.; Clark, Greg W.; Savchenko, Alexei V.; Yakunin, Alexander F.; Master, Emma R.

    2010-01-01

    Summary One hundred and seventy‐one genes encoding potential esterases from 11 bacterial genomes were cloned and overexpressed in Escherichia coli; 74 of the clones produced soluble proteins. All 74 soluble proteins were purified and screened for esterase activity; 36 proteins showed carboxyl esterase activity on short‐chain esters, 17 demonstrated arylesterase activity, while 38 proteins did not exhibit any activity towards the test substrates. Esterases from Rhodopseudomonas palustris (RpEST‐1, RpEST‐2 and RpEST‐3), Pseudomonas putida (PpEST‐1, PpEST‐2 and PpEST‐3), Pseudomonas aeruginosa (PaEST‐1) and Streptomyces avermitilis (SavEST‐1) were selected for detailed biochemical characterization. All of the enzymes showed optimal activity at neutral or alkaline pH, and the half‐life of each enzyme at 50°C ranged from < 5 min to over 5 h. PpEST‐3, RpEST‐1 and RpEST‐2 demonstrated the highest specific activity with pNP‐esters; these enzymes were also among the most stable at 50°C and in the presence of detergents, polar and non‐polar organic solvents, and imidazolium ionic liquids. Accordingly, these enzymes are particularly interesting targets for subsequent application trials. Finally, biochemical and bioinformatic analyses were compared to reveal sequence features that could be correlated to enzymes with arylesterase activity, facilitating subsequent searches for new esterases in microbial genome sequences. PMID:21255363

  14. Bacterial epidemiology and biology - lessons from genome sequencing

    PubMed Central

    2011-01-01

    Next-generation sequencing has ushered in a new era of microbial genomics, enabling the detailed historical and geographical tracing of bacteria. This is helping to shape our understanding of bacterial evolution. PMID:22027015

  15. Holotransformations of bacterial colonies and genome cybernetics

    NASA Astrophysics Data System (ADS)

    Ben-Jacob, Eshel; Tenenbaum, Adam; Shochet, Ofer; Avidan, Orna

    1994-01-01

    We present a study of colony transformations during growth of Bacillus subtilis under adverse environmental conditions. It is a continuation of our pilot study of “Adaptive self-organization during growth of bacterial colonies” (Physica A 187 (1992) 378). First we identify and describe the transformations pathway, i.e. the excitation of the branching modes from Bacillus subtilis 168 (grown under diffusion limited conditions) and the phase transformations between the tip-splitting phase (phase T) and the chiral phase (phase C) which belong to the same mode. This pathway shows the evolution of complexity as the bacteria are exposed to adverse growth conditions. We present the morphology diagram of phases T and C as a function of agar concentration and peptone level. As expected, the growth of phase T is ramified (fractal-like or DLA-like) at low peptone level (about 1 g/l) and turns compact at high peptone level (about 10 g/l). The growth of phase C is also ramified at low peptone level and turns denser and finally compact as the peptone level increases. Generally speaking, the colonies develop more complex patterns and higher micro-level organization for more adverse environments. We use the growth velocity as a response function to describe the growth. At low agar concentration (and low peptone level) phase C grows faster than phase T, and for a high agar concentration (about 2%) phase T grows faster. We observe colony transformations between the two phases (phase transformations). They are found to be consistent with the “fastest growing morphology” selection principle adopted from azoic systems. The transformations are always from the slower phase to the faster one. Hence, we observe T→C transformations at low agar concentrations and C→T transformations at high agar concentrations. We have observed both localized and extended transformations. Usually, the transformations are localized for more adverse growth conditions, and extended for growth conditions

  16. Genome-Scale Variation of Tubeworm Symbionts

    NASA Astrophysics Data System (ADS)

    Robidart, J.; Felbeck, H.

    2005-12-01

    Hydrothermal vent tubeworms are completely dependent on their bacterial symbionts for nutrition. Despite this dependency, many studies have concluded that bacterial symbionts are acquired anew from the environment every generation, rather than through the more reliable mode of symbiont transmission from parent directly to offspring. Ribosomal 16S sequences have shown little variation of symbiont phylogeny from worm to worm, but higher resolution genome-scale analyses have found that there is genomic heterogeneity between symbionts from worms in different environments. What genes can be "spared," while resulting in an intact symbiosis? Have symbionts from one environment gained physiological capabilities that make them more fit in that environment? In order to answer these questions, subtractive hybridization was used on symbionts of Riftia pachyptila tubeworms from different environments to gain insight into which genes are present in one symbiont and absent in the other. Many genes were found to be unique to each symbiont and these results will be presented. This technique will be applied to answer many fundamental questions regarding microbial symbiont evolution to a specific physico-chemical environment, to a different host species, and more.

  17. Plastic architecture of bacterial genome revealed by comparative genomics of Photorhabdus variants

    PubMed Central

    Gaudriault, Sophie; Pages, Sylvie; Lanois, Anne; Laroui, Christine; Teyssier, Corinne; Jumas-Bilak, Estelle; Givaudan, Alain

    2008-01-01

    Background The phenotypic consequences of large genomic architecture modifications within a clonal bacterial population are rarely evaluated because of the difficulties associated with using molecular approaches in a mixed population. Bacterial variants frequently arise among Photorhabdus luminescens, a nematode-symbiotic and insect-pathogenic bacterium. We therefore studied genome plasticity within Photorhabdus variants. Results We used a combination of macrorestriction and DNA microarray experiments to perform a comparative genomic study of different P. luminescens TT01 variants. Prolonged culturing of TT01 strain and a genomic variant, collected from the laboratory-maintained symbiotic nematode, generated bacterial lineages composed of primary and secondary phenotypic variants and colonial variants. The primary phenotypic variants exhibit several characteristics that are absent from the secondary forms. We identify substantial plasticity of the genome architecture of some variants, mediated mainly by deletions in the 'flexible' gene pool of the TT01 reference genome and also by genomic amplification. We show that the primary or secondary phenotypic variant status is independent from global genomic architecture and that the bacterial lineages are genomic lineages. We focused on two unusual genomic changes: a deletion at a new recombination hotspot composed of long approximate repeats; and a 275 kilobase single block duplication belonging to a new class of genomic duplications. Conclusion Our findings demonstrate that major genomic variations occur in Photorhabdus clonal populations. The phenotypic consequences of these genomic changes are cryptic. This study provides insight into the field of bacterial genome architecture and further elucidates the role played by clonal genomic variation in bacterial genome evolution. PMID:18647395

  18. Minimum taxonomic criteria for bacterial genome sequence depositions and announcements.

    PubMed

    Bull, Matthew J; Marchesi, Julian R; Vandamme, Peter; Plummer, Sue; Mahenthiralingam, Eshwar

    2012-04-01

    Multiple bioinformatic methods are available to analyse the information encoded within the complete genome sequence of a bacterium and accurately assign its species status or nearest phylogenetic neighbour. However, it is clear that even now in what is the third decade of bacterial genomics, taxonomically incorrect genome sequence depositions are still being made. We outline a simple scheme of bioinformatic analysis and a set of minimum criteria that should be applied to all bacterial genomic data to ensure that they are accurately assigned to the species or genus level prior to database deposition. To illustrate the utility of the bioinformatic workflow, we analysed the recently deposited genome sequence of Lactobacillus acidophilus 30SC and demonstrated that this DNA was in fact derived from a strain of Lactobacillus amylovorus. Using these methods researchers can ensure that the taxonomic accuracy of genome sequence depositions is maintained within the ever increasing nucleic acid datasets. PMID:22366464

  19. LATERAL GENE TRANSFER AND THE HISTORY OF BACTERIAL GENOMES

    SciTech Connect

    Howard Ochman

    2006-02-22

    The aims of this research were to elucidate the role and extent of lateral transfer in the differentiation of bacterial strains and species, and to assess the impact of gene transfer on the evolution of bacterial genomes. The ultimate goal of the project is to examine the dynamics of a core set of protein-coding genes (i.e., those that are distributed universally among Bacteria) by developing conserved primers that would allow their amplification and sequencing in any bacterial taxon. In addition, we adopted a bioinformatic approach to elucidate the extent of lateral gene transfer in sequenced genomes.

  20. Restriction endonucleases for pulsed field mapping of bacterial genomes.

    PubMed Central

    McClelland, M; Jones, R; Patel, Y; Nelson, M

    1987-01-01

    Fundamental to many bacterial genome mapping strategies currently under development is the need to cleave the genome into a few large DNA fragments that can be resolved by pulsed field gel electrophoresis. Identification of endonucleases that infrequently cut a genome is of key importance in this process. We show that the tetranucleotide CTAG is extremely rare in most bacterial genomes with G+C contents above 45%. As a consequence, most of the sixteen bacterial genomes we have tested are cleaved less than once every 100,000 base pairs by one or more endonucleases that have CTAG in their recognition sequences: Xba I (TCTAGA), Spe I (ACTAGT), Avr II (CCTAGG) and Nhe I (GCTAGC). Similarly, CCG and CGG are the rarest trinucleotides in many genomes with G+C content of less than 45%. Thus, Sma I (CCCGGG), Rsr II (CGGWCCG), Nae I (GCCGGC) and Sac II (CCGCGG) are often suitable endonucleases for producing fragments that average over 100,000 base pairs from such genomes. Pulsed field gel electrophoresis of the fragments that result from cleavage with endonucleases that cleave only a few times per genome should assist in the physical mapping of many prokaryotic genomes. PMID:2819819
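
    The rare-cutter idea in the record above can be checked on any sequence by counting recognition sites and dividing genome length by the number of fragments. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, the genome is assumed to be a plain string of A/C/G/T, and only the recognition sequences themselves are taken from the abstract. Because all eight sites are palindromic, scanning one strand is enough.

```python
import re

# Recognition sequences quoted in the abstract above.
SITES = {
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "AvrII": "CCTAGG",
    "NheI": "GCTAGC",
    "SmaI": "CCCGGG",
    "RsrII": "CGGWCCG",  # W is the IUPAC code for A or T
    "NaeI": "GCCGGC",
    "SacII": "CCGCGG",
}

# Only the IUPAC symbols needed for the sites above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}


def site_regex(site):
    """Compile a site into a lookahead regex so overlapping hits are counted."""
    return re.compile("(?=" + "".join(IUPAC[base] for base in site) + ")")


def count_sites(genome, site):
    """Number of occurrences of `site` on the given strand of `genome`."""
    return sum(1 for _ in site_regex(site).finditer(genome.upper()))


def mean_fragment_length(genome, site, circular=True):
    """Average fragment length after complete digestion.

    A circular molecule cut n times yields n fragments; a linear one
    yields n + 1. An uncut genome is returned as a single fragment.
    """
    cuts = count_sites(genome, site)
    if cuts == 0:
        return float(len(genome))
    fragments = cuts if circular else cuts + 1
    return len(genome) / fragments


if __name__ == "__main__":
    toy = "AATCTAGATTTTCTAGAGG"  # toy sequence, not a real genome
    for enzyme, site in SITES.items():
        print(enzyme, count_sites(toy, site), mean_fragment_length(toy, site))
```

    On a real chromosome, an enzyme whose mean fragment length exceeds roughly 100,000 base pairs would be a candidate rare cutter in the sense of the abstract; that threshold comes from the text, while the choice between circular and linear topology is left to the caller.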

  1. Assessing inhomogeneities in bacterial long genomic sequences

    SciTech Connect

    Karlin, S.

    1997-12-01

    Several complete prokaryotic and eukaryotic genomes are already at hand (S. cerevisiae, H. influenzae, M. genitalium, M. jannaschii, Synechocystis sp.) and many are forthcoming (e.g., E. coli, H. pylori, C. elegans). The comparative analysis of genomes generally strives to identify genes and characterize function/structure relationships inferred mostly via amino acid sequence comparisons. We describe concisely methods for comparing genomes (or long contigs) emphasizing sequence features other than gene comparisons. These center on the following measures of genomic organization and sequence heterogeneity: (i) compositional biases of short oligonucleotides; (ii) dinucleotide relative abundance distances within and between genomes; (iii) rare and frequent word (oligonucleotide) determinations and their distributional properties; (iv) r-scan statistics assessing clustering, overdispersion, or excessive evenness of various marker arrays; and (v) characterizations of repeat structures in the genome. 20 refs., 3 figs.
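
    Measure (ii) in the record above can be illustrated with a short sketch. The abstract does not give formulas, so the definitions used here are assumptions: the relative abundance of a dinucleotide XY is taken as the odds ratio f(XY) / (f(X) f(Y)), frequencies are pooled over both strands, and the distance between two sequences is the mean absolute difference over the 16 dinucleotides. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an A/C/G/T string (other symbols pass through)."""
    return seq.translate(COMPLEMENT)[::-1]


def relative_abundance(seq, symmetrize=True):
    """Odds ratio f(XY) / (f(X) * f(Y)) for all 16 dinucleotides.

    With symmetrize=True, counts are pooled over the sequence and its
    reverse complement. Positions holding symbols other than A/C/G/T
    are skipped.
    """
    seq = seq.upper()
    strands = [seq, reverse_complement(seq)] if symmetrize else [seq]
    mono, di = Counter(), Counter()
    for strand in strands:
        for i, base in enumerate(strand):
            if base not in BASES:
                continue
            mono[base] += 1
            if i + 1 < len(strand) and strand[i + 1] in BASES:
                di[base + strand[i + 1]] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence contains no usable dinucleotides")
    rho = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono)
        # NaN marks dinucleotides whose bases never occur in the sequence.
        rho[x + y] = (di[x + y] / n_di) / expected if expected > 0 else float("nan")
    return rho


def abundance_distance(seq_a, seq_b):
    """Mean absolute difference of the 16 relative abundances."""
    rho_a, rho_b = relative_abundance(seq_a), relative_abundance(seq_b)
    return sum(abs(rho_a[k] - rho_b[k]) for k in rho_a) / 16


if __name__ == "__main__":
    a = "ACGT" * 50
    b = "AACCGGTT" * 25
    print(relative_abundance(a)["CG"], abundance_distance(a, b))
```

    Applying `abundance_distance` to windows of one genome gives a within-genome comparison, and applying it to two genomes gives a between-genome comparison; the window size is a free parameter that the abstract does not specify.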

  2. Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

    PubMed

    Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

    2015-01-01

    Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease, in little time, and at a substantially reduced cost per nucleotide, hence cost per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, there are challenges to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses. PMID:26442252

  3. Ecological and Temporal Constraints in the Evolution of Bacterial Genomes

    PubMed Central

    Boto, Luis; Martínez, Jose Luis

    2011-01-01

    Studies on the experimental evolution of microorganisms, on their in vivo evolution (mainly in the case of bacteria producing chronic infections), as well as the availability of multiple full genomic sequences, are placing bacteria in the playground of evolutionary studies. In the present article we review the differential contribution to the evolution of bacterial genomes that processes such as gene modification, gene acquisition and gene loss may have when bacteria colonize different habitats that present characteristic ecological features. In particular, we review how the different processes contribute to evolution in microbial communities, in free-living bacteria or in bacteria living in isolation. In addition, we discuss the temporal constraints in the evolution of bacterial genomes, considering bacterial evolution from the perspective of processes of short-sighted evolution and punctual acquisition of evolutionary novelties followed by long stasis periods. PMID:24710293

  4. Identifying characteristic scales in the human genome

    NASA Astrophysics Data System (ADS)

    Carpena, P.; Bernaola-Galván, P.; Coronado, A. V.; Hackenberg, M.; Oliver, J. L.

    2007-03-01

    The scale-free, long-range correlations detected in DNA sequences contrast with characteristic lengths of genomic elements, being particularly incompatible with the isochores (long, homogeneous DNA segments). By computing the local behavior of the scaling exponent α of detrended fluctuation analysis (DFA), we discriminate between sequences with and without true scaling, and we find that no single scaling exists in the human genome. Instead, human chromosomes show a common compositional structure with two characteristic scales, the large one corresponding to the isochores and the other to small and medium scale genomic elements.
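
    The analysis described in the record above rests on detrended fluctuation analysis. The sketch below shows first-order DFA on a DNA string and a local exponent estimated as the slope of log F(n) against log n between consecutive window sizes. It is a sketch under stated assumptions, not the authors' implementation: the numeric mapping (G/C to +1, anything else to -1), the use of non-overlapping windows with linear detrending, the finite-difference estimate of the local exponent, and all function names are choices made here for illustration.

```python
import math


def profile(seq):
    """Integrated, mean-subtracted walk built from a DNA string.

    Assumed mapping: G or C -> +1, every other symbol -> -1.
    """
    steps = [1.0 if base in "GC" else -1.0 for base in seq.upper()]
    mean = sum(steps) / len(steps)
    walk, total = [], 0.0
    for step in steps:
        total += step - mean
        walk.append(total)
    return walk


def linear_residual_ss(segment):
    """Sum of squared residuals after a least-squares line fit."""
    n = len(segment)
    t_mean = (n - 1) / 2
    y_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(segment))
    slope = sxy / sxx
    return sum((v - y_mean - slope * (t - t_mean)) ** 2
               for t, v in enumerate(segment))


def fluctuation(walk, window):
    """Root-mean-square detrended fluctuation F(n) for one window size."""
    if window < 3 or window > len(walk):
        raise ValueError("window must satisfy 3 <= window <= len(walk)")
    n_windows = len(walk) // window
    total = sum(linear_residual_ss(walk[i * window:(i + 1) * window])
                for i in range(n_windows))
    return math.sqrt(total / (n_windows * window))


def local_exponents(walk, windows):
    """Slope of log F(n) versus log n between consecutive window sizes."""
    points = [(math.log(w), math.log(fluctuation(walk, w))) for w in windows]
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]
```

    For an uncorrelated sequence the local exponents are expected to stay near 0.5 at all scales, whereas a sequence with characteristic scales would show exponents that change with window size, which is the kind of behavior the abstract uses to argue against a single scaling regime.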

  5. Bacterial Cellular Engineering by Genome Editing and Gene Silencing

    PubMed Central

    Nakashima, Nobutaka; Miyazaki, Kentaro

    2014-01-01

    Genome editing is an important technology for bacterial cellular engineering, which is commonly conducted by homologous recombination-based procedures, including gene knockout (disruption), knock-in (insertion), and allelic exchange. In addition, some new recombination-independent approaches have emerged that utilize catalytic RNAs, artificial nucleases, nucleic acid analogs, and peptide nucleic acids. Apart from these methods, which directly modify the genomic structure, an alternative approach is to conditionally modify the gene expression profile at the posttranscriptional level without altering the genomes. This is performed by expressing antisense RNAs to knock down (silence) target mRNAs in vivo. This review describes the features of, and recent advances in, methods used in genomic engineering and silencing technologies that are advantageously used for bacterial cellular engineering. PMID:24552876

  6. Bacterial Recombineering: Genome Engineering via Phage-Based Homologous Recombination.

    PubMed

    Pines, Gur; Freed, Emily F; Winkler, James D; Gill, Ryan T

    2015-11-20

    The ability to specifically modify bacterial genomes in a precise and efficient manner is highly desired in various fields, ranging from molecular genetics to metabolic engineering and synthetic biology. Much has changed from the initial realization that phage-derived genes may be employed for such tasks to today, where recombineering enables complex genetic edits within a genome or a population. Here, we review the major developments leading to recombineering becoming the method of choice for in situ bacterial genome editing while highlighting the various applications of recombineering in pushing the boundaries of synthetic biology. We also present the current understanding of the mechanism of recombineering. Finally, we discuss in detail issues surrounding recombineering efficiency and future directions for recombineering-based genome editing. PMID:25856528

  7. Ensembl Genomes 2013: scaling up access to genome-wide data.

    PubMed

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future. PMID:24163254

  8. Ensembl Genomes 2013: scaling up access to genome-wide data

    PubMed Central

    Kersey, Paul Julian; Allen, James E.; Christensen, Mikkel; Davis, Paul; Falin, Lee J.; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D.; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K.; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M.; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future. PMID:24163254

  9. From bacterial genomics to metagenomics: concept, tools and recent advances.

    PubMed

    Sharma, Pooja; Kumari, Hansi; Kumar, Mukesh; Verma, Mansi; Kumari, Kirti; Malhotra, Shweta; Khurana, Jitendra; Lal, Rup

    2008-06-01

    In the last 20 years, the application of genomics tools has completely transformed the field of microbial research. This has happened primarily because of the revolution in sequencing technologies. This review therefore first describes the discovery, improvement and automation of sequencing techniques in chronological order, followed by a brief discussion of microbial genomics. Some of the recently sequenced bacterial genomes are described to explain how complete genome data are now being used to derive interesting findings. Apart from the genomics of individual microbes, the study of unculturable microbiota from different environments is increasingly gaining importance. The second section is thus dedicated to the concept of metagenomics, describing environmental DNA isolation, metagenomic library construction and screening methods to look for novel and potentially important genes, enzymes and biomolecules. It also deals with the pioneering studies in the area of metagenomics that are offering new insights into the previously unappreciated microbial world. PMID:23100712

  10. Comparative genomics boosts target prediction for bacterial small RNAs.

    PubMed

    Wright, Patrick R; Richter, Andreas S; Papenfort, Kai; Mann, Martin; Vogel, Jörg; Hess, Wolfgang R; Backofen, Rolf; Georg, Jens

    2013-09-10

    Small RNAs (sRNAs) constitute a large and heterogeneous class of bacterial gene expression regulators. Much like eukaryotic microRNAs, these sRNAs typically target multiple mRNAs through short seed pairing, thereby acting as global posttranscriptional regulators. In some bacteria, evidence for hundreds to possibly more than 1,000 different sRNAs has been obtained by transcriptome sequencing. However, the experimental identification of possible targets and, therefore, their confirmation as functional regulators of gene expression has remained laborious. Here, we present a strategy that integrates phylogenetic information to predict sRNA targets at the genomic scale and reconstructs regulatory networks upon functional enrichment and network analysis (CopraRNA, for Comparative Prediction Algorithm for sRNA Targets). Furthermore, CopraRNA precisely predicts the sRNA domains for target recognition and interaction. When applied to several model sRNAs, CopraRNA revealed additional targets and functions for the sRNAs CyaR, FnrS, RybB, RyhB, SgrS, and Spot42. Moreover, the mRNAs gdhA, lrp, marA, nagZ, ptsI, sdhA, and yobF-cspC were suggested as regulatory hubs targeted by up to seven different sRNAs. The verification of many previously undetected targets by CopraRNA, even for extensively investigated sRNAs, demonstrates its advantages and shows that CopraRNA-based analyses can compete with experimental target prediction approaches. A Web interface allows high-confidence target prediction and efficient classification of bacterial sRNAs. PMID:23980183

  11. Comparative Genomic Analyses of the Bacterial Phosphotransferase System

    PubMed Central

    Barabote, Ravi D.; Saier, Milton H.

    2005-01-01

    We report analyses of 202 fully sequenced genomes for homologues of known protein constituents of the bacterial phosphoenolpyruvate-dependent phosphotransferase system (PTS). These included 174 bacterial, 19 archaeal, and 9 eukaryotic genomes. Homologues of PTS proteins were not identified in archaea or eukaryotes, showing that the horizontal transfer of genes encoding PTS proteins has not occurred between the three domains of life. Of the 174 bacterial genomes (136 bacterial species) analyzed, 30 diverse species have no PTS homologues, and 29 species have cytoplasmic PTS phosphoryl transfer protein homologues but lack recognizable PTS permeases. These soluble homologues presumably function in regulation. The remaining 77 species possess all PTS proteins required for the transport and phosphorylation of at least one sugar via the PTS. Up to 3.2% of the genes in a bacterium encode PTS proteins. These homologues were analyzed for family association, range of protein types, domain organization, and organismal distribution. Different strains of a single bacterial species often possess strikingly different complements of PTS proteins. Types of PTS protein domain fusions were analyzed, showing that certain types of domain fusions are common, while others are rare or prohibited. Select PTS proteins were analyzed from different phylogenetic standpoints, showing that PTS protein phylogeny often differs from organismal phylogeny. The results document the frequent gain and loss of PTS protein-encoding genes and suggest that the lateral transfer of these genes within the bacterial domain has played an important role in bacterial evolution. Our studies provide insight into the development of complex multicomponent enzyme systems and lead to predictions regarding the types of protein-protein interactions that promote efficient PTS-mediated phosphoryl transfer. PMID:16339738

  12. Advances in Understanding Bacterial Pathogenesis Gained from Whole-Genome Sequencing and Phylogenetics.

    PubMed

    Klemm, Elizabeth; Dougan, Gordon

    2016-05-11

    The development of next-generation sequencing as a cost-effective technology has facilitated the analysis of bacterial population structure at a whole-genome level and at scale. From these data, phylogenetic trees have been constructed that define population structures at a local, national, and global level, providing a framework for genetic analysis. Although still at an early stage, these approaches have yielded progress in several areas, including pathogen transmission mapping, the genetics of niche colonization and host adaptation, as well as gene-to-phenotype association studies. Antibiotic resistance has proven to be a major challenge in the early 21st century, and phylogenetic analyses have uncovered the dramatic effect that the use of antibiotics has had on shaping bacterial population structures. An update on insights into bacterial evolution from comparative genomics is provided in this review. PMID:27173928

  13. Genes but Not Genomes Reveal Bacterial Domestication of Lactococcus Lactis

    PubMed Central

    Passerini, Delphine; Beltramo, Charlotte; Coddeville, Michele; Quentin, Yves; Ritzenthaler, Paul

    2010-01-01

    Background: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). Methodology/Principal Findings: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and one that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate with the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. Conclusion/Significance: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between “environmental” strains, the main contributors to the genetic diversity within the subspecies, and “domesticated” strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the “domesticated” strains essentially arose through substantial genomic flux within the dispensable genome.

  14. Quantitative prediction for two-dimensional bacterial genomic displays

    NASA Astrophysics Data System (ADS)

    Mercier, Jean-Francois; Kingsburry, Christine; Lafay, Bénédicte; Slater, Gary W.

    2006-03-01

    Two-dimensional bacterial genomic display (2DBGD) is a simple technique that allows one to directly compare complete genomes of closely related bacteria. It consists of two phases. First, polyacrylamide gel electrophoresis (PAGE) is used to separate the DNA fragments resulting from the restriction of the genome by appropriate enzymes according to their size. Then, temperature gradient gel electrophoresis (TGGE) is used in the second dimension to separate the fragments according to their sequence composition. After these two steps, the whole bacterial genome is displayed as clouds of spots on a two-dimensional surface. 2DBGD has been successfully used to distinguish between strains of bacterial species. Unfortunately, this empirical technique remains highly qualitative. We have developed a model to predict the location of DNA spots, as a function of the DNA sequence, the gel electrophoresis and TGGE conditions and the nature of the restriction enzymes used. This model can be used to easily optimize the procedure for the type of bacteria being analyzed.
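
    As a rough illustration of the two separation steps described above, the sketch below cuts a linear sequence at a recognition site and reports, for each fragment, its length (what the first dimension separates on) and its GC fraction (used here only as a crude stand-in for the sequence composition that TGGE responds to). It is not the authors' predictive model, which also accounts for the electrophoresis conditions. The function names and the toy sequence are hypothetical; the EcoRI site GAATTC, cut after the first G, is used as an example enzyme.

```python
def digest(sequence, site, cut_offset):
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the
    cut falls (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    idx = sequence.find(site)
    while idx != -1:
        cuts.append(idx + cut_offset)
        idx = sequence.find(site, idx + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Drop empty fragments that would arise from a cut at either end.
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def gc_fraction(fragment):
    """Fraction of G or C bases in a non-empty fragment."""
    return sum(base in "GC" for base in fragment) / len(fragment)


def spot_coordinates(sequence, site, cut_offset):
    """Return one (length, GC fraction) pair per restriction fragment."""
    return [(len(f), gc_fraction(f)) for f in digest(sequence, site, cut_offset)]


# Toy example (hypothetical sequence containing two EcoRI sites).
toy_genome = "AAGAATTCCCGGGAATTCTT"
spots = spot_coordinates(toy_genome, "GAATTC", 1)
```

    Real bacterial chromosomes are usually circular, so a faithful digest would also join the first and last fragments; the linear treatment here is a simplifying assumption.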

  15. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    SciTech Connect

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by extensive proteome database development over a six-year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  16. Synthetic genomics and the construction of a synthetic bacterial cell.

    PubMed

    Glass, John I

    2012-01-01

    The first synthetic cellular organism was created in 2010 and based on a very small, very simple bacterium called Mycoplasma mycoides. The bacterium was called synthetic because its DNA genome was chemically synthesized rather than replicated from an existing template DNA, as occurs in all other known cellular life on Earth. The experiment was undertaken in order to develop a system that would allow creation of a minimal bacterial cell that could lead to a better understanding of the first principles of cellular life. The effort resulted in new synthetic genomics techniques called genome assembly and genome transplantation. The ability of scientists to design and build bacteria opens new possibilities for creating microbes to solve human problems. PMID:23502559

  17. Genome sequence of the tobacco bacterial wilt pathogen Ralstonia solanacearum.

    PubMed

    Li, Zefeng; Wu, Sanling; Bai, Xuefei; Liu, Yun; Lu, Jianfei; Liu, Yong; Xiao, Bingguang; Lu, Xiuping; Fan, Longjiang

    2011-11-01

    Ralstonia solanacearum is a causal agent of plant bacterial wilt with thousands of distinct strains in a heterogeneous species complex. Here we report the genome sequence of a phylotype IB strain, Y45, isolated from tobacco (Nicotiana tabacum) in China. Compared with the published genomes of eight strains which were isolated from other hosts and habitats, 794 specific genes and many rearrangements/inversion events were identified in the tobacco strain, demonstrating that this strain represents an important node within the R. solanacearum complex. PMID:21994922

  18. Defining Pathogenic Bacterial Species in the Genomic Era

    PubMed Central

    Georgiades, Kalliopi; Raoult, Didier

    2011-01-01

    Current definitions of bacterial species are limited by the criteria on which they rest and by the use of restrictive genetic tools. The 16S ribosomal RNA sequence, for example, has been widely used as a marker for phylogenetic analyses; however, its use often leads to misleading species definitions. According to the first genetic studies, removing a certain number of genes from pathogenic bacteria removes their capacity to infect hosts. However, more recent studies have demonstrated that the specialization of bacteria in eukaryotic cells is associated with massive gene loss, especially for allopatric endosymbionts that have been isolated for a long time in an intracellular niche. Indeed, sympatric free-living bacteria often have bigger genomes and exhibit greater resistance and plasticity and constitute species complexes rather than true species. Specialists, such as pathogenic bacteria, escape these bacterial complexes and colonize a niche, thereby gaining a species name. Their specialization allows them to become allopatric, and their gene losses eventually favor reductive genome evolution. A pathogenic species is characterized by a gene repertoire that is defined not only by genes that are present but also by those that are lacking. It is likely that current bacterial pathogens will disappear soon and be replaced by new ones that will emerge from bacterial complexes that are already in contact with humans. PMID:21687765

  19. Genome-scale neurogenetics: methodology and meaning

    PubMed Central

    McCarroll, Steven A; Feng, Guoping; Hyman, Steven E

    2016-01-01

    Genetic analysis is currently offering glimpses into molecular mechanisms underlying such neuropsychiatric disorders as schizophrenia, bipolar disorder and autism. After years of frustration, success in identifying disease-associated DNA sequence variation has followed from new genomic technologies, new genome data resources, and global collaborations that could achieve the scale necessary to find the genes underlying highly polygenic disorders. Here we describe early results from genome-scale studies of large numbers of subjects and the emerging significance of these results for neurobiology. PMID:24866041

  20. Completing bacterial genome assemblies: strategy and performance comparisons

    PubMed Central

    Liao, Yu-Chieh; Lin, Shu-Hung; Lin, Hsin-Hung

    2015-01-01

    Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms have generated rapid and cost-effective genome assemblies, such assemblies are often unfinished, fragmented draft genomes as a result of short read lengths and long repeats present in multiple copies. Third-generation, PacBio sequencing technologies circumvented this problem by greatly increasing read length. Hybrid approaches including ALLPATHS-LG, PacBio corrected reads pipeline, SPAdes, and SSPACE-LongRead, and non-hybrid approaches—hierarchical genome-assembly process (HGAP) and PacBio corrected reads pipeline via self-correction—have therefore been proposed to utilize the PacBio long reads that can span many thousands of bases to facilitate the assembly of complete microbial genomes. However, standardized procedures that aim at evaluating and comparing these approaches are currently insufficient. To address the issue, we herein provide a comprehensive comparison by collecting datasets for the comparative assessment on the above-mentioned five assemblers. In addition to offering explicit and beneficial recommendations to practitioners, this study aims to aid in the design of a paradigm positioned to complete bacterial genome assembly. PMID:25735824

  1. Completing bacterial genome assemblies: strategy and performance comparisons.

    PubMed

    Liao, Yu-Chieh; Lin, Shu-Hung; Lin, Hsin-Hung

    2015-01-01

    Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms have generated rapid and cost-effective genome assemblies, such assemblies are often unfinished, fragmented draft genomes as a result of short read lengths and long repeats present in multiple copies. Third-generation, PacBio sequencing technologies circumvented this problem by greatly increasing read length. Hybrid approaches including ALLPATHS-LG, PacBio corrected reads pipeline, SPAdes, and SSPACE-LongRead, and non-hybrid approaches--hierarchical genome-assembly process (HGAP) and PacBio corrected reads pipeline via self-correction--have therefore been proposed to utilize the PacBio long reads that can span many thousands of bases to facilitate the assembly of complete microbial genomes. However, standardized procedures that aim at evaluating and comparing these approaches are currently insufficient. To address the issue, we herein provide a comprehensive comparison by collecting datasets for the comparative assessment on the above-mentioned five assemblers. In addition to offering explicit and beneficial recommendations to practitioners, this study aims to aid in the design of a paradigm positioned to complete bacterial genome assembly. PMID:25735824

  2. How to interpret an anonymous bacterial genome: machine learning approach to gene identification.

    PubMed

    Hayes, W S; Borodovsky, M

    1998-11-01

    In this report we address the problem of accurate statistical modeling of DNA sequences, either coding or noncoding, for a bacterial species whose genome (or a large portion) was sequenced but not yet characterized experimentally. Availability of these models is critical for successful solution of the genome annotation task by statistical methods of gene finding. We present the method, GeneMark-Genesis, which learns the parameters of Markov models of protein-coding and noncoding regions from anonymous bacterial genomic sequence. These models are subsequently used in the GeneMark and GeneMark.hmm gene-finding programs. Although there is basically one model of a noncoding region for a given genome, several models of protein-coding region are automatically obtained by GeneMark-Genesis. The diversity of protein-coding models reflects the diversity of oligonucleotide compositions, particularly the diversity of codon usage strategies observed in genes from one and the same genome. In the simplest and the most important case, there are just two gene models: typical and atypical. We show that the atypical model allows one to predict genes that escape identification by the typical model. Many genes predicted by the atypical model appear to be horizontally transferred genes. The early versions of GeneMark-Genesis were used for annotating the genomes of Methanococcus jannaschii and Helicobacter pylori. We report the results of accuracy testing of the full-scale version of GeneMark-Genesis on 10 completely sequenced bacterial genomes. Interestingly, the GeneMark.hmm program that employed the typical and atypical models defined by GeneMark-Genesis was able to predict 683 new atypical genes with 176 of them confirmed by similarity search. PMID:9847079
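
    The general idea of scoring a sequence under separate Markov models of coding and noncoding DNA can be sketched as follows. This is a minimal illustration, not the GeneMark-Genesis procedure: it assumes a plain first-order Markov chain with add-one pseudocounts, and the training sequences, function names and parameters are all hypothetical. The models used by the actual programs are more elaborate than this.

```python
import math

ALPHABET = "ACGT"


def train_markov(sequences, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next | previous)."""
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            # Skip ambiguous bases such as N.
            if prev in counts and nxt in counts[prev]:
                counts[prev][nxt] += 1
    model = {}
    for prev, row in counts.items():
        total = sum(row.values())
        model[prev] = {nxt: c / total for nxt, c in row.items()}
    return model


def log_likelihood(seq, model):
    """Sum of log transition probabilities along the sequence."""
    return sum(
        math.log(model[prev][nxt])
        for prev, nxt in zip(seq, seq[1:])
        if prev in model and nxt in model[prev]
    )


def log_odds(seq, coding_model, noncoding_model):
    """Positive values favor the coding model, negative the noncoding one."""
    return log_likelihood(seq, coding_model) - log_likelihood(seq, noncoding_model)


# Toy training data (hypothetical, far too small for real use).
coding_model = train_markov(["ATGGCGGCGGCG"])
noncoding_model = train_markov(["ATATATATATAT"])
score = log_odds("GCGGCG", coding_model, noncoding_model)
```

    In the same spirit, a second coding model trained on genes with unusual composition would play the role of the atypical model, and a sequence could be assigned to whichever model gives it the highest score.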

  3. Breakthroughs in field-scale bacterial transport

    SciTech Connect

    Balkwill, D; Chen, J; Deflaun, Mary; Dobbs, F; Dong, H; Fredrickson, Jim K.; Fuller, M; Green, M; Ginn, T; Griffin, T; Holben, W; Hubbard, S; Johnson, W; Long, Philip E.; Mailloux, B; Majer, E; Mcinerney, M; Murray, Christopher J.; Onstott, T; Phelps, T; Scheibe, Timothy D.; Swift, D; White, D; Wobber, F

    2001-06-01

    This article summarizes a bioaugmentation research project undertaken by a DOE-sponsored, multidisciplinary research team at a field site near Oyster, Virginia. The overall purpose of the ongoing project is to evaluate the relative importance of hydrogeological and geochemical heterogeneities in controlling bacterial transport, and to develop an approach for quantitative prediction of bacterial transport needed to design optimal bioremediation strategies.

  4. Developing insights into the mechanisms of evolution of bacterial pathogens from whole-genome sequences

    PubMed Central

    Bentley, Stephen D

    2014-01-01

    Evolution of bacterial pathogen populations has been detected in a variety of ways including phenotypic tests, such as metabolic activity, reaction to antisera and drug resistance and genotypic tests that measure variation in chromosome structure, repetitive loci and individual gene sequences. While informative, these methods only capture a small subset of the total variation and, therefore, have limited resolution. Advances in sequencing technologies have made it feasible to capture whole-genome sequence variation for each sample under study, providing the potential to detect all changes at all positions in the genome from single nucleotide changes to large-scale insertions and deletions. In this review, we focus on recent work that has applied this powerful new approach and summarize some of the advances that this has brought in our understanding of the details of how bacterial pathogens evolve. PMID:23075447

  5. mGenomeSubtractor: a web-based tool for parallel in silico subtractive hybridization analysis of multiple bacterial genomes.

    PubMed

    Shao, Yucheng; He, Xinyi; Harrison, Ewan M; Tai, Cui; Ou, Hong-Yu; Rajakumar, Kumar; Deng, Zixin

    2010-07-01

    mGenomeSubtractor performs an mpiBLAST-based comparison of reference bacterial genomes against multiple user-selected genomes for investigation of strain variable accessory regions. With parallel computing architecture, mGenomeSubtractor is able to run rapid BLAST searches of the segmented reference genome against multiple subject genomes at the DNA or amino acid level within a minute. In addition to comparison of protein coding sequences, the highly flexible sliding window-based genome fragmentation approach offered can be used to identify short unique sequences within or between genes. mGenomeSubtractor provides powerful schematic outputs for exploration of identified core and accessory regions, including searches against databases of mobile genetic elements, virulence factors or bacterial essential genes, examination of G+C content and binucleotide distribution bias, and integrated primer design tools. mGenomeSubtractor also allows for the ready definition of species-specific gene pools based on available genomes. Pan-genomic arrays can be easily developed using the efficient oligonucleotide design tool. This simple high-throughput in silico 'subtractive hybridization' analytical tool will support the rapidly escalating number of comparative bacterial genomics studies aimed at defining genomic biomarkers of evolutionary lineage, phenotype, pathotype, environmental adaptation and/or disease-association of diverse bacterial species. mGenomeSubtractor is freely available to all users without any login requirement at: http://bioinfo-mml.sjtu.edu.cn/mGS/. PMID:20435682
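
    The sliding-window fragmentation mentioned above can be pictured with a short sketch. This is only an illustration of the general technique, not the tool's own code: the function name, the window and step parameters, and the choice to anchor a final window at the end of the sequence are all assumptions.

```python
def sliding_windows(sequence, window, step):
    """Yield (start, fragment) pairs that cover the sequence left to right.

    Consecutive windows overlap whenever step < window. A final window
    is anchored at the end of the sequence so that the last bases are
    always covered, even if len(sequence) - window is not a multiple of step.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = max(len(sequence) - window, 0)
    start = 0
    while start < last_start:
        yield start, sequence[start:start + window]
        start += step
    yield last_start, sequence[last_start:last_start + window]


# Toy example (hypothetical sequence): windows of 4 bases, moved 3 at a time.
fragments = list(sliding_windows("ACGTACGTAC", window=4, step=3))
```

    Each fragment would then be searched independently against the subject genomes; a smaller step gives finer resolution for locating short unique sequences at the cost of more searches.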

  6. Lifestyles of the effector-rich: genome-enabled characterization of bacterial plant pathogens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genome sequencing of bacterial plant pathogens is providing transformative insights into the complex network of molecular plant-microbe interactions mediated by extracellular effectors during pathogenesis. Bacterial pathogens sequenced to completion are phylogenetically diverse and vary significant...

  7. Unveiling Bacterial Interactions through Multidimensional Scaling and Dynamics Modeling

    PubMed Central

    Dorado-Morales, Pedro; Vilanova, Cristina; P. Garay, Carlos; Martí, Jose Manuel; Porcar, Manuel

    2015-01-01

    We propose a new strategy to identify and visualize bacterial consortia by conducting replicated culturing of environmental samples coupled with high-throughput sequencing and multidimensional scaling analysis, followed by identification of bacteria-bacteria correlations and interactions. We conducted a proof of concept assay with pine-tree resin-based media in ten replicates, which allowed detecting and visualizing dynamical bacterial associations in the form of statistically significant and yet biologically relevant bacterial consortia. PMID:26671778

  8. Reconstruction of a Bacterial Genome from DNA Cassettes

    SciTech Connect

    Christopher Dupont; John Glass; Laura Sheahan; Shibu Yooseph; Lisa Zeigler Allen; Mathangi Thiagarajan; Andrew Allen; Robert Friedman; J. Craig Venter

    2011-12-31

    This basic research program comprised two major areas: (1) acquisition and analysis of marine microbial metagenomic data and development of genomic analysis tools for broad, external community use; (2) development of a minimal bacterial genome. Our Marine Metagenomic Diversity effort generated and analyzed shotgun sequencing data from microbial communities sampled from over 250 sites around the world. About 40% of the 26 Gbp of sequence data has been made publicly available to date with a complete release anticipated in six months. Our results and those mining the deposited data have revealed a vast diversity of genes coding for critical metabolic processes whose phylogenetic and geographic distributions will enable a deeper understanding of carbon and nutrient cycling, microbial ecology, and rapid rate evolutionary processes such as horizontal gene transfer by viruses and plasmids. A global assembly of the generated dataset resulted in a massive set (5 Gbp) of genome fragments that provide context to the majority of the generated data that originated from uncultivated organisms. Our Synthetic Biology team has made significant progress towards the goal of synthesizing a minimal mycoplasma genome that will have all of the machinery for independent life. This project, once completed, will provide fundamentally new knowledge about requirements for microbial life and help to lay a basic research foundation for developing microbiological approaches to bioenergy.

  9. Linking genome-scale metabolic modeling and genome annotation

    PubMed Central

    Blais, Edik M.; Chavali, Arvind K.; Papin, Jason A.

    2014-01-01

    Summary: Genome-scale metabolic network reconstructions, assembled from annotated genomes, serve as a platform for integrating data from heterogeneous sources and generating hypotheses for further experimental validation. Implementing constraint-based modeling techniques such as Flux Balance Analysis (FBA) on network reconstructions allows for interrogating metabolism at a systems level, which aids in identifying and rectifying gaps in knowledge. With genome sequences for various organisms from prokaryotes to eukaryotes becoming increasingly available, a significant bottleneck lies in the structural and functional annotation of these sequences. Using topologically-based and biologically-inspired metabolic network refinement, we can better characterize enzymatic functions present in an organism and link annotation of these functions to candidate transcripts, both steps that can be experimentally validated. PMID:23417799
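
    For readers unfamiliar with Flux Balance Analysis, its standard formulation is a linear program, written out below. The notation is an assumption of this note and does not come from the abstract: S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c weights the objective (often a single biomass reaction), and the two bound vectors give the lower and upper limits on each flux.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

    The equality constraint expresses the steady-state assumption that every internal metabolite is produced and consumed at equal rates. Gaps in a reconstruction tend to show up as reactions that cannot carry flux under these constraints.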

  10. The bacterial species definition in the genomic era

    PubMed Central

    Konstantinidis, Konstantinos T; Ramette, Alban; Tiedje, James M

    2006-01-01

    The bacterial species definition, despite its eminent practical significance for identification, diagnosis, quarantine and diversity surveys, remains a very difficult issue to advance. Genomics now offers novel insights into intra-species diversity and the potential for emergence of a more soundly based system. Although we share the excitement, we argue that it is premature for a universal change to the definition because current knowledge is based on too few phylogenetic groups and too few samples of natural populations. Our analysis of five important bacterial groups suggests, however, that more stringent standards for species may be justifiable when a solid understanding of gene content and ecological distinctiveness becomes available. Our analysis also reveals what is actually encompassed in a species according to the current standards, in terms of whole-genome sequence and gene-content diversity, and shows that this does not correspond to coherent clusters for the environmental Burkholderia and Shewanella genera examined. In contrast, the obligatory pathogens, which have a very restricted ecological niche, do exhibit clusters. Therefore, the idea of biologically meaningful clusters of diversity that applies to most eukaryotes may not be universally applicable in the microbial world, or if such clusters exist, they may be found at different levels of distinction. PMID:17062412

  11. Reductive genome evolution at both ends of the bacterial population size spectrum.

    PubMed

    Batut, Bérénice; Knibbe, Carole; Marais, Gabriel; Daubin, Vincent

    2014-12-01

    Bacterial genomes show substantial variations in size. The smallest bacterial genomes are those of endocellular symbionts of eukaryotic hosts, which have undergone massive genome reduction and show patterns that are consistent with the degenerative processes that are predicted to occur in species with small effective population sizes. However, similar genome reduction is found in some free-living marine cyanobacteria that are characterized by extremely large populations. In this Opinion article, we discuss the different hypotheses that have been proposed to account for this reductive genome evolution at both ends of the bacterial population size spectrum. PMID:25220308

  12. Multiple Factors Drive Replicating Strand Composition Bias in Bacterial Genomes

    PubMed Central

    Zhao, Hai-Long; Xia, Zhong-Kui; Zhang, Fa-Zhan; Ye, Yuan-Nong; Guo, Feng-Biao

    2015-01-01

    Composition bias from Chargaff’s second parity rule (PR2) has long been found in sequenced genomes, and is believed to relate strongly with the replication process in microbial genomes. However, some disagreement on the underlying reason for strand composition bias remains. We performed an integrative analysis of various genomic features that might influence composition bias using a large-scale dataset of 1111 genomes. Our results indicate (1) the bias was stronger in obligate intracellular bacteria than in other free-living species (p-value = 0.0305); (2) Fusobacteria and Firmicutes had the highest average bias among the 24 microbial phyla analyzed; (3) the strength of selected codon usage bias and generation times were not observably related to strand composition bias (p-value = 0.3247); (4) significant negative relationships were found between GC content, genome size, rearrangement frequency, Clusters of Orthologous Groups (COG) functional subcategories A, C, I, Q, and composition bias (p-values < 1.0 × 10⁻⁸); (5) gene density and COG functional subcategories D, F, J, L, and V were positively related with composition bias (p-value < 2.2 × 10⁻¹⁶); and (6) gene density made the most important contribution to composition bias, indicating transcriptional bias was associated strongly with strand composition bias. Therefore, strand composition bias was found to be influenced by multiple factors with varying weights. PMID:26404268
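
    The strand composition bias described above is commonly quantified as a skew between complementary bases counted on a single strand. The following is a minimal Python sketch, assuming the usual GC-skew and AT-skew definitions, (G - C)/(G + C) and (A - T)/(A + T). The function names and the windowing scheme are illustrative assumptions and are not taken from the cited study.

```python
from collections import Counter


def strand_skews(seq):
    """Return (gc_skew, at_skew) for one DNA strand.

    Under Chargaff's second parity rule both values are close to 0;
    a value far from 0 indicates strand composition bias.
    """
    counts = Counter(seq.upper())
    g, c, a, t = counts["G"], counts["C"], counts["A"], counts["T"]
    # Guard against windows that contain no G/C or no A/T at all.
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    return gc_skew, at_skew


def windowed_gc_skew(seq, window):
    """GC skew in consecutive, non-overlapping windows of a sequence."""
    return [
        strand_skews(seq[i:i + window])[0]
        for i in range(0, len(seq) - window + 1, window)
    ]
```

    In a real genome the sign of the windowed GC skew often changes near the replication origin and terminus, which is one reason the bias is linked to the replication process.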

  13. Evolution of simple sequence repeat-mediated phase variation in bacterial genomes.

    PubMed

    Bayliss, Christopher D; Palmer, Michael E

    2012-09-01

    Mutability as a mechanism for rapid adaptation to environmental challenge is an alluringly simple concept whose apotheosis is realized in simple sequence repeats (SSR). Bacterial genomes of several species contain SSRs with a proven role in adaptation to environmental fluctuations. SSRs are hypermutable and generate reversible mutations in localized regions of bacterial genomes, leading to phase variable ON/OFF switches in gene expression. The application of genetic, bioinformatic, and mathematical/computational modeling approaches is revolutionizing our current understanding of how genomic molecular forces and environmental factors influence SSR-mediated adaptation and led to the evolution of this mechanism of localized hypermutation in bacterial genomes. PMID:22954215

  14. Genome trees constructed using five different approaches suggest new major bacterial clades

    PubMed Central

    Wolf, Yuri I; Rogozin, Igor B; Grishin, Nick V; Tatusov, Roman L; Koonin, Eugene V

    2001-01-01

    Background The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. Results Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria. 
The latter group also appeared to join the

  15. Bacterial signal transduction network in a genomic perspective

    PubMed Central

    Galperin, Michael Y.

    2005-01-01

    Summary Bacterial signalling network includes an array of numerous interacting components that monitor environmental and intracellular parameters and effect cellular response to changes in these parameters. The complexity of bacterial signalling systems makes comparative genome analysis a particularly valuable tool for their studies. Comparative studies revealed certain general trends in the organization of diverse signalling systems. These include (i) modular structure of signalling proteins; (ii) common organization of signalling components with the flow of information from N-terminal sensory domains to the C-terminal transmitter or signal output domains (N-to-C flow); (iii) use of common conserved sensory domains by different membrane receptors; (iv) ability of some organisms to respond to one environmental signal by activating several regulatory circuits; (v) abundance of intracellular signalling proteins, typically consisting of PAS or GAF sensor domains and various output domains; (vi) importance of secondary messengers, cAMP and cyclic diguanylate; and (vii) crosstalk between components of different signalling pathways. Experimental characterization of the novel domains and domain combinations would be needed for achieving a better understanding of the mechanisms of signalling response and the intracellular hierarchy of different signalling pathways. PMID:15142243

  16. A robust platform for chemical genomics in bacterial systems

    PubMed Central

    French, Shawn; Mangat, Chand; Bharat, Amrita; Côté, Jean-Philippe; Mori, Hirotada; Brown, Eric D.

    2016-01-01

    While genetic perturbation has been the conventional route to probing bacterial systems, small molecules are showing great promise as probes for cellular complexity. Indeed, systematic investigations of chemical-genetic interactions can provide new insights into cell networks and are often starting points for understanding the mechanism of action of novel chemical probes. We have developed a robust and sensitive platform for chemical-genomic investigations in bacteria. The approach monitors colony volume kinetically using transmissive scanning measurements, enabling acquisition of growth rates and conventional endpoint measurements. We found that chemical-genomic profiles were highly sensitive to concentration, necessitating careful selection of compound concentrations. Roughly 20,000,000 data points were collected for 15 different antibiotics. While 1052 chemical-genetic interactions were identified using the conventional endpoint biomass approach, adding interactions in growth rate resulted in 1564 interactions, a 50–200% increase depending on the drug, with many genes uncharacterized or poorly annotated. The chemical-genetic interaction maps generated from these data reveal common genes likely involved in multidrug resistance. Additionally, the maps identified deletion backgrounds exhibiting class-specific potentiation, revealing conceivable targets for combination approaches to drug discovery. This open platform is highly amenable to kinetic screening of any arrayable strain collection, be it prokaryotic or eukaryotic. PMID:26792836
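
    The contrast drawn above between endpoint biomass and growth rate can be illustrated with a short Python sketch. It assumes exponential growth, so that the rate is the least-squares slope of ln(volume) against time. The function names and the sample readings are hypothetical and do not reproduce the platform's actual analysis.

```python
import math


def growth_rate(times, volumes):
    """Least-squares slope of ln(volume) versus time (exponential model)."""
    logs = [math.log(v) for v in volumes]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    # Ordinary least-squares slope: covariance(t, y) / variance(t).
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return sxy / sxx


def endpoint(volumes):
    """Conventional endpoint measurement: the final reading."""
    return volumes[-1]


# Hypothetical colony-volume readings taken at hourly intervals.
hours = [0, 1, 2, 3]
colony = [1.0, 2.0, 4.0, 8.0]
print(round(growth_rate(hours, colony), 3), endpoint(colony))  # 0.693 8.0
```

    Two strains can reach the same endpoint with different rates, for example after different lag times, which is consistent with the abstract's observation that rate-based scoring finds interactions the endpoint approach misses.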

  17. A robust platform for chemical genomics in bacterial systems.

    PubMed

    French, Shawn; Mangat, Chand; Bharat, Amrita; Côté, Jean-Philippe; Mori, Hirotada; Brown, Eric D

    2016-03-15

    While genetic perturbation has been the conventional route to probing bacterial systems, small molecules are showing great promise as probes for cellular complexity. Indeed, systematic investigations of chemical-genetic interactions can provide new insights into cell networks and are often starting points for understanding the mechanism of action of novel chemical probes. We have developed a robust and sensitive platform for chemical-genomic investigations in bacteria. The approach monitors colony volume kinetically using transmissive scanning measurements, enabling acquisition of growth rates and conventional endpoint measurements. We found that chemical-genomic profiles were highly sensitive to concentration, necessitating careful selection of compound concentrations. Roughly 20,000,000 data points were collected for 15 different antibiotics. While 1052 chemical-genetic interactions were identified using the conventional endpoint biomass approach, adding interactions in growth rate resulted in 1564 interactions, a 50-200% increase depending on the drug, with many genes uncharacterized or poorly annotated. The chemical-genetic interaction maps generated from these data reveal common genes likely involved in multidrug resistance. Additionally, the maps identified deletion backgrounds exhibiting class-specific potentiation, revealing conceivable targets for combination approaches to drug discovery. This open platform is highly amenable to kinetic screening of any arrayable strain collection, be it prokaryotic or eukaryotic. PMID:26792836

  18. GOBASE—a database of organelle and bacterial genome information

    PubMed Central

    O'Brien, Emmet A.; Zhang, Yue; Yang, LiuSong; Wang, Eric; Marie, Veronique; Lang, B. Franz; Burger, Gertraud

    2006-01-01

    The organelle genome database GOBASE is now in its twelfth release, and includes 350 000 mitochondrial sequences and 118 000 chloroplast sequences, roughly a 3-fold expansion since previously documented. GOBASE also includes a fully reannotated genome sequence of Rickettsia prowazekii, one of the closest bacterial relatives of mitochondria, and will shortly expand to contain more data from bacteria from which organelles originated. All these sequences are now accessible through a single unified interface. Enhancements to the functionality of GOBASE include addition of pages for RNA structures and a page compiling data about the taxonomic distribution of organelle-encoded genes; incorporation of Gene Ontology terms; addition of features deduced from incomplete annotations to sequences in GenBank; marking of type examples in cases where single genes in single species are oversampled within GenBank; and addition of graphics illustrating gene structure and the position of neighbouring genes on a sequence. The database has been reimplemented in PostgreSQL to facilitate development and maintenance, and structural modifications have been made to speed up queries, particularly those related to taxonomy. The GOBASE database can be queried at and inquiries should be directed to gobase@bch.umontreal.ca. PMID:16381962

  19. Bacterial genomic epidemiology, from local outbreak characterization to species-history reconstruction.

    PubMed

    Gaiarsa, Stefano; De Marco, Leone; Comandatore, Francesco; Marone, Piero; Bandi, Claudio; Sassera, Davide

    2015-10-01

    Bacteriology has embraced the next-generation sequencing revolution, swiftly moving from the time of single genome sequencing to the age of genomic epidemiology. Hundreds and now even thousands of genomes are being sequenced for single bacterial species, allowing unprecedented levels of resolution and insight in the evolution and epidemic diffusion of the main bacterial pathogens. Here, we present a review of some of the most recent and groundbreaking studies in this field. PMID:26878934

  20. A Functional Genomic Yeast Screen to Identify Pathogenic Bacterial Proteins

    PubMed Central

    Slagowski, Naomi L; Kramer, Roger W; Morrison, Monica F; LaBaer, Joshua; Lesser, Cammie F

    2008-01-01

    Many bacterial pathogens promote infection and cause disease by directly injecting into host cells proteins that manipulate eukaryotic cellular processes. Identification of these translocated proteins is essential to understanding pathogenesis. Yet, their identification remains limited. This, in part, is due to their general sequence uniqueness, which confounds homology-based identification by comparative genomic methods. In addition, their absence often does not result in phenotypes in virulence assays limiting functional genetic screens. Translocated proteins have been observed to confer toxic phenotypes when expressed in the yeast Saccharomyces cerevisiae. This observation suggests that yeast growth inhibition can be used as an indicator of protein translocation in functional genomic screens. However, limited information is available regarding the behavior of non-translocated proteins in yeast. We developed a semi-automated quantitative assay to monitor the growth of hundreds of yeast strains in parallel. We observed that expression of half of the 19 Shigella translocated proteins tested but almost none of the 20 non-translocated Shigella proteins nor ∼1,000 Francisella tularensis proteins significantly inhibited yeast growth. Not only does this study establish that yeast growth inhibition is a sensitive and specific indicator of translocated proteins, but we also identified a new substrate of the Shigella type III secretion system (TTSS), IpaJ, previously missed by other experimental approaches. In those cases where the mechanisms of action of the translocated proteins are known, significant yeast growth inhibition correlated with the targeting of conserved cellular processes. By providing positive rather than negative indication of activity our assay complements existing approaches for identification of translocated proteins. In addition, because this assay only requires genomic DNA it is particularly valuable for studying pathogens that are difficult to

  1. Complete Bacteriophage Transfer in a Bacterial Endosymbiont (Wolbachia) Determined by Targeted Genome Capture

    PubMed Central

    Kent, Bethany N.; Salichos, Leonidas; Gibbons, John G.; Rokas, Antonis; Newton, Irene L. G.; Clark, Michael E.; Bordenstein, Seth R.

    2011-01-01

    Bacteriophage flux can cause the majority of genetic diversity in free-living bacteria. This tenet of bacterial genome evolution generally does not extend to obligate intracellular bacteria owing to their reduced contact with other microbes and a predominance of gene deletion over gene transfer. However, recent studies suggest intracellular coinfections in the same host can facilitate exchange of mobile elements between obligate intracellular bacteria—a means by which these bacteria can partially mitigate the reductive forces of the intracellular lifestyle. To test whether bacteriophages transfer as single genes or larger regions between coinfections, we sequenced the genome of the obligate intracellular Wolbachia strain wVitB from the parasitic wasp Nasonia vitripennis and compared it against the prophage sequences of the divergent wVitA coinfection. We applied, for the first time, a targeted sequence capture array to specifically trap the symbiont's DNA from a heterogeneous mixture of eukaryotic, bacterial, and viral DNA. The tiled array successfully captured the genome with 98.3% efficiency. Examination of the genome sequence revealed the largest transfer of bacteriophage and flanking genes (52.2 kb) to date between two obligate intracellular coinfections. The mobile element transfer occurred in the recent evolutionary past based on the 99.9% average nucleotide identity of the phage sequences between the two strains. In addition to discovering an evolutionarily recent and large-scale horizontal phage transfer between coinfecting obligate intracellular bacteria, we demonstrate that “targeted genome capture” can enrich target DNA to alleviate the problem of isolating symbiotic microbes that are difficult to culture or purify from the conglomerate of organisms inside eukaryotes. PMID:21292630

  2. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions.

    PubMed

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  3. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    PubMed Central

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  4. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira

    PubMed Central

    Fouts, Derrick E.; Matthias, Michael A.; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E.; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L.; Haake, David A.; Haft, Daniel H.; Hartskeerl, Rudy; Ko, Albert I.; Levett, Paul N.; Matsunaga, James; Mechaly, Ariel E.; Monk, Jonathan M.; Nascimento, Ana L. T.; Nelson, Karen E.; Palsson, Bernhard; Peacock, Sharon J.; Picardeau, Mathieu; Ricaldi, Jessica N.; Thaipandungpanit, Janjira; Wunder, Elsio A.; Yang, X. Frank; Zhang, Jun-Jie; Vinetz, Joseph M.

    2016-01-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade’s refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. 
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  5. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira.

    PubMed

    Fouts, Derrick E; Matthias, Michael A; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L; Haake, David A; Haft, Daniel H; Hartskeerl, Rudy; Ko, Albert I; Levett, Paul N; Matsunaga, James; Mechaly, Ariel E; Monk, Jonathan M; Nascimento, Ana L T; Nelson, Karen E; Palsson, Bernhard; Peacock, Sharon J; Picardeau, Mathieu; Ricaldi, Jessica N; Thaipandungpanit, Janjira; Wunder, Elsio A; Yang, X Frank; Zhang, Jun-Jie; Vinetz, Joseph M

    2016-02-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. 
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  6. Bacterial genome replication at subzero temperatures in permafrost

    PubMed Central

    Tuorto, Steven J; Darias, Phillip; McGuinness, Lora R; Panikov, Nicolai; Zhang, Tingjun; Häggblom, Max M; Kerkhof, Lee J

    2014-01-01

    Microbial metabolic activity occurs at subzero temperatures in permafrost, an environment representing ∼25% of the global soil organic matter. Although much of the observed subzero microbial activity may be due to basal metabolism or macromolecular repair, there is also ample evidence for cellular growth. Unfortunately, most metabolic measurements or culture-based laboratory experiments cannot elucidate the specific microorganisms responsible for metabolic activities in native permafrost, nor can bulk approaches determine whether different members of the microbial community modulate their responses as a function of changing subzero temperatures. Here, we report on the use of stable isotope probing with 13C-acetate to demonstrate bacterial genome replication in Alaskan permafrost at temperatures of 0 to −20 °C. We found that the majority (80%) of operational taxonomic units detected in permafrost microcosms were active and could synthesize 13C-labeled DNA when supplemented with 13C-acetate at temperatures of 0 to −20 °C during a 6-month incubation. The data indicated that some members of the bacterial community were active across all of the experimental temperatures, whereas many others only synthesized DNA within a narrow subzero temperature range. Phylogenetic analysis of 13C-labeled 16S rRNA genes revealed that the subzero active bacteria were members of the Acidobacteria, Actinobacteria, Chloroflexi, Gemmatimonadetes and Proteobacteria phyla and were distantly related to currently cultivated psychrophiles. These results imply that small subzero temperature changes may lead to changes in the active microbial community, which could have consequences for biogeochemical cycling in permanently frozen systems. PMID:23985750

  7. Conserved Units of Co-Expression in Bacterial Genomes: An Evolutionary Insight into Transcriptional Regulation

    PubMed Central

    Junier, Ivan; Rivoire, Olivier

    2016-01-01

    Genome-wide measurements of transcriptional activity in bacteria indicate that the transcription of successive genes is strongly correlated beyond the scale of operons. Here, we analyze hundreds of bacterial genomes to identify supra-operonic segments of genes that are proximal in a large number of genomes. We show that these synteny segments correspond to genomic units of strong transcriptional co-expression. Structurally, the segments contain operons with specific relative orientations (co-directional or divergent) and nucleoid-associated proteins are found to bind at their boundaries. Functionally, operons inside the same segment are highly co-expressed even in the apparent absence of regulatory factors at their promoter regions. Remote operons along DNA can also be co-expressed if their corresponding segments share a transcriptional or sigma factor, without requiring these factors to bind directly to the promoters of the operons. As evidence that these results apply across the bacterial kingdom, we demonstrate them both in the Gram-negative bacterium Escherichia coli and in the Gram-positive bacterium Bacillus subtilis. The underlying process that we propose involves only RNA-polymerases and DNA: it implies that the transcription of an operon mechanically enhances the transcription of adjacent operons. In support of a primary role of this regulation by facilitated co-transcription, we show that the transcription en bloc of successive operons as a result of transcriptional read-through is strongly and specifically enhanced in synteny segments. Finally, our analysis indicates that facilitated co-transcription may be evolutionarily primitive and may apply beyond bacteria. PMID:27195891

  8. Identification of protein secretion systems in bacterial genomes

    PubMed Central

    Abby, Sophie S.; Cury, Jean; Guglielmini, Julien; Néron, Bertrand; Touchon, Marie; Rocha, Eduardo P. C.

    2016-01-01

    Bacteria with two cell membranes (diderms) have evolved complex systems for protein secretion. These systems were extensively studied in some model bacteria, but the characterisation of their diversity has lagged behind due to lack of standard annotation tools. We built online and standalone computational tools to accurately predict protein secretion systems and related appendages in bacteria with LPS-containing outer membranes. They consist of models describing the systems’ components and genetic organization to be used with MacSyFinder to search for T1SS-T6SS, T9SS, flagella, Type IV pili and Tad pili. We identified ~10,000 candidate systems in bacterial genomes, where T1SS and T5SS were by far the most abundant and widespread. All these data are made available in a public database. The recently described T6SSiii and T9SS were restricted to Bacteroidetes, and T6SSii to Francisella. The T2SS, T3SS, and T4SS were frequently encoded in single-copy in one locus, whereas most T1SS were encoded in two loci. The secretion systems of diderm Firmicutes were similar to those found in other diderms. Novel systems may remain to be discovered, since some clades of environmental bacteria lacked all known protein secretion systems. Our models can be fully customized, which should facilitate the identification of novel systems. PMID:26979785

  9. Metabolomic Functional Analysis of Bacterial Genomes: Final Report

    SciTech Connect

    Arp, Daniel J; Sayavedra-Soto, Luis A

    2008-01-01

    The availability of the complete DNA sequence of the bacterial genome of Nitrosomonas europaea offered the opportunity for unprecedented and detailed investigations of function. We studied the function of genes involved in carbohydrate and Fe metabolism. N. europaea has genes for the synthesis and degradation of glycogen and sucrose but cannot grow on substrates other than ammonia and CO2. Granules of glycogen were detected in whole cells by electron microscopy and quantified in cell-free extracts by enzymatic methods. The cellular glycogen and sucrose content varied depending on the composition of the growth medium and cellular growth stage. N. europaea also depends heavily on iron for metabolism of ammonia and is particularly interesting because it lacks genes for siderophore production and has genes with only low similarity to known iron reductases, yet grows relatively well in medium containing low Fe. By comparing the transcriptomes of cells grown in iron-replete medium versus iron-limited medium, 247 genes were identified as differentially expressed. Mutant strains deficient in genes for sucrose, glycogen and iron metabolism were created and are being used to further our understanding of ammonia oxidizing bacteria.

  10. Diversity-generating Retroelements in Phage and Bacterial Genomes.

    PubMed

    Guo, Huatao; Arambula, Diego; Ghosh, Partho; Miller, Jeff F

    2014-12-01

    Diversity-generating retroelements (DGRs) are DNA diversification machines found in diverse bacterial and bacteriophage genomes that accelerate the evolution of ligand-receptor interactions. Diversification results from a unidirectional transfer of sequence information from an invariant template repeat (TR) to a variable repeat (VR) located in a protein-encoding gene. Information transfer is coupled to site-specific mutagenesis in a process called mutagenic homing, which occurs through an RNA intermediate and is catalyzed by a unique, DGR-encoded reverse transcriptase that converts adenine residues in the TR into random nucleotides in the VR. In the prototype DGR found in the Bordetella bacteriophage BPP-1, the variable protein Mtd is responsible for phage receptor recognition. VR diversification enables progeny phage to switch tropism, accelerating their adaptation to changes in sequence or availability of host cell-surface molecules for infection. Since their discovery, hundreds of DGRs have been identified, and their functions are just beginning to be understood. VR-encoded residues of many DGR-diversified proteins are displayed in the context of a C-type lectin fold, although other scaffolds, including the immunoglobulin fold, may also be used. DGR homing is postulated to occur through a specialized target DNA-primed reverse transcription mechanism that allows repeated rounds of diversification and selection, and the ability to engineer DGRs to target heterologous genes suggests applications for bioengineering. This chapter provides a comprehensive review of our current understanding of this newly discovered family of beneficial retroelements. PMID:26104433

  11. Scale-Invariant Correlations in Dynamic Bacterial Clusters

    NASA Astrophysics Data System (ADS)

    Chen, Xiao; Dong, Xu; Be'er, Avraham; Swinney, Harry L.; Zhang, H. P.

    2012-04-01

    In Bacillus subtilis colonies, motile bacteria move collectively, spontaneously forming dynamic clusters. These bacterial clusters share similarities with other systems exhibiting polarized collective motion, such as bird flocks or fish schools. Here we study experimentally how velocity and orientation fluctuations within clusters are spatially correlated. For a range of cell density and cluster size, the correlation length is shown to be 30% of the spatial size of clusters, and the correlation functions collapse onto a master curve after rescaling the separation with correlation length. Our results demonstrate that correlations of velocity and orientation fluctuations are scale invariant in dynamic bacterial clusters.

  12. The Genomic HyperBrowser: an analysis web server for genome-scale data

    PubMed Central

    Sandve, Geir K.; Gundersen, Sveinung; Johansen, Morten; Glad, Ingrid K.; Gunathasan, Krishanthi; Holden, Lars; Holden, Marit; Liestøl, Knut; Nygård, Ståle; Nygaard, Vegard; Paulsen, Jonas; Rydbeck, Halfdan; Trengereid, Kai; Clancy, Trevor; Drabløs, Finn; Ferkingstad, Egil; Kalaš, Matúš; Lien, Tonje; Rye, Morten B.; Frigessi, Arnoldo; Hovig, Eivind

    2013-01-01

    The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens up a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome. PMID:23632163

  13. Spatial scale drives patterns in soil bacterial diversity.

    PubMed

    O'Brien, Sarah L; Gibbons, Sean M; Owens, Sarah M; Hampton-Marcell, Jarrad; Johnston, Eric R; Jastrow, Julie D; Gilbert, Jack A; Meyer, Folker; Antonopoulos, Dionysios A

    2016-06-01

    Soil microbial communities are essential for ecosystem function, but linking community composition to biogeochemical processes is challenging because of high microbial diversity and large spatial variability of most soil characteristics. We investigated soil bacterial community structure in a switchgrass stand planted on soil with a history of grassland vegetation at high spatial resolution to determine whether biogeographic trends occurred at the centimeter scale. Moreover, we tested whether such heterogeneity, if present, influenced community structure within or among ecosystems. Pronounced heterogeneity was observed at centimeter scales, with abrupt changes in relative abundance of phyla from sample to sample. At the ecosystem scale (> 10 m), however, bacterial community composition and structure were subtly, but significantly, altered by fertilization, with higher alpha diversity in fertilized plots. Moreover, by comparing these data with data from 1772 soils from the Earth Microbiome Project, it was found that 20% of bacterial taxa were shared between their site and diverse globally sourced soil samples, while grassland soils shared approximately 40% of their operational taxonomic units with the current study. By spanning several orders of magnitude, the analysis suggested that extreme patchiness characterized community structure at smaller scales but that coherent patterns emerged at larger length scales. PMID:26914164

  14. Defining the estimated core genome of bacterial populations using a Bayesian decision model.

    PubMed

    van Tonder, Andries J; Mistry, Shilan; Bray, James E; Hill, Dorothea M C; Cody, Alison J; Farmer, Chris L; Klugman, Keith P; von Gottberg, Anne; Bentley, Stephen D; Parkhill, Julian; Jolley, Keith A; Maiden, Martin C J; Brueggemann, Angela B

    2014-08-01

    The bacterial core genome is of intense interest and the volume of whole genome sequence data in the public domain available to investigate it has increased dramatically. The aim of our study was to develop a model to estimate the bacterial core genome from next-generation whole genome sequencing data and use this model to identify novel genes associated with important biological functions. Five bacterial datasets were analysed, comprising 2096 genomes in total. We developed a Bayesian decision model to estimate the number of core genes, calculated pairwise evolutionary distances (p-distances) based on nucleotide sequence diversity, and plotted the median p-distance for each core gene relative to its genome location. We designed visually-informative genome diagrams to depict areas of interest in genomes. Case studies demonstrated how the model could identify areas for further study, e.g. 25% of the core genes with higher sequence diversity in the Campylobacter jejuni and Neisseria meningitidis genomes encoded hypothetical proteins. The core gene with the highest p-distance value in C. jejuni was annotated in the reference genome as a putative hydrolase, but further work revealed that it shared sequence homology with beta-lactamase/metallo-beta-lactamases (enzymes that provide resistance to a range of broad-spectrum antibiotics) and thioredoxin reductase genes (which reduce oxidative stress and are essential for DNA replication) in other C. jejuni genomes. Our Bayesian model of estimating the core genome is principled, easy to use and can be applied to large genome datasets. This study also highlighted the lack of knowledge currently available for many core genes in bacterial genomes of significant global public health importance. PMID:25144616
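
    The p-distance used in the abstract above is the proportion of aligned nucleotide sites at which two sequences differ. The following is a minimal sketch of computing a median pairwise p-distance for one core gene, assuming the alleles are already aligned to equal length; the function names and the toy alleles are illustrative and are not taken from the study.

```python
from itertools import combinations
from statistics import median


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of comparable aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns where either sequence has a gap or an ambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def median_p_distance(alleles: list[str]) -> float:
    """Median p-distance over all pairs of alleles of one gene."""
    return median(p_distance(a, b) for a, b in combinations(alleles, 2))


# Hypothetical aligned alleles of a single core gene.
alleles = ["ATGGCTAAAG", "ATGGCTAAGG", "ATGACTAAGG"]
print(median_p_distance(alleles))
```

    Plotting this per-gene median against each gene's position in a reference genome gives the kind of diversity-by-location view the abstract describes.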

  15. Draft Genome Sequences of Six Novel Bacterial Isolates from Chicken Ceca

    PubMed Central

    Duggett, Nicholas A.; Kay, Gemma L.; Sergeant, Martin J.; Bedford, Michael; Constantinidou, Chrystala I.; Penn, Charles W.; Millard, Andrew D.

    2016-01-01

    The chicken is the most common domesticated animal and the most abundant bird in the world. However, the chicken gut is home to many previously uncharacterized bacterial taxa. Here, we report draft genome sequences from six bacterial isolates from chicken ceca, all of which fall outside any named species. PMID:27231374

  16. Draft Genome Sequences of Six Novel Bacterial Isolates from Chicken Ceca.

    PubMed

    Duggett, Nicholas A; Kay, Gemma L; Sergeant, Martin J; Bedford, Michael; Constantinidou, Chrystala I; Penn, Charles W; Millard, Andrew D; Pallen, Mark J

    2016-01-01

    The chicken is the most common domesticated animal and the most abundant bird in the world. However, the chicken gut is home to many previously uncharacterized bacterial taxa. Here, we report draft genome sequences from six bacterial isolates from chicken ceca, all of which fall outside any named species. PMID:27231374

  17. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    PubMed Central

    Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

    2016-01-01

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available. PMID:27327771

  18. Evolution in an oncogenic bacterial species with extreme genome plasticity: Helicobacter pylori East Asian genomes

    PubMed Central

    2011-01-01

    Background The genome of Helicobacter pylori, an oncogenic bacterium in the human stomach, rapidly evolves and shows wide geographical divergence. The high incidence of stomach cancer in East Asia might be related to bacterial genotype. We used newly developed comparative methods to follow the evolution of East Asian H. pylori genomes using 20 complete genome sequences from Japanese, Korean, Amerind, European, and West African strains. Results A phylogenetic tree of concatenated well-defined core genes supported divergence of the East Asian lineage (hspEAsia; Japanese and Korean) from the European lineage ancestor, and then from the Amerind lineage ancestor. Phylogenetic profiling revealed a large difference in the repertoire of outer membrane proteins (including oipA, hopMN, babABC, sabAB and vacA-2) through gene loss, gain, and mutation. All known functions associated with molybdenum, a rare element essential to nearly all organisms that catalyzes two-electron-transfer oxidation-reduction reactions, appeared to be inactivated. Two pathways linking acetyl~CoA and acetate appeared intact in some Japanese strains. Phylogenetic analysis revealed greater divergence between the East Asian (hspEAsia) and the European (hpEurope) genomes in proteins in host interaction, specifically virulence factors (tipα), outer membrane proteins, and lipopolysaccharide synthesis (human Lewis antigen mimicry) enzymes. Divergence was also seen in proteins in electron transfer and translation fidelity (miaA, tilS), a DNA recombinase/exonuclease that recognizes genome identity (addA), and DNA/RNA hybrid nucleases (rnhAB). Positively selected amino acid changes between hspEAsia and hpEurope were mapped to products of cagA, vacA, homC (outer membrane protein), sotB (sugar transport), and a translation fidelity factor (miaA). Large divergence was seen in genes related to antibiotics: frxA (metronidazole resistance), def (peptide deformylase, drug target), and ftsA (actin-like, drug target

  19. Genomic Epidemiology: Whole-Genome-Sequencing-Powered Surveillance and Outbreak Investigation of Foodborne Bacterial Pathogens.

    PubMed

    Deng, Xiangyu; den Bakker, Henk C; Hendriksen, Rene S

    2016-01-01

    As we are approaching the twentieth anniversary of PulseNet, a network of public health and regulatory laboratories that has changed the landscape of foodborne illness surveillance through molecular subtyping, public health microbiology is undergoing another transformation brought about by so-called next-generation sequencing (NGS) technologies that have made whole-genome sequencing (WGS) of foodborne bacterial pathogens a realistic and superior alternative to traditional subtyping methods. Routine, real-time, and widespread application of WGS in food safety and public health is on the horizon. Technological, operational, and policy challenges are still present and being addressed by an international and multidisciplinary community of researchers, public health practitioners, and other stakeholders. PMID:26772415

  20. Bacterial DNA Sifted from the Trichoplax adhaerens (Animalia: Placozoa) Genome Project Reveals a Putative Rickettsial Endosymbiont

    PubMed Central

    Driscoll, Timothy; Gillespie, Joseph J.; Nordberg, Eric K.; Azad, Abdu F.; Sobral, Bruno W.

    2013-01-01

    Eukaryotic genome sequencing projects often yield bacterial DNA sequences, data typically considered as microbial contamination. However, these sequences may also indicate either symbiont genes or lateral gene transfer (LGT) to host genomes. These bacterial sequences can provide clues about eukaryote–microbe interactions. Here, we used the genome of the primitive animal Trichoplax adhaerens (Metazoa: Placozoa), which is known to harbor an uncharacterized Gram-negative endosymbiont, to search for the presence of bacterial DNA sequences. Bioinformatic and phylogenomic analyses of extracted data from the genome assembly (181 bacterial coding sequences [CDS]) and trace read archive (16S rDNA) revealed a dominant proteobacterial profile strongly skewed to Rickettsiales (Alphaproteobacteria) genomes. By way of phylogenetic analysis of 16S rDNA and 113 proteins conserved across proteobacterial genomes, as well as identification of 27 rickettsial signature genes, we propose a Rickettsiales endosymbiont of T. adhaerens (RETA). The majority (93%) of the identified bacterial CDS belongs to small scaffolds containing prokaryotic-like genes; however, 12 CDS were identified on large scaffolds comprised of eukaryotic-like genes, suggesting that T. adhaerens might have recently acquired bacterial genes. These putative LGTs may coincide with the placozoan’s aquatic niche and symbiosis with RETA. This work underscores the rich, and relatively untapped, resource of eukaryotic genome projects for harboring data pertinent to host–microbial interactions. The nature of unknown (or poorly characterized) bacterial species may only emerge via analysis of host genome sequencing projects, particularly if these species are resistant to cell culturing, as are many obligate intracellular microbes. Our work provides methodological insight for such an approach. PMID:23475938

  1. Facile, High Quality Sequencing of Bacterial Genomes from Small Amounts of DNA

    PubMed Central

    Vuyisich, Momchilo; Arefin, Ayesha; Davenport, Karen; Feng, Shihai; Gleasner, Cheryl; McMurry, Kim; Parson-Quintana, Beverly; Price, Jennifer; Scholz, Matthew; Chain, Patrick

    2014-01-01

    Sequencing bacterial genomes has traditionally required large amounts of genomic DNA (~1 μg). There have been few studies to determine the effects of the input DNA amount or library preparation method on the quality of sequencing data. Several new commercially available library preparation methods enable shotgun sequencing from as little as 1 ng of input DNA. In this study, we evaluated the NEBNext Ultra library preparation reagents for sequencing bacterial genomes. We have evaluated the utility of NEBNext Ultra for resequencing and de novo assembly of four bacterial genomes and compared its performance with the TruSeq library preparation kit. The NEBNext Ultra reagents enable high quality resequencing and de novo assembly of a variety of bacterial genomes when using 100 ng of input genomic DNA. For the two most challenging genomes (Burkholderia spp.), which have the highest GC content and are the longest, we also show that the quality of both resequencing and de novo assembly is not decreased when only 10 ng of input genomic DNA is used. PMID:25478564

  2. Triad pattern algorithm for predicting strong promoter candidates in bacterial genomes

    PubMed Central

    Dekhtyar, Michael; Morin, Amelie; Sakanyan, Vehary

    2008-01-01

    Background Bacterial promoters, which increase the efficiency of gene expression, differ from other promoters by several characteristics. This difference, not yet widely exploited in bioinformatics, looks promising for the development of relevant computational tools to search for strong promoters in bacterial genomes. Results We describe a new triad pattern algorithm that predicts strong promoter candidates in annotated bacterial genomes by matching specific patterns for the group I σ70 factors of Escherichia coli RNA polymerase. It detects promoter-specific motifs by consecutively matching three patterns, consisting of an UP-element, required for interaction with the α subunit, and then optimally-separated patterns of -35 and -10 boxes, required for interaction with the σ70 subunit of RNA polymerase. Analysis of 43 bacterial genomes revealed that the frequency of candidate sequences depends on the A+T content of the DNA under examination. The accuracy of in silico prediction was experimentally validated for the genome of a hyperthermophilic bacterium, Thermotoga maritima, by applying a cell-free expression assay using the predicted strong promoters. In this organism, the strong promoters govern genes for translation, energy metabolism, transport, cell movement, and other as-yet unidentified functions. Conclusion The triad pattern algorithm developed for predicting strong bacterial promoters is well suited for analyzing bacterial genomes with an A+T content of less than 62%. This computational tool opens new prospects for investigating global gene expression, and individual strong promoters in bacteria of medical and/or economic significance. PMID:18471287
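
    The idea of matching three consecutive patterns can be sketched as follows. This is not the authors' algorithm: the abstract does not give their patterns or scoring, so the sketch assumes the textbook σ70 consensus boxes (TTGACA at -35, TATAAT at -10, spacer near 17 bp) and uses the A+T fraction of the 20 bp upstream of the -35 box as a crude stand-in for an UP element. All thresholds and names are hypothetical.

```python
MINUS35 = "TTGACA"  # textbook sigma70 -35 consensus
MINUS10 = "TATAAT"  # textbook sigma70 -10 consensus


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def at_fraction(s: str) -> float:
    """Fraction of A and T bases in a string (0.0 for an empty string)."""
    return sum(c in "AT" for c in s) / len(s) if s else 0.0


def find_candidates(seq, max_mm=1, spacers=(16, 17, 18), up_len=20, min_up_at=0.7):
    """Return (pos35, spacer, pos10) for each triad hit; thresholds are assumptions."""
    seq = seq.upper()
    hits = []
    for i in range(up_len, len(seq) - 5):
        # Pattern 2: the -35 box, allowing a few mismatches.
        if mismatches(seq[i:i + 6], MINUS35) > max_mm:
            continue
        # Pattern 1: an A+T-rich stretch directly upstream (UP-element proxy).
        if at_fraction(seq[i - up_len:i]) < min_up_at:
            continue
        # Pattern 3: the -10 box at an allowed spacer distance.
        for sp in spacers:
            j = i + 6 + sp
            box10 = seq[j:j + 6]
            if len(box10) == 6 and mismatches(box10, MINUS10) <= max_mm:
                hits.append((i, sp, j))
    return hits


# Toy sequence: A+T-rich upstream region, -35 box, 17 bp spacer, -10 box.
toy = "A" * 10 + "T" * 10 + MINUS35 + "GC" * 8 + "G" + MINUS10 + "GCGCGC"
print(find_candidates(toy))
```

    Because the upstream test rewards A+T-rich DNA, a scan of this kind would be expected to report more candidates in A+T-rich genomes, which is in line with the dependence on A+T content noted in the abstract.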

  3. Whole genome sequencing of bacteria in cystic fibrosis as a model for bacterial genome adaptation and evolution.

    PubMed

    Sharma, Poonam; Gupta, Sushim Kumar; Rolain, Jean-Marc

    2014-03-01

    Cystic fibrosis (CF) airways harbor a wide variety of new and/or emerging multidrug resistant bacteria which impose a heavy burden on patients. These bacteria live in close proximity with one another, which increases the frequency of lateral gene transfer. The exchange and movement of mobile genetic elements and genomic islands facilitate the spread of genes between genetically diverse bacteria, which seems to be advantageous to the bacterium as it allows adaptation to the new niches of the CF lungs. Niche adaptation is one of the major evolutionary forces shaping bacterial genome composition and in CF the chronic strains adapt and become less virulent. The purpose of this review is to shed light on CF bacterial genome alterations. Next-generation sequencing technology is an exciting tool that may help us to decipher the genome architecture and the evolution of bacteria colonizing CF lungs. PMID:24502835

  4. BactoGeNIE: a large-scale comparative genome visualization for big displays

    PubMed Central

    2015-01-01

    Background The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. Results In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. Conclusions BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics. PMID:26329021

  5. BactoGeNIE: A large-scale comparative genome visualization for big displays

    DOE PAGES

    Aurisano, Jillian; Reda, Khairi; Johnson, Andrew; Marai, Elisabeta G.; Leigh, Jason

    2015-08-13

    The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. In conclusion, BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics.

  6. BactoGeNIE: A large-scale comparative genome visualization for big displays

    SciTech Connect

    Aurisano, Jillian; Reda, Khairi; Johnson, Andrew; Marai, Elisabeta G.; Leigh, Jason

    2015-08-13

    The volume of complete bacterial genome sequence data available to comparative genomics researchers is rapidly increasing. However, visualizations in comparative genomics--which aim to enable analysis tasks across collections of genomes--suffer from visual scalability issues. While large, multi-tiled and high-resolution displays have the potential to address scalability issues, new approaches are needed to take advantage of such environments, in order to enable the effective visual analysis of large genomics datasets. In this paper, we present Bacterial Gene Neighborhood Investigation Environment, or BactoGeNIE, a novel and visually scalable design for comparative gene neighborhood analysis on large display environments. We evaluate BactoGeNIE through a case study on close to 700 draft Escherichia coli genomes, and present lessons learned from our design process. In conclusion, BactoGeNIE accommodates comparative tasks over substantially larger collections of neighborhoods than existing tools and explicitly addresses visual scalability. Given current trends in data generation, scalable designs of this type may inform visualization design for large-scale comparative research problems in genomics.

  7. Identification of Prophages in Bacterial Genomes by Dinucleotide Relative Abundance Difference

    PubMed Central

    Srividhya, K. V.; Alaguraj, V.; Poornima, G.; Kumar, Dinesh; Singh, G. P.; Raghavenderan, L.; Katta, A. V. S. K. Mohan; Mehta, Preeti; Krishnaswamy, S.

    2007-01-01

    Background Prophages are integrated viral forms in bacterial genomes that have been found to contribute to interstrain genetic variability. Many virulence-associated genes are reported to be prophage encoded. Present computational methods detect prophages either by identifying possible essential proteins such as integrases or by an extension of this technique that involves identifying a region containing proteins similar to those occurring in prophages. These methods suffer from the problem of low sequence similarity at the protein level, which suggests that a nucleotide-based approach could be useful. Methodology Dinucleotide relative abundance (DRA) has previously been used to identify genomic regions that deviate from their neighborhood. We have used the difference in the dinucleotide relative abundance (DRAD) between the bacterial and prophage DNA to help locate DNA stretches that could be of prophage origin in bacterial genomes. Prophage sequences which deviate from bacterial regions in their dinucleotide frequencies are detected by scanning bacterial genome sequences. The method was validated using a subset of genomes with prophage data from literature reports. A web interface for prophage scan based on this method is available at http://bicmku.in:8082/prophagedb/dra.html. Two hundred bacterial genomes which do not have annotated prophages have been scanned for prophage regions using this method. Conclusions The relative dinucleotide distribution difference helps detect prophage regions in genome sequences. The usefulness of this method is seen in the identification of 461 highly probable loci pertaining to prophages that had not previously been annotated as such. This work emphasizes the need to extend the efforts to detect and annotate prophage elements in genome sequences. PMID:18030328
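
    A window scan of the kind described above can be sketched as follows. The abstract does not give the formula, so this sketch assumes the commonly used odds-ratio definition of dinucleotide relative abundance, rho_xy = f_xy / (f_x * f_y), and scores each window by the mean absolute difference of its 16 rho values from those of the whole sequence. The window size, step, scoring and toy sequence are all illustrative assumptions, not the published method.

```python
from collections import Counter
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]


def relative_abundance(seq):
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in "ACGT")
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = sum(mono.values())
    n_di = sum(di[d] for d in DINUCS)
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence has no countable dinucleotides")
    rho = {}
    for d in DINUCS:
        expected = (mono[d[0]] / n_mono) * (mono[d[1]] / n_mono)
        # Dinucleotides whose bases are absent get 0.0 instead of a division error.
        rho[d] = (di[d] / n_di) / expected if expected > 0 else 0.0
    return rho


def dra_difference(rho_a, rho_b):
    """Mean absolute difference between two relative-abundance profiles."""
    return sum(abs(rho_a[d] - rho_b[d]) for d in DINUCS) / len(DINUCS)


def scan_windows(genome, window, step):
    """Score each window against the whole-sequence background profile."""
    background = relative_abundance(genome)
    return [(start, dra_difference(relative_abundance(genome[start:start + window]),
                                   background))
            for start in range(0, len(genome) - window + 1, step)]


# Toy "genome": a host-like repeat with a compositionally distinct insert at 400-599.
toy = "ACGT" * 100 + "AAAATTTT" * 25 + "ACGT" * 100
start, delta = max(scan_windows(toy, 100, 100), key=lambda hit: hit[1])
print(start, round(delta, 3))
```

    In this toy case the highest-scoring window falls inside the inserted stretch. On real genomes, other horizontally acquired regions can deviate as well, so a high score alone would not identify a prophage.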

  8. Construction and analysis of Siberian tiger bacterial artificial chromosome library with approximately 6.5-fold genome equivalent coverage.

    PubMed

    Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2014-01-01

    Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger. PMID:24608928
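
    The coverage arithmetic behind figures like these can be sketched as follows. Genome equivalents are the number of clones times the mean insert size divided by the haploid genome size. The probability function uses the textbook Clarke-Carbon formula, P = 1 - (1 - I/G)^N. The abstract states neither the genome size nor the formula used, so the genome size below is back-calculated from the reported 6.46 genome equivalents (an assumption), and the computed probability is not expected to reproduce the reported 98.86%.

```python
import math


def genome_equivalents(n_clones, insert_kb, genome_kb):
    """Library coverage in haploid genome equivalents."""
    return n_clones * insert_kb / genome_kb


def clarke_carbon_probability(n_clones, insert_kb, genome_kb):
    """Textbook probability that a given unique sequence is in at least one clone."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones


def clones_needed(prob, insert_kb, genome_kb):
    """Smallest clone count that reaches the target probability."""
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - insert_kb / genome_kb))


N_CLONES = 153_600   # from the abstract
INSERT_KB = 116.5    # from the abstract
# Assumption: haploid genome size implied by the reported 6.46 genome equivalents.
GENOME_KB = N_CLONES * INSERT_KB / 6.46

print(round(GENOME_KB / 1e6, 2), "Gb (implied)")
print(round(genome_equivalents(N_CLONES, INSERT_KB, GENOME_KB), 2), "genome equivalents")
print(clarke_carbon_probability(N_CLONES, INSERT_KB, GENOME_KB))
print(clones_needed(0.99, INSERT_KB, GENOME_KB), "clones for P = 0.99")
```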

  9. Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

    PubMed Central

    Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2014-01-01

    Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and an SRY sequence revealed that each of these markers was present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46-fold coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger. PMID:24608928

  10. Construction and characterization of bacterial artificial chromosomes (BACs) containing herpes simplex virus full-length genomes.

    PubMed

    Nagel, Claus-Henning; Pohlmann, Anja; Sodeik, Beate

    2014-01-01

    Bacterial artificial chromosomes (BACs) are suitable vectors not only to maintain the large genomes of herpesviruses in Escherichia coli but also to enable the traceless introduction of any mutation using modern tools of bacterial genetics. To clone a herpes simplex virus genome, a BAC replication origin is first introduced into the viral genome by homologous recombination in eukaryotic host cells. As part of their nuclear replication cycle, genomes of herpesviruses circularize and these replication intermediates are then used to transform bacteria. After cloning, the integrity of the recombinant viral genomes is confirmed by restriction length polymorphism analysis and sequencing. The BACs may then be used to design virus mutants. Upon transfection into eukaryotic cells new herpesvirus strains harboring the desired mutations can be recovered and used for experiments in cultured cells as well as in animal infection models. PMID:24671676

  11. Phages and the Evolution of Bacterial Pathogens: from Genomic Rearrangements to Lysogenic Conversion

    PubMed Central

    Brüssow, Harald; Canchaya, Carlos; Hardt, Wolf-Dietrich

    2004-01-01

    Comparative genomics demonstrated that the chromosomes from bacteria and their viruses (bacteriophages) are coevolving. This process is most evident for bacterial pathogens where the majority contain prophages or phage remnants integrated into the bacterial DNA. Many prophages from bacterial pathogens encode virulence factors. Two situations can be distinguished: Vibrio cholerae, Shiga toxin-producing Escherichia coli, Corynebacterium diphtheriae, and Clostridium botulinum depend on a specific prophage-encoded toxin for causing a specific disease, whereas Staphylococcus aureus, Streptococcus pyogenes, and Salmonella enterica serovar Typhimurium harbor a multitude of prophages and each phage-encoded virulence or fitness factor makes an incremental contribution to the fitness of the lysogen. These prophages behave like “swarms” of related prophages. Prophage diversification seems to be fueled by the frequent transfer of phage material by recombination with superinfecting phages, resident prophages, or occasional acquisition of other mobile DNA elements or bacterial chromosomal genes. Prophages also contribute to the diversification of the bacterial genome architecture. In many cases, they actually represent a large fraction of the strain-specific DNA sequences. In addition, they can serve as anchoring points for genome inversions. The current review presents the available genomics and biological data on prophages from bacterial pathogens in an evolutionary framework. PMID:15353570

  12. [Homologous recombination among bacterial genomes: the measurement and identification].

    PubMed

    Xianwei, Yang; Ruifu, Yang; Yujun, Cui

    2016-02-01

    Homologous recombination is one of the important sources of bacterial population diversity; it disrupts the clonal relationship among different lineages through the horizontal transfer of DNA segments. By blurring vertical inheritance signals, homologous recombination complicates phylogenetic analysis and the reconstruction of population structure. Here we discuss the impact of homologous recombination on inferring phylogenetic relationships among bacterial isolates, and summarize the tools and models used for recombination measurement and for recombination identification. We also highlight the merits and drawbacks of the various approaches, aiming to assist their practical application to the analysis of homologous recombination in bacterial evolution research. PMID:26907777

  13. A highly precise and portable genome engineering method allows comparison of mutational effects across bacterial species

    PubMed Central

    Nyerges, Ákos; Csörgő, Bálint; Nagy, István; Bálint, Balázs; Bihari, Péter; Lázár, Viktória; Apjok, Gábor; Umenhoffer, Kinga; Bogos, Balázs; Pósfai, György; Pál, Csaba

    2016-01-01

    Currently available tools for multiplex bacterial genome engineering are optimized for a few laboratory model strains, demand extensive prior modification of the host strain, and lead to the accumulation of numerous off-target modifications. Building on prior development of multiplex automated genome engineering (MAGE), our work addresses these problems in a single framework. Using a dominant-negative mutant protein of the methyl-directed mismatch repair (MMR) system, we achieved a transient suppression of DNA repair in Escherichia coli, which is necessary for efficient oligonucleotide integration. By integrating all necessary components into a broad-host vector, we developed a new workflow we term pORTMAGE. It allows efficient modification of multiple loci, without any observable off-target mutagenesis and prior modification of the host genome. Because of the conserved nature of the bacterial MMR system, pORTMAGE simultaneously allows genome editing and mutant library generation in other biotechnologically and clinically relevant bacterial species. Finally, we applied pORTMAGE to study a set of antibiotic resistance-conferring mutations in Salmonella enterica and E. coli. Despite over 100 million y of divergence between the two species, mutational effects remained generally conserved. In sum, a single transformation of a pORTMAGE plasmid allows bacterial species of interest to become an efficient host for genome engineering. These advances pave the way toward biotechnological and therapeutic applications. Finally, pORTMAGE allows systematic comparison of mutational effects and epistasis across a wide range of bacterial species. PMID:26884157

  14. A highly precise and portable genome engineering method allows comparison of mutational effects across bacterial species.

    PubMed

    Nyerges, Ákos; Csörgő, Bálint; Nagy, István; Bálint, Balázs; Bihari, Péter; Lázár, Viktória; Apjok, Gábor; Umenhoffer, Kinga; Bogos, Balázs; Pósfai, György; Pál, Csaba

    2016-03-01

    Currently available tools for multiplex bacterial genome engineering are optimized for a few laboratory model strains, demand extensive prior modification of the host strain, and lead to the accumulation of numerous off-target modifications. Building on prior development of multiplex automated genome engineering (MAGE), our work addresses these problems in a single framework. Using a dominant-negative mutant protein of the methyl-directed mismatch repair (MMR) system, we achieved a transient suppression of DNA repair in Escherichia coli, which is necessary for efficient oligonucleotide integration. By integrating all necessary components into a broad-host vector, we developed a new workflow we term pORTMAGE. It allows efficient modification of multiple loci, without any observable off-target mutagenesis and prior modification of the host genome. Because of the conserved nature of the bacterial MMR system, pORTMAGE simultaneously allows genome editing and mutant library generation in other biotechnologically and clinically relevant bacterial species. Finally, we applied pORTMAGE to study a set of antibiotic resistance-conferring mutations in Salmonella enterica and E. coli. Despite over 100 million y of divergence between the two species, mutational effects remained generally conserved. In sum, a single transformation of a pORTMAGE plasmid allows bacterial species of interest to become an efficient host for genome engineering. These advances pave the way toward biotechnological and therapeutic applications. Finally, pORTMAGE allows systematic comparison of mutational effects and epistasis across a wide range of bacterial species. PMID:26884157

  15. Ensembl Genomes 2013: scaling up access to genome-wide data

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...

  16. Modeling cancer metabolism on a genome scale

    PubMed Central

    Yizhak, Keren; Chaneton, Barbara; Gottlieb, Eyal; Ruppin, Eytan

    2015-01-01

    Cancer cells have fundamentally altered cellular metabolism that is associated with their tumorigenicity and malignancy. In addition to the widely studied Warburg effect, several new key metabolic alterations in cancer have been established over the last decade, leading to the recognition that altered tumor metabolism is one of the hallmarks of cancer. Deciphering the full scope and functional implications of the dysregulated metabolism in cancer requires both the advancement of a variety of omics measurements and the advancement of computational approaches for the analysis and contextualization of the accumulated data. Encouragingly, while the metabolic network is highly interconnected and complex, it is at the same time probably the best characterized cellular network. Accordingly, this review discusses the challenges that genome-scale modeling of cancer metabolism has been facing. We survey several recent studies demonstrating the first strides that have been made, testifying to the value of this approach in portraying a network-level view of cancer metabolism and in identifying novel drug targets and biomarkers. Finally, we outline a few new steps that may further advance this field. PMID:26130389

  17. Draft genome sequences for the obligate bacterial predators Bacteriovorax spp. of four phylogenetic clusters

    PubMed Central

    2015-01-01

    Bacteriovorax is the halophilic genus of the obligate bacterial predators, Bdellovibrio and like organisms. The predators are known for their unique biphasic life style in which they search for and attack their prey in the free living phase; penetrate, grow, multiply and lyse the prey in the intraperiplasmic phase. Bacteriovorax isolates representing four phylogenetic clusters were selected for genomic sequencing. Only one type strain genome has been published so far from the genus Bacteriovorax. We report the genomes from non-type strains isolated from aquatic environments. Here we describe and compare the genomic features of the four strains, together with the classification and annotation. PMID:26203326

  18. Genome-Wide Molecular Clock and Horizontal Gene Transfer in Bacterial Evolution

    PubMed Central

    Novichkov, Pavel S.; Omelchenko, Marina V.; Gelfand, Mikhail S.; Mironov, Andrei A.; Wolf, Yuri I.; Koonin, Eugene V.

    2004-01-01

    We describe a simple theoretical framework for identifying orthologous sets of genes that deviate from a clock-like model of evolution. The approach used is based on comparing the evolutionary distances within a set of orthologs to a standard intergenomic distance, which was defined as the median of the distribution of the distances between all one-to-one orthologs. Under the clock-like model, the points on a plot of intergenic distances versus intergenomic distances are expected to fit a straight line. A statistical technique to identify significant deviations from the clock-like behavior is described. For several hundred analyzed orthologous sets representing three well-defined bacterial lineages, the α-Proteobacteria, the γ-Proteobacteria, and the Bacillus-Clostridium group, the clock-like null hypothesis could not be rejected for ∼70% of the sets, whereas the rest showed substantial anomalies. Subsequent detailed phylogenetic analysis of the genes with the strongest deviations indicated that over one-half of these genes probably underwent a distinct form of horizontal gene transfer, xenologous gene displacement, in which a gene is displaced by an ortholog from a different lineage. The remaining deviations from the clock-like model could be explained by lineage-specific acceleration of evolution. The results indicate that although xenologous gene displacement is a major force in bacterial evolution, a significant majority of orthologous gene sets in three major bacterial lineages evolved in accordance with the clock-like model. The approach described here allows rapid detection of deviations from this mode of evolution on the genome scale. PMID:15375139
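
    The comparison described in this abstract can be sketched in a few lines of Python. The sketch takes the intergenomic distance for each genome pair as the median over all ortholog sets, as the abstract states, and then fits each ortholog set against it. The fit through the origin, the relative RMS residual used as a deviation score, the 0.2 cutoff and the toy distances are all assumptions made for illustration; the abstract does not describe the authors' statistical technique.

```python
from math import sqrt
from statistics import median


def intergenomic_distances(ortholog_sets):
    """Median distance per genome pair, taken over all ortholog sets."""
    by_pair = {}
    for distances in ortholog_sets.values():
        for pair, d in distances.items():
            by_pair.setdefault(pair, []).append(d)
    return {pair: median(ds) for pair, ds in by_pair.items()}


def clock_deviation(distances, reference):
    """Fit d_gene = slope * d_genome through the origin (assumed model).

    Returns (slope, relative RMS residual); a clock-like set gives ~0.
    """
    pairs = sorted(distances)
    x = [reference[p] for p in pairs]
    y = [distances[p] for p in pairs]
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    rms = sqrt(sum((b - slope * a) ** 2 for a, b in zip(x, y)) / len(x))
    return slope, rms / (sum(y) / len(y))


PAIRS = [("A", "B"), ("A", "C"), ("B", "C")]  # hypothetical genomes


def make_set(values):
    return dict(zip(PAIRS, values))


# Toy data: g1-g4 evolve at different but constant rates, g5 does not.
ortholog_sets = {
    "g1": make_set([0.05, 0.10, 0.15]),
    "g2": make_set([0.10, 0.20, 0.30]),
    "g3": make_set([0.10, 0.20, 0.30]),
    "g4": make_set([0.15, 0.30, 0.45]),
    "g5": make_set([0.30, 0.10, 0.30]),
}

reference = intergenomic_distances(ortholog_sets)
THRESHOLD = 0.2  # ASSUMPTION: arbitrary cutoff, not from the abstract
flagged = [name for name, d in ortholog_sets.items()
           if clock_deviation(d, reference)[1] > THRESHOLD]
print("sets deviating from the clock-like model:", flagged)
```

    Using the median as the reference keeps a single anomalous set such as g5 from shifting the intergenomic distances.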

  19. Segmentation of genomic DNA through entropic divergence: Power laws and scaling

    NASA Astrophysics Data System (ADS)

    Azad, Rajeev K.; Bernaola-Galván, Pedro; Ramaswamy, Ramakrishna; Rao, J. Subba

    2002-05-01

    Genomic DNA is fragmented into segments using the Jensen-Shannon divergence. Use of this criterion results in the fragments being entropically homogeneous to within a predefined level of statistical significance. Application of this procedure is made to complete genomes of organisms from archaebacteria, eubacteria, and eukaryotes. The distribution of fragment lengths in bacterial and primitive eukaryotic DNAs shows two distinct regimes of power-law scaling. The characteristic length separating these two regimes appears to be an intrinsic property of the sequence rather than a finite-size artifact, and is independent of the significance level used in segmenting a given genome. Fragment length distributions obtained in the segmentation of the genomes of more highly evolved eukaryotes do not have such distinct regimes of power-law behavior.
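
    As an illustration of the criterion named in this abstract, the Python sketch below finds the single cut point that maximizes the Jensen-Shannon divergence between the base compositions of the left and right parts of a sequence. It is only a sketch under simplifying assumptions: the full procedure applies such cuts recursively and accepts a cut only if it passes a significance test, and both steps are omitted here. The function names are illustrative.

```python
from collections import Counter
from math import log2


def entropy(counts, n):
    """Shannon entropy (bits) of a base composition given as counts."""
    return -sum(c / n * log2(c / n) for c in counts.values() if c)


def best_split(seq):
    """Return (cut, js): the cut position maximizing the length-weighted
    Jensen-Shannon divergence between seq[:cut] and seq[cut:]."""
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_cut, best_js = None, -1.0
    for cut in range(1, n):
        left[seq[cut - 1]] += 1
        right = total - left  # Counter subtraction keeps positive counts
        js = (h_all
              - (cut / n) * entropy(left, cut)
              - ((n - cut) / n) * entropy(right, n - cut))
        if js > best_js:
            best_cut, best_js = cut, js
    return best_cut, best_js


# Toy sequence with two compositionally homogeneous domains.
cut, js = best_split("A" * 20 + "G" * 20)
print(cut, js)
```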

  20. Evidence of codon usage in the nearest neighbor spacing distribution of bases in bacterial genomes

    NASA Astrophysics Data System (ADS)

    Higareda, M. F.; Geiger, O.; Mendoza, L.; Méndez-Sánchez, R. A.

    2012-02-01

    Statistical analysis of whole genomic sequences usually assumes a homogeneous nucleotide density throughout the genome, an assumption that has been proved incorrect for several organisms since the nucleotide density is only locally homogeneous. To avoid giving a single numerical value to this variable property, we propose the use of spectral statistics, which characterizes the density of nucleotides as a function of its position in the genome. We show that the cumulative density of bases in bacterial genomes can be separated into an average (or secular) plus a fluctuating part. Bacterial genomes can be divided into two groups according to the qualitative description of their secular part: linear and piecewise linear. These two groups of genomes show different properties when their nucleotide spacing distribution is studied. In order to analyze genomes having a variable nucleotide density, statistically, the use of unfolding is necessary, i.e., to get a separation between the secular part and the fluctuations. The unfolding allows an adequate comparison with the statistical properties of other genomes. With this methodology, four genomes were analyzed: Burkholderia, Bacillus, Clostridium and Corynebacterium. Interestingly, the nearest neighbor spacing distributions or detrended distance distributions are very similar for species within the same genus but they are very different for species from different genera. This difference can be attributed to the difference in the codon usage.
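
    The unfolding step mentioned in this abstract can be illustrated with a short Python sketch. It assumes the simplest case the abstract names, a linear secular part: the cumulative count of one base is fitted with a straight line, positions are mapped through that line, and the nearest neighbor spacings of the mapped positions then have a mean close to one. A piecewise linear secular part would need a separate fit per segment, which is not shown. The function names and the toy sequence are illustrative assumptions.

```python
def base_positions(seq, base):
    """Positions at which the given base occurs."""
    return [i for i, b in enumerate(seq) if b == base]


def linear_fit(xs, ys):
    """Ordinary least-squares line y = a * x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def unfolded_spacings(seq, base):
    """Nearest neighbor spacings after removing a linear secular trend."""
    pos = base_positions(seq, base)
    counts = list(range(1, len(pos) + 1))  # cumulative count at each hit
    a, b = linear_fit(pos, counts)         # secular (average) part
    unfolded = [a * x + b for x in pos]
    return [t - s for s, t in zip(unfolded, unfolded[1:])]


# Toy sequence: "A" occurs every fourth position, so all spacings are ~1.
spacings = unfolded_spacings("ACGT" * 50, "A")
print(len(spacings), min(spacings), max(spacings))
```

    A histogram of such spacings for a real genome is the nearest neighbor spacing distribution that the abstract compares across genera.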

  1. Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis

    PubMed Central

    Facey, Paul D.; Méric, Guillaume; Hitchings, Matthew D.; Pachebat, Justin A.; Hegarty, Matt J.; Chen, Xiaorui; Morgan, Laura V.A.; Hoeppner, James E.; Whitten, Miranda M.A.; Kirk, William D.J.; Dyson, Paul J.; Sheppard, Sam K.; Sol, Ricardo Del

    2015-01-01

    Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. PMID:26185096

  2. Loss of Conserved Noncoding RNAs in Genomes of Bacterial Endosymbionts.

    PubMed

    Matelska, Dorota; Kurkowska, Malgorzata; Purta, Elzbieta; Bujnicki, Janusz M; Dunin-Horkawicz, Stanislaw

    2016-02-01

    The genomes of intracellular symbiotic or pathogenic bacteria, such as those of Buchnera, Mycoplasma, and Rickettsia, are typically smaller than those of their free-living counterparts. Here we showed that noncoding RNA (ncRNA) families, which are conserved in free-living bacteria, frequently could not be detected by computational methods in the small genomes. Statistical tests demonstrated that their absence is not an artifact of low GC content or small deletions in these small genomes, and thus it was indicative of an independent loss of ncRNAs in different endosymbiotic lineages. By analyzing the synteny (conservation of gene order) between the reduced and nonreduced genomes, we revealed instances of protein-coding genes that were preserved in the reduced genomes but lost cis-regulatory elements. We found that the loss of cis-regulatory ncRNA sequences, which regulate the expression of cognate protein-coding genes, is characterized by the reduction of secondary structure formation propensity, GC content, and length of the corresponding genomic regions. PMID:26782934

  3. Loss of Conserved Noncoding RNAs in Genomes of Bacterial Endosymbionts

    PubMed Central

    Matelska, Dorota; Kurkowska, Malgorzata; Purta, Elzbieta; Bujnicki, Janusz M.; Dunin-Horkawicz, Stanislaw

    2016-01-01

    The genomes of intracellular symbiotic or pathogenic bacteria, such as those of Buchnera, Mycoplasma, and Rickettsia, are typically smaller than those of their free-living counterparts. Here we showed that noncoding RNA (ncRNA) families, which are conserved in free-living bacteria, frequently could not be detected by computational methods in the small genomes. Statistical tests demonstrated that their absence is not an artifact of low GC content or small deletions in these small genomes, and thus it was indicative of an independent loss of ncRNAs in different endosymbiotic lineages. By analyzing the synteny (conservation of gene order) between the reduced and nonreduced genomes, we revealed instances of protein-coding genes that were preserved in the reduced genomes but lost cis-regulatory elements. We found that the loss of cis-regulatory ncRNA sequences, which regulate the expression of cognate protein-coding genes, is characterized by the reduction of secondary structure formation propensity, GC content, and length of the corresponding genomic regions. PMID:26782934

  4. The Bacterial Origins of the CRISPR Genome-Editing Revolution.

    PubMed

    Sontheimer, Erik J; Barrangou, Rodolphe

    2015-07-01

    Like most of the tools that enable modern life science research, the recent genome-editing revolution has its biological roots in the world of bacteria and archaea. Clustered, regularly interspaced, short palindromic repeats (CRISPR) loci are found in the genomes of many bacteria and most archaea, and underlie an adaptive immune system that protects the host cell against invasive nucleic acids such as viral genomes. In recent years, engineered versions of these systems have enabled efficient DNA targeting in living cells from dozens of species (including humans and other eukaryotes), and the exploitation of the resulting endogenous DNA repair pathways has provided a route to fast, easy, and affordable genome editing. In only three years after RNA-guided DNA cleavage was first harnessed, the ability to edit genomes via simple, user-defined RNA sequences has already revolutionized nearly all areas of biological science. CRISPR-based technologies are now poised to similarly revolutionize many facets of clinical medicine, and even promise to advance the long-term goal of directly editing genomic sequences of patients with inherited disease. In this review, we describe the biological and mechanistic basis for these remarkable immune systems, and how their engineered derivatives are revolutionizing basic and clinical research. PMID:26078042

  5. Synchronized navigation and comparative analyses across Ensembl complete bacterial genomes with INSYGHT

    PubMed Central

    Lacroix, Thomas; Thérond, Sylvie; Rugeri, Marc; Nicolas, Pierre; Gendrault, Annie; Loux, Valentin; Gibrat, Jean-François

    2016-01-01

    Motivation: High-throughput sequencing technologies provide access to an increasing number of bacterial genomes. Today, many analyses involve the comparison of biological properties among many strains of a given species, or among species of a particular genus. Tools that can help the microbiologist with these tasks become increasingly important. Results: Insyght is a comparative visualization tool whose core features combine a synchronized navigation across genomic data of multiple organisms with a versatile interoperability between complementary views. In this work, we have greatly increased the scope of the Insyght public dataset by including 2688 complete bacterial genomes available in Ensembl thus vastly improving its phylogenetic coverage. We also report the development of a virtual machine that allows users to easily set up and customize their own local Insyght server. Availability and implementation: http://genome.jouy.inra.fr/Insyght Contact: Thomas.Lacroix@jouy.inra.fr PMID:26607491

  6. Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms.

    PubMed

    Jeong, Haeyoung; Lee, Dae-Hee; Ryu, Choong-Min; Park, Seung-Hwan

    2016-01-01

    PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction. PMID:26464377

  7. The most deviated codon position in AT-rich bacterial genomes: a function related analysis.

    PubMed

    Ma, Bin-Guang; Chen, Ling-Ling

    2005-10-01

    We have performed a systematic study on more than 120 archaeal and bacterial genomes. Based on the index proposed in the current paper, clear patterns are observed showing the relation between the base compositional deviation at three codon positions and the genomic GC content. For AT-rich genomes, the Most Deviated Codon Position (MDCP) is the 1st codon position, while for GC-rich genomes, MDCP appears at the 2nd or 3rd codon position alternatively. According to MDCP, the CDSs of a genome can be classified into two types: typical and atypical. In AT-rich genomes the typical CDSs represent the majority and account for about 3/4 of all the CDSs. Based on the functional classification of the COG database, the two types of CDSs are examined. An apparent distribution bias is observed: CDSs with the function of 'information processing' are more likely to be of the typical type. PMID:16060688

  8. Gene identification in bacterial and organellar genomes using GeneScan.

    PubMed

    Ramakrishna, R; Srinivasan, R

    1999-03-30

    The performance of the GeneScan algorithm for gene identification has been improved by incorporation of a directed iterative scanning procedure. Application is made here to the cases of bacterial and organellar genomes. The sensitivity of gene identification was 100% in the Plasmodium falciparum plastid-like genome (35 kb) and 98% in the Mycoplasma genitalium genome (approximately 580 kb) and the Haemophilus influenzae Rd genome (approximately 1.8 Mb). Sensitivity was found to improve in both the Open Reading Frames (ORFs) which have been identified as genes (by homology or by other methods) and those that are classified as hypothetical. False positive assignments (at the nucleotide level) were 0.25% in the H. influenzae genome and 0.3% in M. genitalium. There were no false positive assignments in the plastid-like genome. The agreement between the GeneScan predictions and GeneMark predictions of putative ORFs was 97% in the M. genitalium genome and 86% in the H. influenzae genome. In terms of an exact match between predicted genes/ORFs and the annotation in the databank, GeneScan performance was evaluated to be between 72% and 90% in different genomes. We predict five putative ORFs that were not annotated earlier in the GenBank files for both M. genitalium and H. influenzae genomes. Our preliminary analysis of the newly sequenced G + C rich genome of Mycobacterium tuberculosis H37Rv also shows comparable sensitivity (99%). PMID:10353188

  9. A Markovian analysis of bacterial genome sequence constraints

    PubMed Central

    Skewes, Aaron D.

    2013-01-01

    The arrangement of nucleotides within a bacterial chromosome is influenced by numerous factors. The degeneracy of the third codon within each reading frame allows some flexibility of nucleotide selection; however, the third nucleotide in the triplet of each codon is at least partly determined by the preceding two. This is most evident in organisms with a strong G + C bias, as the degenerate codon must contribute disproportionately to maintaining that bias. Therefore, a correlation exists between the first two nucleotides and the third in all open reading frames. If the arrangement of nucleotides in a bacterial chromosome is represented as a Markov process, we would expect that the correlation would be completely captured by a second-order Markov model and an increase in the order of the model (e.g., third-, fourth-…order) would not capture any additional uncertainty in the process. In this manuscript, we present the results of a comprehensive study of the Markov property that exists in the DNA sequences of 906 bacterial chromosomes. All of the 906 bacterial chromosomes studied exhibit a statistically significant Markov property that extends beyond second-order, and therefore cannot be fully explained by codon usage. An unrooted tree containing all 906 bacterial chromosomes based on their transition probability matrices of third-order shares ∼25% similarity to a tree based on sequence homologies of 16S rRNA sequences. This congruence to the 16S rRNA tree is greater than for trees based on lower-order models (e.g., second-order), and higher-order models result in diminishing improvements in congruence. A nucleotide correlation most likely exists within every bacterial chromosome that extends past three nucleotides. This correlation places significant limits on the number of nucleotide sequences that can represent probable bacterial chromosomes. Transition matrix usage is largely conserved by taxa, indicating that this property is likely inherited, however some
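
    To make the idea of a k-th order Markov model of a chromosome concrete, the Python sketch below estimates transition probabilities from k-mer contexts and computes the conditional entropy of the next base given the preceding k bases. It is an illustration only: the function names and the toy sequence are assumptions, and on finite data the raw conditional entropy can only stay the same or decrease as the order grows, so a real comparison of model orders needs a statistical test or a complexity penalty that is not shown here.

```python
from collections import Counter, defaultdict
from math import log2


def context_counts(seq, order):
    """Count the bases that follow each context of length `order`."""
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        counts[seq[i:i + order]][seq[i + order]] += 1
    return counts


def transition_probabilities(seq, order):
    """Maximum-likelihood estimate of P(next base | previous `order` bases)."""
    return {ctx: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for ctx, nxt in context_counts(seq, order).items()}


def conditional_entropy(seq, order):
    """Entropy (bits) of the next base given the previous `order` bases."""
    counts = context_counts(seq, order)
    total = sum(sum(nxt.values()) for nxt in counts.values())
    h = 0.0
    for nxt in counts.values():
        ctx_total = sum(nxt.values())
        for c in nxt.values():
            h -= (c / total) * log2(c / ctx_total)
    return h


toy = "ACG" * 10  # toy sequence; a real analysis would use a chromosome
for k in (0, 1, 2):
    print(k, round(conditional_entropy(toy, k), 3))
print(transition_probabilities(toy, 2)["AC"])
```

    In the toy sequence every base is fully determined by its predecessor, so the entropy drops to zero at order 1 and higher orders add nothing.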

  10. Comparative genomics and functional annotation of bacterial transporters

    NASA Astrophysics Data System (ADS)

    Gelfand, Mikhail S.; Rodionov, Dmitry A.

    2008-03-01

    Transport proteins are difficult to study experimentally, and because of that their functional characterization trails that of enzymes. Comparative genomic analysis is a powerful approach to functional annotation of proteins, which makes it possible to utilize the genomic sequence data from thousands of organisms. The use of computational techniques allows one to identify candidate transporters, predict their structure and localization in the membrane, and perform detailed functional annotation, which includes substrate specificity and cellular role. We review the main techniques for analyzing transporter structure and function. We consider the most popular algorithms to identify transmembrane segments in protein sequences and to predict topology of multispanning proteins. We describe the main approaches of comparative genomics, and how they may be applied to the analysis of transporters, and provide examples showing how combinations of these techniques are used for functional annotation of new transporter specificities in known families, characterization of new families, and prediction of novel transport mechanisms.

  11. Identification of large-scale genomic variation in cancer genomes using in silico reference models.

    PubMed

    Killcoyne, Sarah; Del Sol, Antonio

    2016-01-01

    Identifying large-scale structural variation in cancer genomes continues to be a challenge to researchers. Current methods rely on genome alignments based on a reference that can be a poor fit to highly variant and complex tumor genomes. To address this challenge we developed a method that uses available breakpoint information to generate models of structural variations. We use these models as references to align previously unmapped and discordant reads from a genome. By using these models to align unmapped reads, we show that our method can help to identify large-scale variations that have been previously missed. PMID:26264669

  12. Identification of large-scale genomic variation in cancer genomes using in silico reference models

    PubMed Central

    Killcoyne, Sarah; del Sol, Antonio

    2016-01-01

    Identifying large-scale structural variation in cancer genomes continues to be a challenge to researchers. Current methods rely on genome alignments based on a reference that can be a poor fit to highly variant and complex tumor genomes. To address this challenge we developed a method that uses available breakpoint information to generate models of structural variations. We use these models as references to align previously unmapped and discordant reads from a genome. By using these models to align unmapped reads, we show that our method can help to identify large-scale variations that have been previously missed. PMID:26264669

  13. The Extent of Genome Flux and Its Role in the Differentiation of Bacterial Lineages

    PubMed Central

    Nowell, Reuben W.; Green, Sarah; Laue, Bridget E.; Sharp, Paul M.

    2014-01-01

    Horizontal gene transfer (HGT) and gene loss are key processes in bacterial evolution. However, the role of gene gain and loss in the emergence and maintenance of ecologically differentiated bacterial populations remains an open question. Here, we use whole-genome sequence data to quantify gene gain and loss for 27 lineages of the plant-associated bacterium Pseudomonas syringae. We apply an extensive error-control procedure that accounts for errors in draft genome data and greatly improves the accuracy of patterns of gene occurrence among these genomes. We demonstrate a history of extensive genome fluctuation for this species and show that individual lineages could have acquired thousands of genes in the same period in which a 1% amino acid divergence accrues in the core genome. Elucidating the dynamics of genome fluctuation reveals the rapid turnover of gained genes, such that the majority of recently gained genes are quickly lost. Despite high observed rates of fluctuation, a phylogeny inferred from patterns of gene occurrence is similar to a phylogeny based on amino acid replacements within the core genome. Furthermore, the core genome phylogeny suggests that P. syringae should be considered a number of distinct species, with levels of divergence at least equivalent to those between recognized bacterial species. Gained genes are transferred from a variety of sources, reflecting the depth and diversity of the potential gene pool available via HGT. Overall, our results provide further insights into the evolutionary dynamics of genome fluctuation and implicate HGT as a major factor contributing to the diversification of P. syringae lineages. PMID:24923323

  14. Design and synthesis of a minimal bacterial genome.

    PubMed

    Hutchison, Clyde A; Chuang, Ray-Yuan; Noskov, Vladimir N; Assad-Garcia, Nacyra; Deerinck, Thomas J; Ellisman, Mark H; Gill, John; Kannan, Krishna; Karas, Bogumil J; Ma, Li; Pelletier, James F; Qi, Zhi-Qing; Richter, R Alexander; Strychalski, Elizabeth A; Sun, Lijie; Suzuki, Yo; Tsvetanova, Billyana; Wise, Kim S; Smith, Hamilton O; Glass, John I; Merryman, Chuck; Gibson, Daniel G; Venter, J Craig

    2016-03-25

    We used whole-genome design and complete chemical synthesis to minimize the 1079-kilobase pair synthetic genome of Mycoplasma mycoides JCVI-syn1.0. An initial design, based on collective knowledge of molecular biology combined with limited transposon mutagenesis data, failed to produce a viable cell. Improved transposon mutagenesis methods revealed a class of quasi-essential genes that are needed for robust growth, explaining the failure of our initial design. Three cycles of design, synthesis, and testing, with retention of quasi-essential genes, produced JCVI-syn3.0 (531 kilobase pairs, 473 genes), which has a genome smaller than that of any autonomously replicating cell found in nature. JCVI-syn3.0 retains almost all genes involved in the synthesis and processing of macromolecules. Unexpectedly, it also contains 149 genes with unknown biological functions. JCVI-syn3.0 is a versatile platform for investigating the core functions of life and for exploring whole-genome design. PMID:27013737

  15. Intron-genome size relationship on a large evolutionary scale.

    PubMed

    Vinogradov, A E

    1999-09-01

    The intron-genome size relationship was studied across a wide evolutionary range (from slime mold and yeast to human and maize), as well as the relationship between genome size and the ratio of intervening/coding sequence size. The average intron size is scaled to genome size with a slope of about one-fourth for the log-transformed values; i.e., on the global scale its increase in evolution is lower than the increase in genome size by four orders of magnitude. There are exceptions to the general trend. In baker's yeast introns are extraordinarily long for its genome size. Tetrapods also have longer introns than expected for their genome sizes. In teleost fish the mean intron size does not differ significantly, notwithstanding the differences in genome size. In contrast to previous reports, avian introns were not found to be significantly shorter than introns of mammals, although avian genomes are smaller than genomes of mammals on average by about a factor of 2.5. The extra-/intragenic ratio of noncoding DNA can be higher in fungi than in animals, notwithstanding the smaller fungal genomes. In vertebrates and invertebrates taken separately, this ratio increases with genome size. Two hypotheses are proposed to explain the variation in the extra-/intragenic ratio of noncoding DNA in organisms with similar numbers of genes: transition (dynamic) and equilibrium (static). According to the transition model, this variation arises with the rapid shift of genome size because the bulk of extragenic DNA can be changed more rapidly than the finely interspersed intron sequences. The equilibrium model assumes that this variation is a result of selective adjustment of genome size with constraints imposed on the intron size due to its putative link to chromatin structure (and constraints of the splicing machinery). PMID:10473779
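
    As a rough illustration of what a log-log slope of about one-fourth implies, the sketch below treats the relationship as a simple power law (mean intron size proportional to genome size raised to the slope). The function name and the power-law reading are assumptions made for illustration; they are not taken from the paper.

```python
def expected_intron_fold_change(genome_fold_change, slope=0.25):
    """Fold change in mean intron size implied by a log-log slope.

    Assumes log(intron) = slope * log(genome) + const, so a fold change
    in genome size maps to genome_fold_change ** slope for introns.
    The default slope of 0.25 is the approximate value from the abstract.
    """
    if genome_fold_change <= 0:
        raise ValueError("fold change must be positive")
    return genome_fold_change ** slope


if __name__ == "__main__":
    # Compare genome size ratios with the implied intron size ratios.
    for fold in (16, 10_000):
        print(fold, expected_intron_fold_change(fold))
```

    Under this reading, a genome that is four orders of magnitude larger would be expected to carry introns that are only about one order of magnitude longer.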

  16. Beginner’s guide to comparative bacterial genome analysis using next-generation sequence data

    PubMed Central

    2013-01-01

    High throughput sequencing is now fast and cheap enough to be considered part of the toolbox for investigating bacteria, and there are thousands of bacterial genome sequences available for comparison in the public domain. Bacterial genome analysis is increasingly being performed by diverse groups in research, clinical and public health labs alike, who are interested in a wide array of topics related to bacterial genetics and evolution. Examples include outbreak analysis and the study of pathogenicity and antimicrobial resistance. In this beginner’s guide, we aim to provide an entry point for individuals with a biology background who want to perform their own bioinformatics analysis of bacterial genome data, to enable them to answer their own research questions. We assume readers will be familiar with genetics and the basic nature of sequence data, but do not assume any computer programming skills. The main topics covered are assembly, ordering of contigs, annotation, genome comparison and extracting common typing information. Each section includes worked examples using publicly available E. coli data and free software tools, all of which can be performed on a desktop computer. PMID:23575213

  17. Draft genome sequence of Xanthomonas arboricola strain 3004, causal agent of bacterial disease on barley

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report here the annotated genome sequence of Xanthomonas arboricola str. 3004, a strain of a Gram-negative phytopathogenic bacterial species that includes several pathovars characterized by virulence specificity. Strain 3004 was isolated from barley leaves with symptoms of streak (bacterial blight) and also can infec...

  18. Complete Genome Sequence of a Human Cytomegalovirus Strain AD169 Bacterial Artificial Chromosome Clone

    PubMed Central

    Ostermann, Eleonore; Spohn, Michael; Indenbirken, Daniela

    2016-01-01

    The complete sequence of the human cytomegalovirus strain AD169 (variant ATCC) cloned as a bacterial artificial chromosome (AD169-BAC, also known as HB15 or pHB15) was determined. The viral genome has a length of 230,290 bp and shows 52 nucleotide differences compared to a previously sequenced AD169varATCC clone. PMID:27034483

  19. On the analysis of large-scale genomic structures.

    PubMed

    Oiwa, Nestor Norio; Goldman, Carla

    2005-01-01

    We apply methods from statistical physics (histograms, correlation functions, fractal dimensions, and singularity spectra) to characterize large-scale structure of the distribution of nucleotides along genomic sequences. We discuss the role of the extension of noncoding segments ("junk DNA") for the genomic organization, and the connection between the coding segment distribution and the high-eukaryotic chromatin condensation. The following sequences taken from GenBank were analyzed: complete genome of Xanthomonas campestris, complete genome of yeast, chromosome V of Caenorhabditis elegans, and human chromosome XVII around gene BRCA1. The results are compared with the random and periodic sequences and those generated by simple and generalized fractal Cantor sets. PMID:15858230

  20. Distribution of repetitive DNA sequences in eubacteria and application to fingerprinting of bacterial genomes.

    PubMed Central

    Versalovic, J; Koeuth, T; Lupski, J R

    1991-01-01

    Dispersed repetitive DNA sequences have been described recently in eubacteria. To assess the distribution and evolutionary conservation of two distinct prokaryotic repetitive elements, consensus oligonucleotides were used in polymerase chain reaction [PCR] amplification and slot blot hybridization experiments with genomic DNA from diverse eubacterial species. Oligonucleotides matching Repetitive Extragenic Palindromic [REP] elements and Enterobacterial Repetitive Intergenic Consensus [ERIC] sequences were synthesized and tested as opposing PCR primers in the amplification of eubacterial genomic DNA. REP and ERIC consensus oligonucleotides produced clearly resolvable bands by agarose gel electrophoresis following PCR amplification. These band patterns provided unambiguous DNA fingerprints of different eubacterial species and strains. Both REP and ERIC probes hybridized preferentially to genomic DNA from Gram-negative enteric bacteria and related species. Widespread distribution of these repetitive DNA elements in the genomes of various microorganisms should enable rapid identification of bacterial species and strains, and be useful for the analysis of prokaryotic genomes. PMID:1762913

  1. Using Genome-Scale Models to Predict Biological Capabilities

    PubMed Central

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods at the genome-scale have been under development since the first whole genome sequences appeared in the mid-1990s. A few years ago this approach began to demonstrate the ability to predict a range of cellular functions including cellular growth capabilities on various substrates and the effect of gene knockouts at the genome-scale. Thus, much interest has developed in understanding and applying these methods to areas such as metabolic engineering, antibiotic design, and organismal and enzyme evolution. This primer will get you started. PMID:26000478

  2. Large Scale Bacterial Colony Screening of Diversified FRET Biosensors

    PubMed Central

    Litzlbauer, Julia; Schifferer, Martina; Ng, David; Fabritius, Arne; Thestrup, Thomas; Griesbeck, Oliver

    2015-01-01

    Biosensors based on Förster Resonance Energy Transfer (FRET) between fluorescent protein mutants have started to revolutionize physiology and biochemistry. However, many types of FRET biosensors show relatively small FRET changes, making measurements with these probes challenging when used under sub-optimal experimental conditions. Thus, a major effort in the field currently lies in designing new optimization strategies for these types of sensors. Here we describe procedures for optimizing FRET changes by large scale screening of mutant biosensor libraries in bacterial colonies. We describe optimization of biosensor expression, permeabilization of bacteria, software tools for analysis, and screening conditions. The procedures reported here may help in improving FRET changes in multiple suitable classes of biosensors. PMID:26061878

  3. ANItools web: a web tool for fast genome comparison within multiple bacterial strains

    PubMed Central

    Han, Na; Qiang, Yujun; Zhang, Wen

    2016-01-01

    Background: Early classification of prokaryotes was based solely on phenotypic similarities, but modern prokaryote characterization has been strongly influenced by advances in genetic methods. With the fast development of sequencing technology, the ever increasing number of genomic sequences per species offers the possibility of developing distance determinations based on whole-genome information. The average nucleotide identity (ANI), calculated from pair-wise comparisons of all sequences shared between two given strains, has been proposed as a new metric for bacterial species definition and classification. Results: In this study, we developed the web version of ANItools (http://ani.mypathogen.cn/), which helps users directly get ANI values from online sources. A database covering ANI values of any two strains in a genus was also included (2773 strains, 1487 species and 668 genera). Importantly, ANItools web can automatically run genome comparison between the input genomic sequence and data sequences (Genus and Species levels), and generate a graphical report for ANI calculation results. Conclusion: ANItools web is useful for defining the relationship between bacterial strains, further contributing to the classification and identification of bacterial species using genome data. Database URL: http://ani.mypathogen.cn/ PMID:27270714
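
    The idea of averaging identity over shared sequences can be sketched as follows. This is a minimal illustration, not the ANItools implementation: it assumes the pairwise fragment alignments have already been computed elsewhere, and the record fields (`identity`, `aligned_len`, `fragment_len`) and the two cutoff values are hypothetical choices made for the example.

```python
def average_nucleotide_identity(fragment_hits, min_identity=30.0, min_coverage=0.7):
    """Mean percent identity over fragments considered shared by two genomes.

    fragment_hits: list of dicts with hypothetical keys
        identity      percent identity of the best alignment (0-100)
        aligned_len   number of aligned bases
        fragment_len  length of the query fragment
    A fragment counts as shared only if it passes both cutoffs; the
    cutoff values here are illustrative assumptions.
    Returns None when no fragment qualifies.
    """
    kept = [
        hit["identity"]
        for hit in fragment_hits
        if hit["identity"] >= min_identity
        and hit["aligned_len"] / hit["fragment_len"] >= min_coverage
    ]
    if not kept:
        return None
    return sum(kept) / len(kept)


if __name__ == "__main__":
    hits = [
        {"identity": 98.0, "aligned_len": 1000, "fragment_len": 1020},
        {"identity": 96.0, "aligned_len": 900, "fragment_len": 1020},
        {"identity": 80.0, "aligned_len": 300, "fragment_len": 1020},  # low coverage, dropped
    ]
    print(average_nucleotide_identity(hits))
```

    Filtering before averaging keeps short or weak matches from dragging the value down, which is why only fragments judged to be shared contribute to the mean.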

  4. Bacterial communities in full-scale wastewater treatment systems.

    PubMed

    Cydzik-Kwiatkowska, Agnieszka; Zielińska, Magdalena

    2016-04-01

    Bacterial metabolism determines the effectiveness of biological treatment of wastewater. Therefore, it is important to define the relations between the species structure and the performance of full-scale installations. Although there is much laboratory data on microbial consortia, our understanding of dependencies between the microbial structure and operational parameters of full-scale wastewater treatment plants (WWTP) is limited. This mini-review presents the types of microbial consortia in WWTP. Information is given on extracellular polymeric substances production as a factor that is key to the formation of spatial structures of microorganisms. Additionally, we discuss data on microbial groups including nitrifiers, denitrifiers, Anammox bacteria, and phosphate- and glycogen-accumulating bacteria in full-scale aerobic systems that was obtained with the use of molecular techniques, including high-throughput sequencing, to shed light on dependencies between the microbial ecology of biomass and the overall efficiency and functional stability of wastewater treatment systems. Sludge bulking in WWTPs is addressed, as well as the microbial composition of consortia involved in antibiotic and micropollutant removal. PMID:26931606

  5. MLST revisited: the gene-by-gene approach to bacterial genomics

    PubMed Central

    Maiden, Martin C. J.; Jansen van Rensburg, Melissa J.; Bray, James E.; Earle, Sarah G.; Ford, Suzanne A.; Jolley, Keith A.; McCarthy, Noel D.

    2014-01-01

    Multilocus sequence typing (MLST) was proposed in 1998 as a portable sequence-based method for identifying clonal relationships among bacteria. Today, in the whole-genome era of microbiology, the need for systematic, standardized descriptions of bacterial genotypic variation remains a priority. Here, to meet this need, we draw on the successes of MLST and 16S rRNA gene sequencing to propose a hierarchical gene-by-gene approach that reflects functional and evolutionary relationships and catalogues bacteria ‘from domain to strain’. Our gene-based typing approach using online platforms such as the Bacterial Isolate Genome Sequence Database (BIGSdb) allows the scalable organization and analysis of whole-genome sequence data. PMID:23979428

  6. PREDetector: a new tool to identify regulatory elements in bacterial genomes.

    PubMed

    Hiard, Samuel; Marée, Raphaël; Colson, Séverine; Hoskisson, Paul A; Titgemeyer, Fritz; van Wezel, Gilles P; Joris, Bernard; Wehenkel, Louis; Rigali, Sébastien

    2007-06-15

    In the post-genomic era, the prediction of transcription factor regulons by position weight matrix-based programmes is a powerful approach to decipher biological pathways and to model regulatory networks in bacteria. Once a regulon prediction is available, the main difficulty is to estimate its reliability before starting expensive experimental validations, that is, to distinguish true positive hits within an endless list of potential target genes of a regulatory protein. Here we introduce PREDetector (Prokaryotic Regulatory Elements Detector), a tool developed for predicting regulons of DNA-binding proteins in bacterial genomes. Besides the automatic prediction, scoring and positioning of potential binding sites and their respective target genes in annotated bacterial genomes, it also provides an easy way to estimate the thresholds at which reliable new candidate target genes can be found. PREDetector can be downloaded freely at http://www.montefiore.ulg.ac.be/~hiard/PreDetector/PreDetector.php. PMID:17451648
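
    Position weight matrix scanning with a score threshold can be sketched as below. This is a generic log-odds formulation offered for illustration only; the abstract does not specify PREDetector's scoring scheme, and the function names, the pseudocount, the uniform background frequency and the example sites are all assumptions.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix from aligned binding sites of equal length.

    Each column maps a base to log2(frequency / background), with a
    pseudocount so that unseen bases get a finite negative score.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        total = len(column) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(pwm, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    known_sites = ["TGTGA", "TGTGA", "TGAGA", "TGTGA"]  # made-up training sites
    pwm = build_pwm(known_sites)
    print(scan(pwm, "AAATGTGACCC", threshold=5.0))
```

    Raising the threshold trades sensitivity for reliability: fewer windows pass, but those that do resemble the known sites more closely, which is the trade-off a threshold estimate is meant to inform.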

  7. Bacterial genome reduction using the progressive clustering of deletions via yeast sexual cycling

    PubMed Central

    Assad-Garcia, Nacyra; Kostylev, Maxim; Noskov, Vladimir N.; Wise, Kim S.; Karas, Bogumil J.; Stam, Jason; Montague, Michael G.; Hanly, Timothy J.; Enriquez, Nico J.; Ramon, Adi; Goldgof, Gregory M.; Richter, R. Alexander; Vashee, Sanjay; Chuang, Ray-Yuan; Winzeler, Elizabeth A.; Hutchison, Clyde A.; Gibson, Daniel G.; Smith, Hamilton O.; Glass, John I.; Venter, J. Craig

    2015-01-01

    The availability of genetically tractable organisms with simple genomes is critical for the rapid, systems-level understanding of basic biological processes. Mycoplasma bacteria, with the smallest known genomes among free-living cellular organisms, are ideal models for this purpose, but the natural versions of these cells have genome complexities still too great to offer a comprehensive view of a fundamental life form. Here we describe an efficient method for reducing genomes from these organisms by identifying individually deletable regions using transposon mutagenesis and progressively clustering deleted genomic segments using meiotic recombination between the bacterial genomes harbored in yeast. Mycoplasmal genomes subjected to this process and transplanted into recipient cells yielded two mycoplasma strains. The first simultaneously lacked eight singly deletable regions of the genome, representing a total of 91 genes and ∼10% of the original genome. The second strain lacked seven of the eight regions, representing 84 genes. Growth assay data revealed an absence of genetic interactions among the 91 genes under tested conditions. Despite predicted effects of the deletions on sugar metabolism and the proteome, growth rates were unaffected by the gene deletions in the seven-deletion strain. These results support the feasibility of using single-gene disruption data to design and construct viable genomes lacking multiple genes, paving the way toward genome minimization. The progressive clustering method is expected to be effective for the reorganization of any mega-sized DNA molecules cloned in yeast, facilitating the construction of designer genomes in microbes as well as genomic fragments for genetic engineering of higher eukaryotes. PMID:25654978

  8. Genome-scale resources for Thermoanaerobacterium saccharolyticum

    DOE PAGESBeta

    Currie, Devin H.; Raman, Babu; Gowen, Christopher M.; Tschaplinski, Timothy J.; Land, Miriam L.; Brown, Steven D.; Covalla, Sean; Klingeman, Dawn Marie; Yang, Zamin Koo; Engle, Nancy L.; et al

    2015-06-26

    Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. A major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources was recognized early on as a key limitation.

  9. Compiling Multicopy Single-Stranded DNA Sequences from Bacterial Genome Sequences

    PubMed Central

    Yoo, Wonseok; Lim, Dongbin

    2016-01-01

    A retron is a bacterial retroelement that encodes an RNA gene and a reverse transcriptase (RT). The former, once transcribed, works as a template primer for reverse transcription by the latter. The resulting DNA is covalently linked to the upstream part of the RNA; this chimera is called multicopy single-stranded DNA (msDNA), which is extrachromosomal DNA found in many bacterial species. Based on the conserved features in the eight known msDNA sequences, we developed a detection method and applied it to scan National Center for Biotechnology Information (NCBI) RefSeq bacterial genome sequences. Among 16,844 bacterial sequences possessing a retron-type RT domain, we identified 48 unique types of msDNA. Currently, the biological role of msDNA is not well understood. Our work will be a useful tool in studying the distribution, evolution, and physiological role of msDNA. PMID:27103888

  10. The OME Framework for genome-scale systems biology

    SciTech Connect

    Palsson, Bernhard O.; Ebrahim, Ali; Federowicz, Steve

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals. While many software tools exist for handling genome-scale...

  11. Identifying Recent Adaptations in Large-scale Genomic Data

    PubMed Central

    Grossman, Sharon R.; Andersen, Kristian G.; Shlyakhter, Ilya; Tabrizi, Shervin; Winnicki, Sarah; Yen, Angela; Park, Daniel J.; Griesemer, Dustin; Karlsson, Elinor K.; Wong, Sunny H.; Cabili, Moran; Adegbola, Richard A.; Bamezai, Rameshwar N. K.; Hill, Adrian V. S.; Vannberg, Fredrik O.; Rinn, John L.; Lander, Eric S.; Schaffner, Stephen F.; Sabeti, Pardis C.

    2013-01-01

    SUMMARY While several hundred regions of the human genome harbor signals of positive natural selection, few of the relevant adaptive traits and variants have been elucidated. Using full-genome sequence variation from the 1000 Genomes Project (1000G) and the Composite of Multiple Signals (CMS) test, we investigated 412 candidate signals and leveraged functional annotation, protein structure modeling, epigenetics, and association studies to identify and extensively annotate candidate causal variants. The resulting catalog provides a tractable list for experimental follow-up; it includes thirty-five high-scoring non-synonymous variants, fifty-nine variants associated with expression levels of a nearby coding gene or lincRNA, and numerous variants associated with susceptibility to infectious disease and other phenotypes. We experimentally characterized one candidate non-synonymous variant in TLR5, and show that it leads to altered NF-κB signaling in response to bacterial flagellin. PMID:23415221

  12. (Actino)Bacterial "intelligence": using comparative genomics to unravel the information processing capacities of microbes.

    PubMed

    Pinto, Daniela; Mascher, Thorsten

    2016-08-01

    Bacterial genomes encode numerous and often sophisticated signaling devices to perceive changes in their environment and mount appropriate adaptive responses. With their help, microbes are able to orchestrate specific decision-making processes that alter the cellular behavior, but also integrate and communicate information. Moreover, some signal transducing systems also enable bacteria to remember and learn from previous stimuli to anticipate environmental changes. As recently suggested, all of these aspects indicate that bacteria do, in fact, exhibit cognition remarkably reminiscent of what we refer to as intelligent behavior, at least when applied to higher eukaryotes. In this essay, comprehensive data derived from comparative genomics analyses of microbial signal transduction systems are used to probe the concept of cognition in bacterial cells. Using a recent comprehensive analysis of over 100 actinobacterial genomes as a test case, we illustrate the different layers of the capacities of bacteria that result in cognitive and behavioral complexity as well as some form of 'bacterial intelligence'. We try to raise awareness of bacteria as cognitive organisms and believe that this view would enrich and open a new path in the experimental studies of bacterial signal transducing systems. PMID:26852121

  13. Large-scale structure of genomic methylation patterns.

    PubMed

    Rollins, Robert A; Haghighi, Fatemeh; Edwards, John R; Das, Rajdeep; Zhang, Michael Q; Ju, Jingyue; Bestor, Timothy H

    2006-02-01

    The mammalian genome depends on patterns of methylated cytosines for normal function, but the relationship between genomic methylation patterns and the underlying sequence is unclear. We have characterized the methylation landscape of the human genome by global analysis of patterns of CpG depletion and by direct sequencing of 3073 unmethylated domains and 2565 methylated domains from human brain DNA. The genome was found to consist of short (<4 kb) unmethylated domains embedded in a matrix of long methylated domains. Unmethylated domains were enriched in promoters, CpG islands, and first exons, while methylated domains comprised interspersed and tandem-repeated sequences, exons other than first exons, and non-annotated single-copy sequences that are depleted in the CpG dinucleotide. The enrichment of regulatory sequences in the relatively small unmethylated compartment suggests that cytosine methylation constrains the effective size of the genome through the selective exposure of regulatory sequences. This buffers regulatory networks against changes in total genome size and provides an explanation for the C value paradox, which concerns the wide variations in genome size that scale independently of gene number. This suggestion is compatible with the finding that cytosine methylation is universal among large-genome eukaryotes, while many eukaryotes with genome sizes <5 × 10^8 bp do not methylate their DNA. PMID:16365381
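
    CpG depletion is commonly quantified as the ratio of observed to expected CpG dinucleotides. The sketch below computes that ratio for a single sequence; it is an assumption that this statistic resembles the one used in the study, since the abstract does not say how depletion was measured, and the function name is made up for the example.

```python
def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count).

    A ratio well below 1 indicates CpG depletion, while values near or
    above 1 are typical of CpG islands. Returns 0.0 when the sequence
    has no C or no G, since the expected count is then zero.
    """
    seq = sequence.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    for seq in ("CGCGCGCG", "CATGCATG"):
        print(seq, cpg_observed_expected(seq))
```

    In practice the ratio would be computed over sliding windows along a chromosome rather than over one short string, so that depleted and non-depleted domains can be told apart.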

  14. Genomic analyses of bacterial porin-cytochrome gene clusters

    DOE PAGESBeta

    Shi, Liang; Fredrickson, James K.; Zachara, John M.

    2014-11-26

    The porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of the genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with substrates other than Fe(III) and Mn(IV) oxides.

  15. The genome sequence of Xanthomonas oryzae pathovar oryzae KACC10331, the bacterial blight pathogen of rice

    PubMed Central

    Lee, Byoung-Moo; Park, Young-Jin; Park, Dong-Suk; Kang, Hee-Wan; Kim, Jeong-Gu; Song, Eun-Sung; Park, In-Cheol; Yoon, Ung-Han; Hahn, Jang-Ho; Koo, Bon-Sung; Lee, Gil-Bok; Kim, Hyungtae; Park, Hyun-Seok; Yoon, Kyong-Oh; Kim, Jeong-Hyun; Jung, Chol-hee; Koh, Nae-Hyung; Seo, Jeong-Sun; Go, Seung-Joo

    2005-01-01

    The nucleotide sequence was determined for the genome of Xanthomonas oryzae pathovar oryzae (Xoo) KACC10331, a bacterium that causes bacterial blight in rice (Oryza sativa L.). The genome is comprised of a single, 4 941 439 bp, circular chromosome that is G + C rich (63.7%). The genome includes 4637 open reading frames (ORFs) of which 3340 (72.0%) could be assigned putative function. Orthologs for 80% of the predicted Xoo genes were found in the previously reported X. axonopodis pv. citri (Xac) and X. campestris pv. campestris (Xcc) genomes, but 245 genes apparently specific to Xoo were identified. Xoo genes likely to be associated with pathogenesis include eight with similarity to Xanthomonas avirulence (avr) genes, a set of hypersensitive reaction and pathogenicity (hrp) genes, genes for exopolysaccharide production, and genes encoding extracellular plant cell wall-degrading enzymes. The presence of these genes provides insights into the interactions of this pathogen with its gramineous host. PMID:15673718

  16. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    PubMed Central

    Speth, Daan R.; in 't Zandt, Michiel H.; Guerrero-Cruz, Simon; Dutilh, Bas E.; Jetten, Mike S. M.

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete draft genomes that together represent the majority of the microbial community. We assign these genomes to distinct anaerobic and aerobic microbial communities. In the aerobic community, nitrifying organisms and heterotrophs predominate. In the anaerobic community, widespread potential for partial denitrification suggests a nitrite loop increases treatment efficiency. Of our genomes, 19 have no previously cultivated or sequenced close relatives and six belong to bacterial phyla without any cultivated members, including the most complete Omnitrophica (formerly OP3) genome to date. PMID:27029554

  17. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system.

    PubMed

    Speth, Daan R; In 't Zandt, Michiel H; Guerrero-Cruz, Simon; Dutilh, Bas E; Jetten, Mike S M

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete draft genomes that together represent the majority of the microbial community. We assign these genomes to distinct anaerobic and aerobic microbial communities. In the aerobic community, nitrifying organisms and heterotrophs predominate. In the anaerobic community, widespread potential for partial denitrification suggests a nitrite loop increases treatment efficiency. Of our genomes, 19 have no previously cultivated or sequenced close relatives and six belong to bacterial phyla without any cultivated members, including the most complete Omnitrophica (formerly OP3) genome to date. PMID:27029554

  18. BG7: a new approach for bacterial genome annotation designed for next generation sequencing data.

    PubMed

    Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Pareja, Eduardo; Tobes, Raquel

    2012-01-01

    BG7 is a new system for de novo bacterial, archaeal and viral genome annotation based on a new approach specifically designed for annotating genomes sequenced with next generation sequencing technologies. The system is versatile and able to annotate genes even at the preliminary assembly stage of the genome. It is especially efficient at detecting unexpected genes horizontally acquired from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements. From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function based on protein similarity with a wide set of reference proteins, integrating the ORF prediction and functional annotation phases in a single step. BG7 is especially tolerant to sequencing errors in start and stop codons, to frameshifts, and to assembly or scaffolding errors. The system is also tolerant to the high level of gene fragmentation that is frequently found in incompletely assembled genomes. The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes that are being sequenced with NGS technologies. Its capabilities and efficiency were demonstrated in the 2011 EHEC Germany outbreak, in which BG7 was used to obtain the first annotations the day after the first entero-hemorrhagic E. coli genome sequences were made publicly available. The suitability of BG7 for genome annotation has been demonstrated for Illumina, 454, Ion Torrent, and PacBio sequencing technologies. Moreover, thanks to its plasticity, our system could very easily be adapted to work with new technologies in the future. PMID:23185310

  19. Comprehensive prediction of chromosome dimer resolution sites in bacterial genomes

    PubMed Central

    2011-01-01

    Background During the replication process of bacteria with circular chromosomes, an odd number of homologous recombination events results in concatenated dimer chromosomes that cannot be partitioned into daughter cells. However, many bacteria harbor a conserved dimer resolution machinery consisting of one or two tyrosine recombinases, XerC and XerD, and their 28-bp target site, dif. Results To study the evolution of the dif/XerCD system and its relationship with replication termination, we report the comprehensive prediction of dif sequences in silico using a phylogenetic prediction approach based on iterated hidden Markov modeling. Using this method, dif sites were identified in 641 organisms among 16 phyla, with a 97.64% identification rate for single-chromosome strains. The dif sequence positions were shown to be strongly correlated with the GC skew shift-point that is induced by replicational mutation/selection pressures, but the difference in the positions of the predicted dif sites and the GC skew shift-points did not correlate with the degree of replicational mutation/selection pressures. Conclusions The sequence of dif sites is widely conserved among many bacterial phyla, and they can be computationally identified using our method. The lack of correlation between dif position and the degree of GC skew suggests that replication termination does not occur strictly at dif sites. PMID:21223577
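The GC skew shift-point mentioned in this abstract can be illustrated with a cumulative GC skew curve. The following is a minimal sketch, not the authors' method: the function names, the window size, and the toy sequence are assumptions made for illustration. It sums the per-window value (G - C) / (G + C) along the sequence and reports where the running total reaches its minimum and maximum, which are commonly read as approximate locations of the replication origin and terminus.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return the running sum of (G - C) / (G + C), one value per window."""
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        if g + c:  # skip windows that contain no G or C at all
            total += (g - c) / (g + c)
        curve.append(total)
    return curve


def shift_points(curve, window=1000):
    """Return the end coordinates of the windows holding the curve's min and max."""
    i_min = min(range(len(curve)), key=curve.__getitem__)
    i_max = max(range(len(curve)), key=curve.__getitem__)
    return (i_min + 1) * window, (i_max + 1) * window


# Toy sequence (hypothetical): G-rich first half, C-rich second half.
toy = "G" * 50 + "C" * 50
curve = cumulative_gc_skew(toy, window=10)
print(shift_points(curve, window=10))
```

On a real chromosome the curve is noisy, so a larger window and some smoothing would likely be needed before comparing the extremes with predicted dif positions.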

  20. Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss

    PubMed Central

    2010-01-01

    Background The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with regard to definition of pathogenic strains. Results To better understand genome evolution and evolution of virulence characteristics in Listeria, we used a next generation sequencing approach to generate draft genomes for seven strains representing Listeria species or clades for which genome sequences were not available. Comparative analyses of these draft genomes and six publicly available genomes, which together represent the main Listeria species, showed evidence for (i) a pangenome with 2,032 core and 2,918 accessory genes identified to date, (ii) a critical role of gene loss events in transition of Listeria species from facultative pathogen to saprotroph, even though a consistent pattern of gene loss seemed to be absent, and a number of isolates representing non-pathogenic species still carried some virulence associated genes, and (iii) divergence of modern pathogenic and non-pathogenic Listeria species and strains, most likely circa 47 million years ago, from a pathogenic common ancestor that contained key virulence genes. Conclusions Genome evolution in Listeria involved limited gene loss and acquisition as supported by (i) a relatively high coverage of the predicted pan-genome by the observed pan-genome, (ii) conserved genome size (between 2.8 and 3.2 Mb), and (iii) a highly syntenic genome. Limited gene loss in Listeria did include loss of virulence associated genes, likely associated with multiple transitions to a saprotrophic lifestyle. 
The genus Listeria thus provides

  1. Genomic analyses of bacterial porin-cytochrome gene clusters

    SciTech Connect

    Shi, Liang; Fredrickson, James K.; Zachara, John M.

    2014-11-26

    The porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c-type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of the genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular

  2. Genomic comparisons between paired bacterial strains with strong and weak GC skews.

    PubMed

    Song, Tie-Jun; Wang, Yue; Shen, Jian-Gen; Pan, Jian-Ping; Huang, Jun

    2014-02-01

    A majority of known eubacterial genomes are characterized by GC skew, i.e., the leading strand has an excess of G over C. The cause of this compositional bias is still not very clear. In this study, we chose five pairs of genomes from distantly related bacterial genera, i.e., Buchnera, Haemophilus, Mycoplasma, Mycobacterium, and Synechococcus, each pair containing one genome with strong GC skew and one with weak GC skew. Through comparison of the orthologous genes in these genera, we found that neither chromosomal rearrangement nor CDS skew has a direct relationship with GC skew. PMID:23457112

  3. Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage

    PubMed Central

    Dupont, Chris L; Rusch, Douglas B; Yooseph, Shibu; Lombardo, Mary-Jane; Alexander Richter, R; Valas, Ruben; Novotny, Mark; Yee-Greenbaum, Joyclyn; Selengut, Jeremy D; Haft, Dan H; Halpern, Aaron L; Lasken, Roger S; Nealson, Kenneth; Friedman, Robert; Craig Venter, J

    2012-01-01

    Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25–1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously-derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of tonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition. PMID:22170421

  4. Altered tRNA characteristics and 3' maturation in bacterial symbionts with reduced genomes.

    PubMed

    Hansen, Allison K; Moran, Nancy A

    2012-09-01

    Translational efficiency is controlled by tRNAs and other genome-encoded mechanisms. In organelles, translational processes are dramatically altered because of genome shrinkage and horizontal acquisition of gene products. The influence of genome reduction on translation in endosymbionts is largely unknown. Here, we investigate whether divergent lineages of Buchnera aphidicola, the reduced-genome bacterial endosymbiont of aphids, possess altered translational features compared with their free-living relative, Escherichia coli. Our RNAseq data support the hypothesis that translation is less optimal in Buchnera than in E. coli. We observed a specific, convergent, pattern of tRNA loss in Buchnera and other endosymbionts that have undergone genome shrinkage. Furthermore, many modified nucleoside pathways that are important for E. coli translation are lost in Buchnera. Additionally, Buchnera's A + T compositional bias has resulted in reduced tRNA thermostability, and may have altered aminoacyl-tRNA synthetase recognition sites. Buchnera tRNA genes are shorter than those of E. coli, as the majority no longer has a genome-encoded 3' CCA; however, all the expressed, shortened tRNAs undergo 3' CCA maturation. Moreover, expression of tRNA isoacceptors was not correlated with the usage of corresponding codons. Overall, our data suggest that endosymbiont genome evolution alters tRNA characteristics that are known to influence translational efficiency in their free-living relative. PMID:22689638

  5. Diversification of bacterial genome content through distinct mechanisms over different timescales

    PubMed Central

    Croucher, Nicholas J.; Coupland, Paul G.; Stevenson, Abbie E.; Callendrello, Alanna; Bentley, Stephen D.; Hanage, William P.

    2014-01-01

    Bacterial populations often consist of multiple co-circulating lineages. Determining how such population structures arise requires understanding what drives bacterial diversification. Using 616 systematically sampled genomes, we show that Streptococcus pneumoniae lineages are typically characterized by combinations of infrequently transferred stable genomic islands: those moving primarily through transformation, along with integrative and conjugative elements and phage-related chromosomal islands. The only lineage containing extensive unique sequence corresponds to a set of atypical unencapsulated isolates that may represent a distinct species. However, prophage content is highly variable even within lineages, suggesting frequent horizontal transmission that would necessitate rapidly diversifying anti-phage mechanisms to prevent these viruses sweeping through populations. Correspondingly, two loci encoding Type I restriction-modification systems able to change their specificity over short timescales through intragenomic recombination are ubiquitous across the collection. Hence short-term pneumococcal variation is characterized by movement of phage and intragenomic rearrangements, with the slower transfer of stable loci distinguishing lineages. PMID:25407023

  6. GRAT--genome-scale rapid alignment tool.

    PubMed

    Kindlund, Ellen; Tammi, Martti T; Arner, Erik; Nilsson, Daniel; Andersson, Björn

    2007-04-01

    Modern alignment methods designed to work rapidly and efficiently with large datasets often do so at the cost of method sensitivity. To overcome this, we have developed a novel alignment program, GRAT, built to accurately align short, highly similar DNA sequences. The program runs rapidly and requires no more memory and CPU power than a desktop computer. In addition, specificity is ensured by statistically separating the true alignments from spurious matches using phred quality values. An efficient separation is especially important when searching large datasets and whenever there are repeats present in the dataset. Results are superior in comparison to widely used existing software, and analysis of two large genomic datasets show the usefulness and scalability of the algorithm. PMID:17292508

  7. The CRISPR-Cas system - from bacterial immunity to genome engineering.

    PubMed

    Czarnek, Maria; Bereta, Joanna

    2016-01-01

    Precise and efficient genome modifications present a great value in attempts to comprehend the roles of particular genes and other genetic elements in biological processes as well as in various pathologies. In recent years novel methods of genome modification known as genome editing, which utilize so-called "programmable" nucleases, came into use. A true revolution in genome editing has been brought about by the introduction of the CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) system, in which one of such nucleases, i.e. Cas9, plays a major role. This system is based on the elements of the bacterial and archaeal mechanism responsible for acquired immunity against phage infections and transfer of foreign genetic material. Microorganisms incorporate fragments of foreign DNA into CRISPR loci present in their genomes, which enables fast recognition and elimination of future infections. There are several types of CRISPR-Cas systems among prokaryotes but only elements of CRISPR type II are employed in genome engineering. CRISPR-Cas type II utilizes small RNA molecules (crRNA and tracrRNA) to precisely direct the effector nuclease - Cas9 - to a specific site in the genome, i.e. to the sequence complementary to crRNA. Cas9 may be used to: (i) introduce stable changes into genomes e.g. in the process of generation of knock-out and knock-in animals and cell lines, (ii) activate or silence the expression of a gene of interest, and (iii) visualize specific sites in genomes of living cells. The CRISPR-Cas-based tools have been successfully employed for generation of animal and cell models of a number of diseases, e.g. specific types of cancer. In the future, the genome editing by programmable nucleases may find wide application in medicine e.g. in the therapies of certain diseases of genetic origin and in the therapy of HIV-infected patients. PMID:27594566

  8. GC-Content Evolution in Bacterial Genomes: The Biased Gene Conversion Hypothesis Expands

    PubMed Central

    Lassalle, Florent; Périan, Séverine; Bataillon, Thomas; Nesme, Xavier; Duret, Laurent; Daubin, Vincent

    2015-01-01

    The characterization of functional elements in genomes relies on the identification of the footprints of natural selection. In this quest, taking into account neutral evolutionary processes such as mutation and genetic drift is crucial because these forces can generate patterns that may obscure or mimic signatures of selection. In mammals, and probably in many eukaryotes, another such confounding factor called GC-Biased Gene Conversion (gBGC) has been documented. This mechanism generates patterns identical to what is expected under selection for higher GC-content, specifically in highly recombining genomic regions. Recent results have suggested that a mysterious selective force favouring higher GC-content exists in Bacteria but the possibility that it could be gBGC has been excluded. Here, we show that gBGC is probably at work in most if not all bacterial species. First we find a consistent positive relationship between the GC-content of a gene and evidence of intra-genic recombination throughout a broad spectrum of bacterial clades. Second, we show that the evolutionary force responsible for this pattern is acting independently from selection on codon usage, and could potentially interfere with selection in favor of optimal AU-ending codons. A comparison with data from human populations shows that the intensity of gBGC in Bacteria is comparable to what has been reported in mammals. We propose that gBGC is not restricted to sexual Eukaryotes but also widespread among Bacteria and could therefore be an ancestral feature of cellular organisms. We argue that if gBGC occurs in bacteria, it can account for previously unexplained observations, such as the apparent non-equilibrium of base substitution patterns and the heterogeneity of gene composition within bacterial genomes. Because gBGC produces patterns similar to positive selection, it is essential to take this process into account when studying the evolutionary forces at work in bacterial genomes. PMID:25659072

  9. Physical descriptions of the bacterial nucleoid at large scales, and their biological implications

    NASA Astrophysics Data System (ADS)

    Benza, Vincenzo G.; Bassetti, Bruno; Dorfman, Kevin D.; Scolari, Vittore F.; Bromek, Krystyna; Cicuta, Pietro; Cosentino Lagomarsino, Marco

    2012-07-01

    Recent experimental and theoretical approaches have attempted to quantify the physical organization (compaction and geometry) of the bacterial chromosome with its complement of proteins (the nucleoid). The genomic DNA exists in a complex and dynamic protein-rich state, which is highly organized at various length scales. This has implications for modulating (when not directly enabling) the core biological processes of replication, transcription and segregation. We overview the progress in this area, driven in the last few years by new scientific ideas and new interdisciplinary experimental techniques, ranging from high space- and time-resolution microscopy to high-throughput genomics employing sequencing to map different aspects of the nucleoid-related interactome. The aim of this review is to present the wide spectrum of experimental and theoretical findings coherently, from a physics viewpoint. In particular, we highlight the role that statistical and soft condensed matter physics play in describing this system of fundamental biological importance, specifically reviewing classic and more modern tools from the theory of polymers. We also discuss some attempts toward unifying interpretations of the current results, pointing to possible directions for future investigation.

  10. Comparative Analysis of Lacinutrix Genomes and Their Association with Bacterial Habitat

    PubMed Central

    Lee, Yung Mi; Kim, Mi-Kyeong; Ahn, Do Hwan; Kim, Han-Woo; Park, Hyun; Shin, Seung Chul

    2016-01-01

    The genus Lacinutrix, which belongs to the family Flavobacteriaceae, consists of seven bacterial species that were mainly isolated from marine life and sediments. As most bacteria in the family Flavobacteriaceae favor aerobic conditions, the seven bacterial species in the genus Lacinutrix also showed aerobic growth. We selected four monophyletic bacterial species living in a polar environment. Two of these species were isolated from sediment and two types were isolated from algae. In a comparative analysis, we investigated how these different environments were related to genomic features of these four species in the genus Lacinutrix. We found that the gene sets for glycolysis, the Krebs cycle, and oxidative phosphorylation were conserved in these four type strains. However, the presence of nitrous oxide reductase for denitrification and the absence of essential components related to thiamin biosynthesis for aerobic respiration were only found in isolates from sediment. Elevated bacterial metabolism on the surface of marine sediments might limit the oxygen penetration into sediment, and such an environment might affect the genomes of bacteria isolated from these habitats. PMID:26882010

  11. Comparative Analysis of Lacinutrix Genomes and Their Association with Bacterial Habitat.

    PubMed

    Lee, Yung Mi; Kim, Mi-Kyeong; Ahn, Do Hwan; Kim, Han-Woo; Park, Hyun; Shin, Seung Chul

    2016-01-01

    The genus Lacinutrix, which belongs to the family Flavobacteriaceae, consists of seven bacterial species that were mainly isolated from marine life and sediments. As most bacteria in the family Flavobacteriaceae favor aerobic conditions, the seven bacterial species in the genus Lacinutrix also showed aerobic growth. We selected four monophyletic bacterial species living in a polar environment. Two of these species were isolated from sediment and two types were isolated from algae. In a comparative analysis, we investigated how these different environments were related to genomic features of these four species in the genus Lacinutrix. We found that the gene sets for glycolysis, the Krebs cycle, and oxidative phosphorylation were conserved in these four type strains. However, the presence of nitrous oxide reductase for denitrification and the absence of essential components related to thiamin biosynthesis for aerobic respiration were only found in isolates from sediment. Elevated bacterial metabolism on the surface of marine sediments might limit the oxygen penetration into sediment, and such an environment might affect the genomes of bacteria isolated from these habitats. PMID:26882010

  12. Evaluation of genome-enabled selection for bacterial cold water disease resistance using progeny performance data in Rainbow Trout: Insights on genotyping methods and genomic prediction models

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Bacterial cold water disease (BCWD) causes significant economic losses in salmonid aquaculture, and traditional family-based breeding programs aimed at improving BCWD resistance have been limited to exploiting only between-family variation. We used genomic selection (GS) models to predict genomic br...

  13. Territorial Polymers and Large Scale Genome Organization

    NASA Astrophysics Data System (ADS)

    Grosberg, Alexander

    2012-02-01

    Chromatin fiber in the interphase nucleus is effectively a very long polymer packed in a restricted volume. Although polymer models of chromatin organization have been considered, most of them disregard the fact that DNA must remain relatively unentangled in order to function properly. One polymer model with no entanglements is the melt of unknotted unconcatenated rings. Extensive simulations indicate that rings in the melt at large length (monomer number) N approach the compact state, with gyration radius scaling as N^(1/3), suggesting that every ring is compact and segregated from the surrounding rings. The segregation is consistent with the known phenomenon of chromosome territories. The surface exponent β (describing the number of contacts between neighboring rings, which scales as N^β) appears only slightly below unity, β ≈ 0.95. This suggests that the loop factor (the probability that two monomers a linear distance s apart meet) should decay as s^(-γ), where γ = 2 - β is slightly above one. The latter result is consistent with Hi-C data on real human interphase chromosomes, and does not contradict the older FISH data. The dynamics of rings in the melt indicates that the motion of one ring remains subdiffusive on time scales well above the stress relaxation time.
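The scaling relations stated in this abstract can be written out compactly. The block below only restates the quoted relations and carries out the arithmetic for the loop-factor exponent. The symbols R_g (gyration radius), n_c (number of contacts between neighboring rings) and P(s) (loop factor) are notation introduced here for illustration and do not appear in the abstract.

```latex
\begin{align}
  R_g &\sim N^{1/3}, \\
  n_c &\sim N^{\beta}, \qquad \beta \approx 0.95, \\
  P(s) &\sim s^{-\gamma}, \qquad \gamma = 2 - \beta \approx 2 - 0.95 = 1.05 .
\end{align}
```

With β slightly below one, γ comes out slightly above one, which is the value the abstract compares with Hi-C data.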

  14. Roary: rapid large-scale prokaryote pan genome analysis

    PubMed Central

    Page, Andrew J.; Cummins, Carla A.; Hunt, Martin; Wong, Vanessa K.; Reuter, Sandra; Holden, Matthew T.G.; Fookes, Maria; Falush, Daniel; Keane, Jacqueline A.; Parkhill, Julian

    2015-01-01

    Summary: A typical prokaryote population sequencing study can now consist of hundreds or thousands of isolates. Interrogating these datasets can provide detailed insights into the genetic structure of prokaryotic genomes. We introduce Roary, a tool that rapidly builds large-scale pan genomes, identifying the core and accessory genes. Roary makes construction of the pan genome of thousands of prokaryote samples possible on a standard desktop without compromising on the accuracy of results. Using a single CPU Roary can produce a pan genome consisting of 1000 isolates in 4.5 hours using 13 GB of RAM, with further speedups possible using multiple processors. Availability and implementation: Roary is implemented in Perl and is freely available under an open source GPLv3 license from http://sanger-pathogens.github.io/Roary Contact: roary@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26198102
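The distinction between core and accessory genes can be illustrated with a small presence/absence table. This is a minimal sketch and not Roary's implementation: the function name, the `core_fraction` threshold, and the toy gene and isolate labels are assumptions chosen for illustration. A gene is called core if the fraction of isolates carrying it reaches the threshold, and accessory otherwise.

```python
def split_pan_genome(presence, core_fraction=0.99):
    """Split genes into core and accessory lists.

    presence maps a gene name to the set of isolates that carry it.
    core_fraction is an assumed cutoff, not a value taken from the abstract.
    """
    # The isolate collection is taken to be every isolate seen in any gene's set.
    isolates = set().union(*presence.values())
    n = len(isolates)
    core, accessory = [], []
    for gene, carriers in presence.items():
        if n and len(carriers) / n >= core_fraction:
            core.append(gene)
        else:
            accessory.append(gene)
    return sorted(core), sorted(accessory)


# Toy data (hypothetical): three isolates, three gene clusters.
toy = {
    "geneA": {"s1", "s2", "s3"},
    "geneB": {"s1", "s2", "s3"},
    "geneC": {"s2"},
}
core, accessory = split_pan_genome(toy)
print(core, accessory)
```

In practice the expensive step is clustering the predicted proteins of all isolates into gene families. The split shown here is only the final bookkeeping once such a table exists.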

  15. Increasing the efficiency of bacterial transcription simulations: When to exclude the genome without loss of accuracy

    PubMed Central

    Iafolla, Marco AJ; Dong, Guang Qiang; McMillen, David R

    2008-01-01

    Background Simulating the major molecular events inside an Escherichia coli cell can lead to a very large number of reactions that compose its overall behaviour. Not only should the model be accurate, but it is imperative for the experimenter to create an efficient model to obtain the results in a timely fashion. Here, we show that for many parameter regimes, the effect of the host cell genome on the transcription of a gene from a plasmid-borne promoter is negligible, allowing one to simulate the system more efficiently by removing the computational load associated with representing the presence of the rest of the genome. The key parameter is the on-rate of RNAP binding to the promoter (k_on), and we compare the total number of transcripts produced from a plasmid vector generated as a function of this rate constant, for two versions of our gene expression model, one incorporating the host cell genome and one excluding it. By sweeping parameters, we identify the k_on range for which the difference between the genome and no-genome models drops below 5%, over a wide range of doubling times, mRNA degradation rates, plasmid copy numbers, and gene lengths. Results We assess the effect of simulating the presence of the genome over a four-dimensional parameter space, considering: 24 min <= bacterial doubling time <= 100 min; 10 <= plasmid copy number <= 1000; 2 min <= mRNA half-life <= 14 min; and 10 bp <= gene length <= 10000 bp. A simple MATLAB user interface generates an interpolated k_on threshold for any point in this range; this rate can be compared to the ones used in other transcription studies to assess the need for including the genome. Conclusion Exclusion of the genome is shown to yield less than 5% difference in transcript numbers over wide ranges of values, and computational speed is improved by two to 24 times by excluding explicit representation of the genome. PMID:18789148

  16. Bacterial Genomic Data Analysis in the Next-Generation Sequencing Era.

    PubMed

    Orsini, Massimiliano; Cuccuru, Gianmauro; Uva, Paolo; Fotia, Giorgio

    2016-01-01

    Bacterial genome sequencing is now an affordable choice for many laboratories for applications in research, diagnostic, and clinical microbiology. Nowadays, an overabundance of tools is available for genomic data analysis. However, tools differ in algorithms, languages, hardware requirements, and user interfaces, and combining them, as is necessary for sequence data interpretation, often requires (bio)informatics skills that can be difficult to find in many laboratories. In addition, multiple data sources, exceedingly large dataset sizes, and increasing computational complexity further challenge the accessibility, reproducibility, and transparency of the entire process. In this chapter we will cover the main bioinformatics steps required for a complete bacterial genome analysis using next-generation sequencing data, from the raw sequence data to assembled and annotated genomes. All the tools described are available in the Orione framework (http://orione.crs4.it), which uniquely combines in a transparent way the most used open source bioinformatics tools for microbiology, allowing microbiologists without any specific hardware or informatics skills to conduct data-intensive computational analyses from quality control to microbial gene annotation. PMID:27115645

  17. Differential annotation of tRNA genes with anticodon CAT in bacterial genomes

    PubMed Central

    Silva, Francisco J.; Belda, Eugeni; Talens, Santiago E.

    2006-01-01

    We have developed three strategies to discriminate among the three types of tRNA genes with anticodon CAT (tRNAIle, elongator tRNAMet and initiator tRNAfMet) in bacterial genomes. With these strategies, we have classified the tRNA genes from 234 bacterial and several organellar genomes. These sequences, in an aligned or unaligned format, may be used for the identification and annotation of tRNA (CAT) genes in other genomes. The first strategy is based on the position of the problem sequences in a phenogram (a tree-like network), the second on the minimum average number of differences against the tRNA sequences of the three types and the third on the search for the highest score value against the profiles of the three types of tRNA genes. The species with the maximum number of tRNAfMet and tRNAMet was Photobacterium profundum, whereas the genome of one Escherichia coli strain presented the maximum number of tRNAIle (CAT) genes. This last tRNA gene and tilS, encoding an RNA-modifying enzyme, are not essential in bacteria. The acquisition of a tRNAIle (TAT) gene by Mycoplasma mobile has led to the loss of both the tRNAIle (CAT) and the tilS genes. The new tRNA has appropriated the function of decoding AUA codons. PMID:17071718
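The second strategy described in this abstract, assigning a sequence to the type with the minimum average number of differences, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the sequences are short made-up strings rather than real tRNA genes, and all sequences are assumed to be pre-aligned to equal length.

```python
def differences(a, b):
    """Count mismatching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def classify(query, references):
    """Return the type with the lowest mean difference, plus all the means.

    references maps a type label to a list of aligned reference sequences.
    """
    averages = {
        label: sum(differences(query, ref) for ref in refs) / len(refs)
        for label, refs in references.items()
    }
    best = min(averages, key=averages.get)
    return best, averages


# Toy reference sets (made-up sequences, for illustration only).
refs = {
    "tRNAIle": ["ACGTAC", "ACGTAA"],
    "tRNAMet": ["TTGTCC", "TTGTCA"],
    "tRNAfMet": ["GGGGGG", "GGGGGA"],
}
best, averages = classify("ACGTAA", refs)
print(best, averages)
```

Averaging over every reference of a type, rather than taking the single closest reference, makes the assignment less sensitive to one unusual reference sequence.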

  18. Identification and analysis of integrons and cassette arrays in bacterial genomes.

    PubMed

    Cury, Jean; Jové, Thomas; Touchon, Marie; Néron, Bertrand; Rocha, Eduardo PC

    2016-06-01

    Integrons recombine gene arrays and favor the spread of antibiotic resistance. Their broader roles in bacterial adaptation remain mysterious, partly due to lack of computational tools. We made a program - IntegronFinder - to identify integrons with high accuracy and sensitivity. IntegronFinder is available as a standalone program and as a web application. It searches for attC sites using covariance models, for integron-integrases using HMM profiles, and for other features (promoters, attI site) using pattern matching. We searched for integrons, integron-integrases lacking attC sites, and clusters of attC sites lacking a neighboring integron-integrase in bacterial genomes. All these elements are especially frequent in genomes of intermediate size. They are missing in some key phyla, such as α-Proteobacteria, which might reflect selection against cell lineages that acquire integrons. The similarity between attC sites is proportional to the number of cassettes in the integron, and is particularly low in clusters of attC sites lacking integron-integrases. The latter are unexpectedly abundant in genomes lacking integron-integrases or their remains, and have a large novel pool of cassettes lacking homologs in the databases. They might represent an evolutionary step between the acquisition of genes within integrons and their stabilization in the new genome. PMID:27130947

  19. Identification and analysis of integrons and cassette arrays in bacterial genomes

    PubMed Central

    Cury, Jean; Jové, Thomas; Touchon, Marie; Néron, Bertrand; Rocha, Eduardo PC

    2016-01-01

    Integrons recombine gene arrays and favor the spread of antibiotic resistance. Their broader roles in bacterial adaptation remain mysterious, partly due to lack of computational tools. We made a program – IntegronFinder – to identify integrons with high accuracy and sensitivity. IntegronFinder is available as a standalone program and as a web application. It searches for attC sites using covariance models, for integron-integrases using HMM profiles, and for other features (promoters, attI site) using pattern matching. We searched for integrons, integron-integrases lacking attC sites, and clusters of attC sites lacking a neighboring integron-integrase in bacterial genomes. All these elements are especially frequent in genomes of intermediate size. They are missing in some key phyla, such as α-Proteobacteria, which might reflect selection against cell lineages that acquire integrons. The similarity between attC sites is proportional to the number of cassettes in the integron, and is particularly low in clusters of attC sites lacking integron-integrases. The latter are unexpectedly abundant in genomes lacking integron-integrases or their remains, and have a large novel pool of cassettes lacking homologs in the databases. They might represent an evolutionary step between the acquisition of genes within integrons and their stabilization in the new genome. PMID:27130947

  20. Genomic context drives transcription of insertion sequences in the bacterial endosymbiont Wolbachia wVulC.

    PubMed

    Cerveau, Nicolas; Gilbert, Clément; Liu, Chao; Garrett, Roger A; Grève, Pierre; Bouchon, Didier; Cordaux, Richard

    2015-06-10

    Transposable elements (TEs) are DNA pieces that are present in almost all the living world at variable genomic density. Due to their mobility and density, TEs are involved in a large array of genomic modifications. In eukaryotes, TE expression has been studied in detail in several species. In prokaryotes, studies of IS expression are generally linked to particular copies that induce a modification of neighboring gene expression. Here we investigated global patterns of IS transcription in the Alphaproteobacterial endosymbiont Wolbachia wVulC, using both RT-PCR and bioinformatic analyses. We detected several transcriptional promoters in all IS groups. Nevertheless, only one of the potentially functional IS groups possesses a promoter located upstream of the transposase gene, that could lead up to the production of a functional protein. We found that the majority of IS groups are expressed whatever their functional status. RT-PCR analyses indicate that the transcription of two IS groups lacking internal promoters upstream of the transposase start codon may be driven by the genomic environment. We confirmed this observation with the transcription analysis of individual copies of one IS group. These results suggest that the genomic environment is important for IS expression and it could explain, at least partly, copy number variability of the various IS groups present in the wVulC genome and, more generally, in bacterial genomes. PMID:25813874

  1. Generalized bacterial genome editing using mobile group II introns and Cre-lox

    PubMed Central

    Enyeart, Peter J; Chirieleison, Steven M; Dao, Mai N; Perutka, Jiri; Quandt, Erik M; Yao, Jun; Whitt, Jacob T; Keatinge-Clay, Adrian T; Lambowitz, Alan M; Ellington, Andrew D

    2013-01-01

    Efficient bacterial genetic engineering approaches with broad-host applicability are rare. We combine two systems, mobile group II introns (‘targetrons’) and Cre/lox, which function efficiently in many different organisms, into a versatile platform we call GETR (Genome Editing via Targetrons and Recombinases). The introns deliver lox sites to specific genomic loci, enabling genomic manipulations. Efficiency is enhanced by adding flexibility to the RNA hairpins formed by the lox sites. We use the system for insertions, deletions, inversions, and one-step cut-and-paste operations. We demonstrate insertion of a 12-kb polyketide synthase operon into the lacZ gene of Escherichia coli, multiple simultaneous and sequential deletions of up to 120 kb in E. coli and Staphylococcus aureus, inversions of up to 1.2 Mb in E. coli and Bacillus subtilis, and one-step cut-and-pastes for translocating 120 kb of genomic sequence to a site 1.5 Mb away. We also demonstrate the simultaneous delivery of lox sites into multiple loci in the Shewanella oneidensis genome. No selectable markers need to be placed in the genome, and the efficiency of Cre-mediated manipulations typically approaches 100%. PMID:24002656

  2. Applying Shannon's information theory to bacterial and phage genomes and metagenomes.

    PubMed

    Akhter, Sajia; Bailey, Barbara A; Salamon, Peter; Aziz, Ramy K; Edwards, Robert A

    2013-01-01

    All sequence data contain inherent information that can be measured by Shannon's uncertainty theory. Such measurement is valuable in evaluating large data sets, such as metagenomic libraries, to prioritize their analysis and annotation, thus saving computational resources. Here, Shannon's index of complete phage and bacterial genomes was examined. The information content of a genome was found to be highly dependent on the genome length, GC content, and sequence word size. In metagenomic sequences, the amount of information correlated with the number of matches found by comparison to sequence databases. A sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database. Measuring uncertainty may be used for rapid screening for sequences with matches in available databases, prioritizing computational resources, and indicating which sequences with no known similarities are likely to be important for more detailed analysis. PMID:23301154
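
    As a rough illustration of the measurement described above, the following sketch computes Shannon's uncertainty (in bits) over the overlapping words of length k in a sequence. The abstract does not give the exact formula or normalization the authors used, so the function name, the overlapping-word counting, and the base-2 logarithm are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy (bits) of the overlapping k-letter words in `seq`."""
    seq = seq.upper()
    words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not words:
        raise ValueError("sequence is shorter than the word size k")
    counts = Counter(words)
    total = len(words)
    # H = -sum(p * log2(p)) over the observed word frequencies p.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


h_uniform = shannon_entropy("ACGT")       # four equally frequent letters
h_repeat = shannon_entropy("AAAA")        # a single repeated letter
h_dimers = shannon_entropy("ACGT", k=2)   # three distinct 2-letter words
print(h_uniform, h_repeat, h_dimers)
```

    Changing k shows the dependence on word size mentioned in the abstract: the same sequence yields different values for single letters and for longer words.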

  3. ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes

    PubMed Central

    Didelot, Xavier; Wilson, Daniel J.

    2015-01-01

    Recombination is an important evolutionary force in bacteria, but it remains challenging to reconstruct the imports that occurred in the ancestry of a genomic sample. Here we present ClonalFrameML, which uses maximum likelihood inference to simultaneously detect recombination in bacterial genomes and account for it in phylogenetic reconstruction. ClonalFrameML can analyse hundreds of genomes in a matter of hours, and we demonstrate its usefulness on simulated and real datasets. We find evidence for recombination hotspots associated with mobile elements in Clostridium difficile ST6 and a previously undescribed 310kb chromosomal replacement in Staphylococcus aureus ST582. ClonalFrameML is freely available at http://clonalframeml.googlecode.com/. PMID:25675341

  4. C-Sibelia: an easy-to-use and highly accurate tool for bacterial genome comparison

    PubMed Central

    Minkin, Ilya; Pham, Hoa; Starostina, Ekaterina; Vyahhi, Nikolay; Pham, Son

    2013-01-01

    We present C-Sibelia, a highly accurate and easy-to-use software tool for comparing two closely related bacterial genomes, which can be presented as either finished sequences or fragmented assemblies. C-Sibelia takes as input two FASTA files and produces: (1) a VCF file containing all identified single nucleotide variations and indels; (2) an XMFA file containing alignment information. The software also produces Circos diagrams visualizing high level genomic architecture for rearrangement analyses. C-Sibelia is a part of the Sibelia comparative genomics suite, which is freely available under the GNU GPL v.2 license at http://sourceforge.net/projects/sibelia-bio. C-Sibelia is compatible with Unix-like operating systems. A web-based version of the software is available at http://etool.me/software/csibelia. PMID:25110578
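
    A VCF file such as the one described above could be summarized with a few lines of Python. The sketch below relies only on the standard VCF column order (CHROM, POS, ID, REF, ALT, ...); the function name and the example records are hypothetical, and nothing here is taken from C-Sibelia's actual output.

```python
def count_variants(vcf_lines):
    """Count single nucleotide variants and indels in VCF data lines."""
    counts = {"snv": 0, "indel": 0, "other": 0}
    for line in vcf_lines:
        # Skip blank lines and header lines (those starting with '#').
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        ref = fields[3]
        # ALT may list several alleles separated by commas.
        for alt in fields[4].split(","):
            if len(ref) == 1 and len(alt) == 1:
                counts["snv"] += 1
            elif len(ref) != len(alt):
                counts["indel"] += 1
            else:
                counts["other"] += 1  # e.g. multi-nucleotide substitutions
    return counts


# Made-up records for illustration only.
EXAMPLE = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\t.",
    "chr1\t250\t.\tC\tT\t.\t.\t.",
    "chr1\t400\t.\tGA\tG\t.\t.\t.",
]

summary = count_variants(EXAMPLE)
print(summary)
```

    To run it on a real file, the lines of the opened file can be passed in directly, for example `count_variants(open(path))` with `path` pointing at the VCF.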

  5. Genome resequencing in Populus: Revealing large-scale genome variation and implications on specialized-trait genomics

    SciTech Connect

    Muchero, Wellington; Labbe, Jessy L; Priya, Ranjan; DiFazio, Steven P; Tuskan, Gerald A

    2014-01-01

    To date, Populus ranks among a few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied in molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing individual genotypes, which in turn facilitates large-scale SNP discovery and identification of large-scale polymorphisms, are key determinants of future success in these initiatives. In this treatise we discuss implications of genome sequence-enabled technologies on Populus genomic and genetic studies of complex and specialized traits.

  6. Functional Convergence in Reduced Genomes of Bacterial Symbionts Spanning 200 My of Evolution

    PubMed Central

    McCutcheon, John P.; Moran, Nancy A.

    2010-01-01

    The main genomic changes in the evolution of host-restricted microbial symbionts are ongoing inactivation and loss of genes combined with rapid sequence evolution and extreme structural stability; these changes reflect high levels of genetic drift due to small population sizes and strict clonality. This genomic erosion includes irreversible loss of genes in many functional categories and can include genes that underlie the nutritional contributions to hosts that are the basis of the symbiotic association. Candidatus Sulcia muelleri is an ancient symbiont of sap-feeding insects and is typically coresident with another bacterial symbiont that varies among host subclades. Previously sequenced Sulcia genomes retain pathways for the same eight essential amino acids, whereas coresident symbionts synthesize the remaining two. Here, we describe a dual symbiotic system consisting of Sulcia and a novel species of Betaproteobacteria, Candidatus Zinderia insecticola, both living in the spittlebug Clastoptera arizonana. This Sulcia has completely lost the pathway for the biosynthesis of tryptophan and, therefore, retains the ability to make only 7 of the 10 essential amino acids. Zinderia has a tiny genome (208 kb) and the most extreme nucleotide base composition (13.5% G + C) reported to date, yet retains the ability to make the remaining three essential amino acids, perfectly complementing capabilities of the coresident Sulcia. Combined with the results from related symbiotic systems with complete genomes, these data demonstrate the critical role that bacterial symbionts play in the host insect’s biology and reveal one outcome following the loss of a critical metabolic activity through genome reduction. PMID:20829280

  7. Functional convergence in reduced genomes of bacterial symbionts spanning 200 My of evolution.

    PubMed

    McCutcheon, John P; Moran, Nancy A

    2010-01-01

    The main genomic changes in the evolution of host-restricted microbial symbionts are ongoing inactivation and loss of genes combined with rapid sequence evolution and extreme structural stability; these changes reflect high levels of genetic drift due to small population sizes and strict clonality. This genomic erosion includes irreversible loss of genes in many functional categories and can include genes that underlie the nutritional contributions to hosts that are the basis of the symbiotic association. Candidatus Sulcia muelleri is an ancient symbiont of sap-feeding insects and is typically coresident with another bacterial symbiont that varies among host subclades. Previously sequenced Sulcia genomes retain pathways for the same eight essential amino acids, whereas coresident symbionts synthesize the remaining two. Here, we describe a dual symbiotic system consisting of Sulcia and a novel species of Betaproteobacteria, Candidatus Zinderia insecticola, both living in the spittlebug Clastoptera arizonana. This Sulcia has completely lost the pathway for the biosynthesis of tryptophan and, therefore, retains the ability to make only 7 of the 10 essential amino acids. Zinderia has a tiny genome (208 kb) and the most extreme nucleotide base composition (13.5% G + C) reported to date, yet retains the ability to make the remaining three essential amino acids, perfectly complementing capabilities of the coresident Sulcia. Combined with the results from related symbiotic systems with complete genomes, these data demonstrate the critical role that bacterial symbionts play in the host insect's biology and reveal one outcome following the loss of a critical metabolic activity through genome reduction. PMID:20829280

  8. Genome Scale Transcriptomics of Baculovirus-Insect Interactions

    PubMed Central

    Nguyen, Quan; Nielsen, Lars K.; Reid, Steven

    2013-01-01

    Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies. PMID:24226166

  9. Large-scale genomic sequencing of extraintestinal pathogenic Escherichia coli strains

    PubMed Central

    Salipante, Stephen J.; Roach, David J.; Kitzman, Jacob O.; Snyder, Matthew W.; Stackhouse, Bethany; Butler-Wu, Susan M.; Lee, Choli; Cookson, Brad T.

    2015-01-01

    Large-scale bacterial genome sequencing efforts to date have provided limited information on the most prevalent category of disease: sporadically acquired infections caused by common pathogenic bacteria. Here, we performed whole-genome sequencing and de novo assembly of 312 blood- or urine-derived isolates of extraintestinal pathogenic (ExPEC) Escherichia coli, a common agent of sepsis and community-acquired urinary tract infections, obtained during the course of routine clinical care at a single institution. We find that ExPEC E. coli are highly genomically heterogeneous, consistent with pan-genome analyses encompassing the larger species. Investigation of differential virulence factor content and antibiotic resistance phenotypes reveals markedly different profiles among lineages and among strains infecting different body sites. We use high-resolution molecular epidemiology to explore the dynamics of infections at the level of individual patients, including identification of possible person-to-person transmission. Notably, a limited number of discrete lineages caused the majority of bloodstream infections, including one subclone (ST131-H30) responsible for 28% of bacteremic E. coli infections over a 3-yr period. We additionally use a microbial genome-wide-association study (GWAS) approach to identify individual genes responsible for antibiotic resistance, successfully recovering known genes but notably not identifying any novel factors. We anticipate that in the near future, whole-genome sequencing of microorganisms associated with clinical disease will become routine. Our study reveals what kind of information can be obtained from sequencing clinical isolates on a large scale, even well-characterized organisms such as E. coli, and provides insight into how this information might be utilized in a healthcare setting. PMID:25373147

  10. Large-scale data mining pilot project in human genome

    SciTech Connect

    Musick, R.; Fidelis, R.; Slezak, T.

    1997-05-01

    This whitepaper briefly describes a new, aggressive effort in large-scale data mining at Livermore National Labs. The implications of 'large-scale' will be clarified Section. In the short term, this effort will focus on several mission-critical questions of Genome project. We will adapt current data mining techniques to the Genome domain, to quantify the accuracy of inference results, and lay the groundwork for a more extensive effort in large-scale data mining. A major aspect of the approach is that we will be fully-staffed data warehousing effort in the human Genome area. The long term goal is strong applications-oriented research program in large-scale data mining. The tools, skill set gained will be directly applicable to a wide spectrum of tasks involving a for large spatial and multidimensional data. This includes applications in ensuring non-proliferation, stockpile stewardship, enabling Global Ecology (Materials Database Industrial Ecology), advancing the Biosciences (Human Genome Project), and supporting data for others (Battlefield Management, Health Care).

  11. Application of Whole-Genome Sequencing for Bacterial Strain Typing in Molecular Epidemiology

    PubMed Central

    SenGupta, Dhruba J.; Cummings, Lisa A.; Land, Tyler A.; Hoogestraat, Daniel R.; Cookson, Brad T.

    2015-01-01

    Nosocomial infections pose a significant threat to patient health; however, the gold standard laboratory method for determining bacterial relatedness (pulsed-field gel electrophoresis [PFGE]) remains essentially unchanged 20 years after its introduction. Here, we explored bacterial whole-genome sequencing (WGS) as an alternative approach for molecular strain typing. We compared WGS to PFGE for investigating presumptive outbreaks involving three important pathogens: vancomycin-resistant Enterococcus faecium (n = 19), methicillin-resistant Staphylococcus aureus (n = 17), and Acinetobacter baumannii (n = 15). WGS was highly reproducible (average ≤ 0.39 differences between technical replicates), which enabled a functional, quantitative definition for determining clonality. Strain relatedness data determined by PFGE and WGS roughly correlated, but the resolution of WGS was superior (P = 5.6 × 10⁻⁸ to 0.016). Several discordant results were noted between the methods. A total of 28.9% of isolates which were indistinguishable by PFGE were nonclonal by WGS. For A. baumannii, a species known to undergo rapid horizontal gene transfer, 16.2% of isolate pairs considered nonidentical by PFGE were clonal by WGS. Sequencing whole bacterial genomes with single-nucleotide resolution demonstrates that PFGE is prone to false-positive and false-negative results and suggests the need for a new gold standard approach for molecular epidemiological strain typing. PMID:25631811

  12. Application of whole-genome sequencing for bacterial strain typing in molecular epidemiology.

    PubMed

    Salipante, Stephen J; SenGupta, Dhruba J; Cummings, Lisa A; Land, Tyler A; Hoogestraat, Daniel R; Cookson, Brad T

    2015-04-01

    Nosocomial infections pose a significant threat to patient health; however, the gold standard laboratory method for determining bacterial relatedness (pulsed-field gel electrophoresis [PFGE]) remains essentially unchanged 20 years after its introduction. Here, we explored bacterial whole-genome sequencing (WGS) as an alternative approach for molecular strain typing. We compared WGS to PFGE for investigating presumptive outbreaks involving three important pathogens: vancomycin-resistant Enterococcus faecium (n = 19), methicillin-resistant Staphylococcus aureus (n = 17), and Acinetobacter baumannii (n = 15). WGS was highly reproducible (average ≤ 0.39 differences between technical replicates), which enabled a functional, quantitative definition for determining clonality. Strain relatedness data determined by PFGE and WGS roughly correlated, but the resolution of WGS was superior (P = 5.6 × 10⁻⁸ to 0.016). Several discordant results were noted between the methods. A total of 28.9% of isolates which were indistinguishable by PFGE were nonclonal by WGS. For A. baumannii, a species known to undergo rapid horizontal gene transfer, 16.2% of isolate pairs considered nonidentical by PFGE were clonal by WGS. Sequencing whole bacterial genomes with single-nucleotide resolution demonstrates that PFGE is prone to false-positive and false-negative results and suggests the need for a new gold standard approach for molecular epidemiological strain typing. PMID:25631811

  13. Bacterial genospecies that are not ecologically coherent: population genomics of Rhizobium leguminosarum

    PubMed Central

    Kumar, Nitin; Lad, Ganesh; Giuntini, Elisa; Kaye, Maria E.; Udomwong, Piyachat; Shamsani, N. Jannah; Young, J. Peter W.; Bailly, Xavier

    2015-01-01

    Biological species may remain distinct because of genetic isolation or ecological adaptation, but these two aspects do not always coincide. To establish the nature of the species boundary within a local bacterial population, we characterized a sympatric population of the bacterium Rhizobium leguminosarum by genomic sequencing of 72 isolates. Although all strains have 16S rRNA typical of R. leguminosarum, they fall into five genospecies by the criterion of average nucleotide identity (ANI). Many genes, on plasmids as well as the chromosome, support this division: recombination of core genes has been largely within genospecies. Nevertheless, variation in ecological properties, including symbiotic host range and carbon-source utilization, cuts across these genospecies, so that none of these phenotypes is diagnostic of genospecies. This phenotypic variation is conferred by mobile genes. The genospecies meet the Mayr criteria for biological species in respect of their core genes, but do not correspond to coherent ecological groups, so periodic selection may not be effective in purging variation within them. The population structure is incompatible with traditional ‘polyphasic taxonomy’ that requires bacterial species to have both phylogenetic coherence and distinctive phenotypes. More generally, genomics has revealed that many bacterial species share adaptive modules by horizontal gene transfer, and we envisage a more consistent taxonomic framework that explicitly recognizes this. Significant phenotypes should be recognized as ‘biovars’ within species that are defined by core gene phylogeny. PMID:25589577

  14. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins.

    PubMed

    Croucher, Nicholas J; Page, Andrew J; Connor, Thomas R; Delaney, Aidan J; Keane, Jacqueline A; Bentley, Stephen D; Parkhill, Julian; Harris, Simon R

    2015-02-18

    The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates' recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X. PMID:25414349
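
    The idea of flagging loci with elevated densities of base substitutions can be sketched with a simple sliding window. This is a deliberately simplified stand-in, not the spatial scanning statistic or the iterative tree-building that Gubbins uses; the function name, window size, step, and threshold factor are all assumptions chosen for illustration.

```python
def dense_windows(snp_positions, genome_length, window=1000, step=500, factor=3.0):
    """Flag windows whose SNP count exceeds `factor` times the expected count.

    The expected count assumes substitutions are spread uniformly along the
    genome, which is a simplification.
    """
    mean_density = len(snp_positions) / genome_length
    threshold = factor * mean_density * window
    flagged = []
    for start in range(0, max(genome_length - window, 0) + 1, step):
        end = start + window
        n = sum(start <= p < end for p in snp_positions)
        if n > threshold:
            flagged.append((start, end, n))
    return flagged


# Made-up SNP coordinates with a cluster between positions 3000 and 3500.
SNPS = [120, 3010, 3050, 3100, 3180, 3300, 3420, 7600, 9100]

flagged = dense_windows(SNPS, genome_length=10_000)
print(flagged)
```

    A real analysis would assess each candidate region statistically and per branch of the phylogeny rather than with a fixed multiplier.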

  15. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins

    PubMed Central

    Croucher, Nicholas J.; Page, Andrew J.; Connor, Thomas R.; Delaney, Aidan J.; Keane, Jacqueline A.; Bentley, Stephen D.; Parkhill, Julian; Harris, Simon R.

    2015-01-01

    The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates’ recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X. PMID:25414349

  16. Conserved gene clusters in bacterial genomes provide further support for the primacy of RNA

    NASA Technical Reports Server (NTRS)

    Siefert, J. L.; Martin, K. A.; Abdi, F.; Widger, W. R.; Fox, G. E.

    1997-01-01

    Five complete bacterial genome sequences have been released to the scientific community. These include four (eu)Bacteria, Haemophilus influenzae, Mycoplasma genitalium, M. pneumoniae, and Synechocystis PCC 6803, as well as one Archaeon, Methanococcus jannaschii. Features of organization shared by these genomes are likely to have arisen very early in the history of the bacteria and thus can be expected to provide further insight into the nature of early ancestors. Results of a genome comparison of these five organisms confirm earlier observations that gene order is remarkably unpreserved. There are, nevertheless, at least 16 clusters of two or more genes whose order remains the same among the four (eu)Bacteria and these are presumed to reflect conserved elements of coordinated gene expression that require gene proximity. Eight of these gene orders are essentially conserved in the Archaea as well. Many of these clusters are known to be regulated by RNA-level mechanisms in Escherichia coli, which supports the earlier suggestion that this type of regulation of gene expression may have arisen very early. We conclude that although the last common ancestor may have had a DNA genome, it likely was preceded by progenotes with an RNA genome.

  17. Towards a more accurate annotation of tyrosine-based site-specific recombinases in bacterial genomes

    PubMed Central

    2012-01-01

    Background: Tyrosine-based site-specific recombinases (TBSSRs) are DNA breaking-rejoining enzymes. In bacterial genomes, they play a major role in the comings and goings of mobile genetic elements (MGEs), such as temperate phage genomes, integrated conjugative elements (ICEs) or integron cassettes. TBSSRs are also involved in the segregation of plasmids and chromosomes, the resolution of plasmid dimers and of co-integrates resulting from the replicative transposition of transposons. With the aim of improving the annotation of TBSSR genes in genomic sequences and databases, which so far is far from robust, we built a set of over 1,300 TBSSR protein sequences tagged with their genome of origin. We organized them in families to investigate: i) whether TBSSRs tend to be more conserved within than between classes of MGE types and ii) whether the (sub)families may help in understanding more about the function of TBSSRs associated in tandem or trios on plasmids and chromosomes. Results: A total of 67% of the TBSSRs in our set are MGE type specific. We define a new class of actinobacterial transposons, related to Tn554, containing one abnormally long TBSSR and one of typical size, and we further characterize numerous TBSSR trios present in plasmids and chromosomes of α- and β-proteobacteria. Conclusions: The simple in silico procedure described here, which uses a set of reference TBSSRs from defined MGE types, could contribute to greatly improving the annotation of tyrosine-based site-specific recombinases in plasmid, (pro)phage and other integrated MGE genomes. It also reveals TBSSR families whose distribution among bacterial taxa suggests they mediate lateral gene transfer. PMID:22502997

  18. Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations

    PubMed Central

    Bendall, Matthew L; Stevens, Sarah LR; Chan, Leong-Keat; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Froula, Jeff; Kang, Dongwan; Tringe, Susannah G; Bertilsson, Stefan; Moran, Mary A; Shade, Ashley; Newton, Ryan J; McMahon, Katherine D; Malmstrom, Rex R

    2016-01-01

    Multiple models describe the formation and evolution of distinct microbial phylogenetic groups. These evolutionary models make different predictions regarding how adaptive alleles spread through populations and how genetic diversity is maintained. Processes predicted by competing evolutionary models, for example, genome-wide selective sweeps vs gene-specific sweeps, could be captured in natural populations using time-series metagenomics if the approach were applied over a sufficiently long time frame. Direct observations of either process would help resolve how distinct microbial groups evolve. Here, from a 9-year metagenomic study of a freshwater lake (2005–2013), we explore changes in single-nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in 30 bacterial populations. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied by >1000-fold among populations. SNP allele frequencies also changed dramatically over time within some populations. Interestingly, nearly all SNP variants were slowly purged over several years from one population of green sulfur bacteria, while at the same time multiple genes either swept through or were lost from this population. These patterns were consistent with a genome-wide selective sweep in progress, a process predicted by the 'ecotype model' of speciation but not previously observed in nature. In contrast, other populations contained large, SNP-free genomic regions that appear to have swept independently through the populations prior to the study without purging diversity elsewhere in the genome. Evidence for both genome-wide and gene-specific sweeps suggests that different models of bacterial speciation may apply to different populations coexisting in the same environment. PMID:26744812

  19. Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations.

    PubMed

    Bendall, Matthew L; Stevens, Sarah Lr; Chan, Leong-Keat; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Froula, Jeff; Kang, Dongwan; Tringe, Susannah G; Bertilsson, Stefan; Moran, Mary A; Shade, Ashley; Newton, Ryan J; McMahon, Katherine D; Malmstrom, Rex R

    2016-07-01

    Multiple models describe the formation and evolution of distinct microbial phylogenetic groups. These evolutionary models make different predictions regarding how adaptive alleles spread through populations and how genetic diversity is maintained. Processes predicted by competing evolutionary models, for example, genome-wide selective sweeps vs gene-specific sweeps, could be captured in natural populations using time-series metagenomics if the approach were applied over a sufficiently long time frame. Direct observations of either process would help resolve how distinct microbial groups evolve. Here, from a 9-year metagenomic study of a freshwater lake (2005-2013), we explore changes in single-nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in 30 bacterial populations. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied by >1000-fold among populations. SNP allele frequencies also changed dramatically over time within some populations. Interestingly, nearly all SNP variants were slowly purged over several years from one population of green sulfur bacteria, while at the same time multiple genes either swept through or were lost from this population. These patterns were consistent with a genome-wide selective sweep in progress, a process predicted by the 'ecotype model' of speciation but not previously observed in nature. In contrast, other populations contained large, SNP-free genomic regions that appear to have swept independently through the populations prior to the study without purging diversity elsewhere in the genome. Evidence for both genome-wide and gene-specific sweeps suggests that different models of bacterial speciation may apply to different populations coexisting in the same environment. PMID:26744812

  20. A genomic perspective on a new bacterial genus and species from the Alcaligenaceae family, Basilea psittacipulmonis

    PubMed Central

    2014-01-01

    Background A novel Gram-negative, non-haemolytic, non-motile, rod-shaped bacterium was discovered in the lungs of a dead parakeet (Melopsittacus undulatus) that was kept in captivity in a petshop in Basel, Switzerland. The organism is described with a chemotaxonomic profile and the nearly complete genome sequence obtained through the assembly of short sequence reads. Results Genome sequence analysis and characterization of respiratory quinones, fatty acids, polar lipids, and biochemical phenotype is presented here. Comparison of gene sequences revealed that the most similar species is Pelistega europaea, with BLAST identities of only 93% to the 16S rDNA gene, 76% identity to the rpoB gene, and a similar GC content (~43%) as the organism isolated from the parakeet, DSM 24701 (40%). The closest full genome sequences are those of Bordetella spp. and Taylorella spp. High-throughput sequencing reads from the Illumina-Solexa platform were assembled with the Edena de novo assembler to form 195 contigs comprising the ~2 Mb genome. Genome annotation with RAST, construction of phylogenetic trees with the 16S rDNA (rrs) gene sequence and the rpoB gene, and phylogenetic placement using other highly conserved marker genes with ML Tree all suggest that the bacterial species belongs to the Alcaligenaceae family. Analysis of samples from cages with healthy parakeets suggested that the newly discovered bacterial species is not widespread in parakeet living quarters. Conclusions Classification of this organism in the current taxonomy system requires the formation of a new genus and species. We designate the new genus Basilea and the new species psittacipulmonis. The type strain of Basilea psittacipulmonis is DSM 24701 (= CIP 110308 T, 16S rDNA gene sequence Genbank accession number JX412111 and GI 406042063). PMID:24581117

  1. A Gene-By-Gene Approach to Bacterial Population Genomics: Whole Genome MLST of Campylobacter.

    PubMed

    Sheppard, Samuel K; Jolley, Keith A; Maiden, Martin C J

    2012-01-01

    Campylobacteriosis remains a major human public health problem world-wide. Genetic analyses of Campylobacter isolates, and particularly molecular epidemiology, have been central to the study of this disease, particularly the characterization of Campylobacter genotypes isolated from human infection, farm animals, and retail food. These studies have demonstrated that Campylobacter populations are highly structured, with distinct genotypes associated with particular wild or domestic animal sources, and that chicken meat is the most likely source of most human infection in countries such as the UK. The availability of multiple whole genome sequences from Campylobacter isolates presents the prospect of identifying those genes or allelic variants responsible for host-association and increased human disease risk, but the diversity of Campylobacter genomes present challenges for such analyses. We present a gene-by-gene approach for investigating the genetic basis of phenotypes in diverse bacteria such as Campylobacter, implemented with the BIGSdb software on the pubMLST.org/campylobacter website. PMID:24704917

  2. Construction of an infectious clone of canine herpesvirus genome as a bacterial artificial chromosome.

    PubMed

    Arii, Jun; Hushur, Orkash; Kato, Kentaro; Kawaguchi, Yasushi; Tohya, Yukinobu; Akashi, Hiroomi

    2006-04-01

    Canine herpesvirus (CHV) is an attractive candidate not only for use as a recombinant vaccine to protect dogs from a variety of canine pathogens but also as a viral vector for gene therapy in domestic animals. However, developments in this area have been impeded by the complicated techniques used for eukaryotic homologous recombination. To overcome these problems, we used bacterial artificial chromosomes (BACs) to generate infectious BACs. Our findings may be summarized as follows: (i) the CHV genome (pCHV/BAC), in which a BAC flanked by loxP sites was inserted into the thymidine kinase gene, was maintained in Escherichia coli; (ii) transfection of pCHV/BAC into A-72 cells resulted in the production of infectious virus; (iii) the BAC vector sequence was almost perfectly excisable from the genome of the reconstituted virus CHV/BAC by co-infection with CHV/BAC and a recombinant adenovirus that expressed the Cre recombinase; and (iv) a recombinant virus in which the glycoprotein C gene was deleted was generated by lambda recombination followed by Flp recombination, which resulted in a reduction in viral titer compared with that of the wild-type virus. The infectious clone pCHV/BAC is useful for the modification of the CHV genome using bacterial genetics, and CHV/BAC should have multiple applications in the rapid generation of genetically engineered CHV recombinants and the development of CHV vectors for vaccination and gene therapy in domestic animals. PMID:16515874

  3. De novo bacterial genome sequencing: Millions of very short reads assembled on a desktop computer

    PubMed Central

    Hernandez, David; François, Patrice; Farinelli, Laurent; Østerås, Magne; Schrenzel, Jacques

    2008-01-01

    Novel high-throughput DNA sequencing technologies allow researchers to characterize a bacterial genome during a single experiment and at a moderate cost. However, the increase in sequencing throughput that is allowed by using such platforms is obtained at the expense of individual sequence read length, which must be assembled into longer contigs to be exploitable. This study focuses on the Illumina sequencing platform that produces millions of very short sequences that are 35 bases in length. We propose a de novo assembler software that is dedicated to process such data. Based on a classical overlap graph representation and on the detection of potentially spurious reads, our software generates a set of accurate contigs of several kilobases that cover most of the bacterial genome. The assembly results were validated by comparing data sets that were obtained experimentally for Staphylococcus aureus strain MW2 and Helicobacter acinonychis strain Sheeba with that of their published genomes acquired by conventional sequencing of 1.5- to 3.0-kb fragments. We also provide indications that the broad coverage achieved by high-throughput sequencing might allow for the detection of clonal polymorphisms in the set of DNA molecules being sequenced. PMID:18332092
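
    The overlap graph idea mentioned in the abstract above can be illustrated with a toy example. The sketch below is not the authors' assembler; it is a minimal, quadratic-time illustration in which reads are nodes and an edge links read a to read b when a suffix of a exactly matches a prefix of b. The function names, the toy reads, and the minimum overlap length are all assumptions chosen for illustration.

```python
from itertools import permutations


def overlap(a: str, b: str, min_len: int) -> int:
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, provided it is at least `min_len` (assumed >= 1); else 0."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def overlap_graph(reads, min_len):
    """Build a directed overlap graph: read -> list of (next_read, overlap_len)."""
    graph = {}
    for a, b in permutations(reads, 2):
        k = overlap(a, b, min_len)
        if k:
            graph.setdefault(a, []).append((b, k))
    return graph


# Hypothetical 6-base reads (real reads in the study are 35 bases long).
reads = ["ACGTAC", "GTACGG", "ACGGTT"]
graph = overlap_graph(reads, min_len=3)
for src, edges in graph.items():
    print(src, "->", edges)
```

    A production assembler would index reads rather than compare all pairs, and would also handle reverse complements and sequencing errors, which this sketch ignores.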

  4. Correlation Between Heterogeneous Bacterial Attachment Rate Coefficients and Hydraulic Conductivity and Impacts on Field-Scale Bacterial Transport

    SciTech Connect

    Scheibe, Timothy D.

    2002-10-28

    In granular porous media, bacterial transport is often modeled using the advection-dispersion transport equation, modified to account for interactions between the bacteria and grain surfaces (attachment and detachment) using a linear kinetic reaction model. In this paper we examine the relationships among the parameters of the above model in the context of bacterial transport for bioaugmentation. In this context, we wish to quantify the distance to which significant concentrations of bacteria can be transported, as well as the uniformity with which they can be distributed within the subsurface. Because kinetic detachment rates (Kr) are typically much smaller than corresponding attachment rates (Kf), the attachment rate exerts primary control on the distance of bacterial transport. Hydraulic conductivity (K) also plays a significant role because of its direct relationship to the advective velocity and its typically high degree of spatial variability at field scales. Because Kf is related to the velocity, grain size, and porosity of the medium, as is K, we expect that there exists correlation between these two parameters. Previous investigators have assumed a form of correlation between Kf and ln(K) based in part on reparameterization of clean-bed filtration equations in terms of published relations between grain size, effective porosity, and ln(K). The hypotheses examined here are that (1) field-scale relationships between K and Kf can be developed by combining a number of theoretical and empirical results in the context of a heterogeneous aquifer flow model (following a similar approach to previous investigators with some extensions), and (2) correlation between K and Kf will enhance the distance of field-scale bacterial transport in granular aquifers. We test these hypotheses using detailed numerical models and observations of field-scale bacterial transport in a shallow sandy aquifer within the South Oyster Site near Oyster, Virginia, USA.

  5. Development and validation of an rDNA operon based primer walking strategy applicable to de novo bacterial genome finishing.

    PubMed

    Eastman, Alexander W; Yuan, Ze-Chun

    2014-01-01

    Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, unanimously located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provided a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects. PMID

  6. Development and validation of an rDNA operon based primer walking strategy applicable to de novo bacterial genome finishing

    PubMed Central

    Eastman, Alexander W.; Yuan, Ze-Chun

    2015-01-01

    Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, unanimously located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provided a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects. PMID

  7. Accelerating the reconstruction of genome-scale metabolic networks

    PubMed Central

    Notebaart, Richard A; van Enckevort, Frank HJ; Francke, Christof; Siezen, Roland J; Teusink, Bas

    2006-01-01

    Background The genomic information of a species allows for the genome-scale reconstruction of its metabolic capacity. Such a metabolic reconstruction gives support to metabolic engineering, but also to integrative bioinformatics and visualization. Sequence-based automatic reconstructions require extensive manual curation, which can be very time-consuming. Therefore, we present a method to accelerate the time-consuming process of network reconstruction for a query species. The method exploits the availability of well-curated metabolic networks and uses high-resolution predictions of gene equivalency between species, allowing the transfer of gene-reaction associations from curated networks. Results We have evaluated the method using Lactococcus lactis IL1403, for which a genome-scale metabolic network was published recently. We recovered most of the gene-reaction associations (i.e. 74 – 85%) which are incorporated in the published network. Moreover, we predicted over 200 additional genes to be associated to reactions, including genes with unknown function, genes for transporters and genes with specific metabolic reactions, which are good candidates for an extension to the previously published network. In a comparison of our developed method with the well-established approach Pathologic, we predicted 186 additional genes to be associated to reactions. We also predicted a relatively high number of complete conserved protein complexes, which are derived from curated metabolic networks, illustrating the potential predictive power of our method for protein complexes. Conclusion We show that our methodology can be applied to accelerate the reconstruction of genome-scale metabolic networks by taking optimal advantage of existing, manually curated networks. As orthology detection is the first step in the method, only the translated open reading frames (ORFs) of a newly sequenced genome are necessary to reconstruct a metabolic network. When more manually curated metabolic

  8. MEMOSys: Bioinformatics platform for genome-scale metabolic models

    PubMed Central

    2011-01-01

    Background Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results MEMOSys (MEtabolic MOdel research and development System) is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys. PMID:21276275

  9. Genomics reveals historic and contemporary transmission dynamics of a bacterial disease among wildlife and livestock

    USGS Publications Warehouse

    Kamath, Pauline L.; Foster, Jeffrey T.; Drees, Kevin P.; Luikart, Gordon; Quance, Christine; Anderson, Neil J.; Clarke, P. Ryan; Cole, Eric K.; Drew, Mark L.; Edwards, William H.; Rhyan, Jack C.; Treanor, John J.; Wallen, Rick L.; White, Patrick J.; Robbe-Austerman, Suelee; Cross, Paul C.

    2016-01-01

    Whole-genome sequencing has provided fundamental insights into infectious disease epidemiology, but has rarely been used for examining transmission dynamics of a bacterial pathogen in wildlife. In the Greater Yellowstone Ecosystem (GYE), outbreaks of brucellosis have increased in cattle along with rising seroprevalence in elk. Here we use a genomic approach to examine Brucella abortus evolution, cross-species transmission and spatial spread in the GYE. We find that brucellosis was introduced into wildlife in this region at least five times. The diffusion rate varies among Brucella lineages (∼3 to 8 km per year) and over time. We also estimate 12 host transitions from bison to elk, and 5 from elk to bison. Our results support the notion that free-ranging elk are currently a self-sustaining brucellosis reservoir and the source of livestock infections, and that control measures in bison are unlikely to affect the dynamics of unrelated strains circulating in nearby elk populations.

  10. Genomics reveals historic and contemporary transmission dynamics of a bacterial disease among wildlife and livestock.

    PubMed

    Kamath, Pauline L; Foster, Jeffrey T; Drees, Kevin P; Luikart, Gordon; Quance, Christine; Anderson, Neil J; Clarke, P Ryan; Cole, Eric K; Drew, Mark L; Edwards, William H; Rhyan, Jack C; Treanor, John J; Wallen, Rick L; White, Patrick J; Robbe-Austerman, Suelee; Cross, Paul C

    2016-01-01

    Whole-genome sequencing has provided fundamental insights into infectious disease epidemiology, but has rarely been used for examining transmission dynamics of a bacterial pathogen in wildlife. In the Greater Yellowstone Ecosystem (GYE), outbreaks of brucellosis have increased in cattle along with rising seroprevalence in elk. Here we use a genomic approach to examine Brucella abortus evolution, cross-species transmission and spatial spread in the GYE. We find that brucellosis was introduced into wildlife in this region at least five times. The diffusion rate varies among Brucella lineages (∼3 to 8 km per year) and over time. We also estimate 12 host transitions from bison to elk, and 5 from elk to bison. Our results support the notion that free-ranging elk are currently a self-sustaining brucellosis reservoir and the source of livestock infections, and that control measures in bison are unlikely to affect the dynamics of unrelated strains circulating in nearby elk populations. PMID:27165544

  11. Construction and characterization of an eightfold redundant dog genomic bacterial artificial chromosome library.

    PubMed

    Li, R; Mignot, E; Faraco, J; Kadotani, H; Cantanese, J; Zhao, B; Lin, X; Hinton, L; Ostrander, E A; Patterson, D F; de Jong, P J

    1999-05-15

    A large insert canine genomic bacterial artificial chromosome (BAC) library was built from a Doberman pinscher. Approximately 166,000 clones were gridded on nine high-density hybridization filters. Insert analysis of randomly selected clones indicated a mean insert size of 155 kb and predicted 8.1-fold coverage of the canine genome. Two percent of the clones were nonrecombinant. Chromosomal fluorescence in situ hybridization studies of 60 BAC clones indicated no chimerism. The library was hybridized with dog PCR products representing eight genes (ADA, TNFA, GCA, MYB, HOXA, GUSB, THY1, and TOP1). The resulting positive clones were characterized and shown to be compatible with an eightfold redundant library. PMID:10331940
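
    The coverage figure in the abstract above can be sanity-checked with simple arithmetic: expected redundancy is the number of recombinant clones times the mean insert size, divided by the genome size. The sketch below is illustrative only. The abstract does not state the genome size the authors assumed, so the 3.1 Gb value used here is an assumption, as is the function name `library_coverage`.

```python
def library_coverage(n_clones, mean_insert_bp, genome_bp, nonrecombinant_frac=0.0):
    """Expected fold coverage of a clone library.

    coverage = (recombinant clones * mean insert size) / genome size
    """
    recombinant = n_clones * (1.0 - nonrecombinant_frac)
    return recombinant * mean_insert_bp / genome_bp


# Values from the abstract: ~166,000 clones, 155 kb mean insert, 2% nonrecombinant.
# ASSUMPTION: a genome size of 3.1e9 bp, which the abstract does not give.
cov = library_coverage(166_000, 155_000, 3.1e9, nonrecombinant_frac=0.02)
print(f"{cov:.1f}-fold")  # prints "8.1-fold"
```

    Under that assumed genome size the result agrees with the roughly eightfold redundancy reported; a different genome size estimate would shift the figure proportionally.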

  12. Genomics reveals historic and contemporary transmission dynamics of a bacterial disease among wildlife and livestock

    PubMed Central

    Kamath, Pauline L.; Foster, Jeffrey T.; Drees, Kevin P.; Luikart, Gordon; Quance, Christine; Anderson, Neil J.; Clarke, P. Ryan; Cole, Eric K.; Drew, Mark L.; Edwards, William H.; Rhyan, Jack C.; Treanor, John J.; Wallen, Rick L.; White, Patrick J.; Robbe-Austerman, Suelee; Cross, Paul C.

    2016-01-01

    Whole-genome sequencing has provided fundamental insights into infectious disease epidemiology, but has rarely been used for examining transmission dynamics of a bacterial pathogen in wildlife. In the Greater Yellowstone Ecosystem (GYE), outbreaks of brucellosis have increased in cattle along with rising seroprevalence in elk. Here we use a genomic approach to examine Brucella abortus evolution, cross-species transmission and spatial spread in the GYE. We find that brucellosis was introduced into wildlife in this region at least five times. The diffusion rate varies among Brucella lineages (∼3 to 8 km per year) and over time. We also estimate 12 host transitions from bison to elk, and 5 from elk to bison. Our results support the notion that free-ranging elk are currently a self-sustaining brucellosis reservoir and the source of livestock infections, and that control measures in bison are unlikely to affect the dynamics of unrelated strains circulating in nearby elk populations. PMID:27165544

  13. Computer models of bacterial cells: from generalized coarse-grained to genome-specific modular models

    NASA Astrophysics Data System (ADS)

    Nikolaev, Evgeni V.; Atlas, Jordan C.; Shuler, Michael L.

    2006-09-01

    We discuss a modular modelling framework to rapidly develop mathematical models of bacterial cells that would explicitly link genomic details to cell physiology and population response. An initial step in this approach is the development of a coarse-grained model, describing pseudo-chemical interactions between lumped species. A hybrid model of interest can then be constructed by embedding genome-specific detail for a particular cellular subsystem (e.g. central metabolism), called here a module, into the coarse-grained model. Specifically, a new strategy for sensitivity analysis of the cell division limit cycle is introduced to identify which pseudo-molecular processes should be delumped to implement a particular biological function in a growing cell (e.g. ethanol overproduction or pathogen viability). To illustrate the modeling principles and highlight computational challenges, the Cornell coarse-grained model of Escherichia coli B/r-A is used to benchmark the proposed framework.

  14. Reconstruction and analysis of the genome-scale metabolic model of Lactobacillus casei LC2W.

    PubMed

    Xu, Nan; Liu, Jie; Ai, Lianzhong; Liu, Liming

    2015-01-10

    Lactobacillus casei LC2W is a recently isolated probiotic lactic acid bacterial strain, which is widely used in the dairy and pharmaceutical industries and in clinical medicine. The first genome-scale metabolic model for L. casei, composed of 846 genes, 969 metabolic reactions, and 785 metabolites, was reconstructed using both manual genome annotation and an automatic SEED model. Then, the iJL846 model was validated by simulating cell growth on 15 reported carbon sources. The iJL846 model explored the metabolism of L. casei on a genome scale: (1) explanation of the genetic codes: metabolic functions of 342 genes were reannotated in this model; (2) characterization of the physiology: 10 amino acids and 7 vitamins were identified to be essential nutrients for L. casei LC2W growth; (3) analyses of metabolic pathways: the transport and metabolism of the 17 essential nutrients and exopolysaccharide (EPS) biosynthesis were performed; (4) exploration of metabolic capacity was conducted: for lactate, the importance of genes in its biosynthetic pathways was evaluated, and the requirements of amino acids were predicted for mixed acid fermentation; for flavor compounds, the effects of oxygen were analyzed, and three new knockout targets were selected for acetoin production; for EPS, 11 types of nutrients in the rich medium and important reactions in the biosynthetic pathway were identified that enhanced EPS production. In conclusion, the iJL846 model serves as a useful tool for understanding and engineering the metabolism of this probiotic strain. PMID:25452194

  15. Rapid pair-wise synteny analysis of large bacterial genomes using web-based GeneOrder4.0

    PubMed Central

    2010-01-01

    Background The growing whole genome sequence databases necessitate the development of user-friendly software tools to mine these data. Web-based tools are particularly useful to wet-bench biologists as they enable platform-independent analysis of sequence data, without having to perform complex programming tasks and software compiling. Findings GeneOrder4.0 is a web-based "on-the-fly" synteny and gene order analysis tool for comparative bacterial genomics (ca. 8 Mb). It enables the visualization of synteny by plotting protein similarity scores between two genomes and it also provides visual annotation of "hypothetical" proteins from older archived genomes based on more recent annotations. Conclusions The web-based software tool GeneOrder4.0 is a user-friendly application that has been updated to allow the rapid analysis of synteny and gene order in large bacterial genomes. It is developed with the wet-bench researcher in mind. PMID:20178631

  16. Single-molecule approach to bacterial genomic comparisons via optical mapping.

    SciTech Connect

    Zhou, Shiguo; Kile, A.; Bechner, M.; Kvikstad, E.; Deng, W.; Wei, J.; Severin, J.; Runnheim, R.; Churas, C.; Forrest, D.; Dimalanta, E.; Lamers, C.; Burland, V.; Blattner, F. R.; Schwartz, David C.

    2004-01-01

    Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity that maximize discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.

  17. An improved method for oriT-directed cloning and functionalization of large bacterial genomic regions.

    PubMed

    Kvitko, Brian H; McMillan, Ian A; Schweizer, Herbert P

    2013-08-01

    We have made significant improvements to a broad-host-range system for the cloning and manipulation of large bacterial genomic regions based on site-specific recombination between directly repeated oriT sites during conjugation. Using two suicide capture vectors carrying flanking homology regions, oriT sites are recombined on either side of the target region. Using a broad-host-range conjugation helper plasmid, the region between the oriT sites is conjugated into an Escherichia coli recipient strain, where it is circularized and maintained as a chimeric mini-F vector. The cloned target region is functionalized in multiple ways to accommodate downstream manipulation. The target region is flanked with Gateway attB sites for recombination into other vectors and by rare 18-bp I-SceI restriction sites for subcloning. The Tn7-functionalized target can also be inserted at a naturally occurring chromosomal attTn7 site(s) or maintained as a broad-host-range plasmid for complementation or heterologous expression studies. We have used the oriTn7 capture technique to clone and complement Burkholderia pseudomallei genomic regions up to 140 kb in size and have created isogenic Burkholderia strains with various combinations of genomic islands. We believe this system will greatly aid the cloning and genetic analysis of genomic islands, biosynthetic gene clusters, and large open reading frames. PMID:23747708

  18. An Improved Method for oriT-Directed Cloning and Functionalization of Large Bacterial Genomic Regions

    PubMed Central

    Kvitko, Brian H.; McMillan, Ian A.

    2013-01-01

    We have made significant improvements to a broad-host-range system for the cloning and manipulation of large bacterial genomic regions based on site-specific recombination between directly repeated oriT sites during conjugation. Using two suicide capture vectors carrying flanking homology regions, oriT sites are recombined on either side of the target region. Using a broad-host-range conjugation helper plasmid, the region between the oriT sites is conjugated into an Escherichia coli recipient strain, where it is circularized and maintained as a chimeric mini-F vector. The cloned target region is functionalized in multiple ways to accommodate downstream manipulation. The target region is flanked with Gateway attB sites for recombination into other vectors and by rare 18-bp I-SceI restriction sites for subcloning. The Tn7-functionalized target can also be inserted at a naturally occurring chromosomal attTn7 site(s) or maintained as a broad-host-range plasmid for complementation or heterologous expression studies. We have used the oriTn7 capture technique to clone and complement Burkholderia pseudomallei genomic regions up to 140 kb in size and have created isogenic Burkholderia strains with various combinations of genomic islands. We believe this system will greatly aid the cloning and genetic analysis of genomic islands, biosynthetic gene clusters, and large open reading frames. PMID:23747708

  19. Multiplex sequencing of bacterial artificial chromosomes for assembling complex plant genomes.

    PubMed

    Beier, Sebastian; Himmelbach, Axel; Schmutzer, Thomas; Felder, Marius; Taudien, Stefan; Mayer, Klaus F X; Platzer, Matthias; Stein, Nils; Scholz, Uwe; Mascher, Martin

    2016-07-01

    Hierarchical shotgun sequencing remains the method of choice for assembling high-quality reference sequences of complex plant genomes. The efficient exploitation of current high-throughput technologies and powerful computational facilities for large-insert clone sequencing necessitates the sequencing and assembly of a large number of clones in parallel. We developed a multiplexed pipeline for shotgun sequencing and assembling individual bacterial artificial chromosomes (BACs) using the Illumina sequencing platform. We illustrate our approach by sequencing 668 barley BACs (Hordeum vulgare L.) in a single Illumina HiSeq 2000 lane. Using a newly designed parallelized computational pipeline, we obtained sequence assemblies of individual BACs that consist, on average, of eight sequence scaffolds and represent >98% of the genomic inserts. Our BAC assemblies are clearly superior to a whole-genome shotgun assembly regarding contiguity, completeness and the representation of the gene space. Our methods may be employed to rapidly obtain high-quality assemblies of a large number of clones to assemble map-based reference sequences of plant and animal species with complex genomes by sequencing along a minimum tiling path. PMID:26801048

  20. Correlation Between Bacterial Attachment Rate Coefficients and Hydraulic Conductivity and its Effect on Field-Scale Bacterial Transport

    SciTech Connect

    Scheibe, Timothy D.; Dong, Hailiang; Xie, YuLong

    2007-06-01

    It has been widely observed in field experiments that the apparent rate of bacterial attachment, particularly as parameterized by the collision efficiency in filtration-based models, decreases with transport distance (i.e., exhibits scale-dependency). This effect has previously been attributed to microbial heterogeneity; that is, variability in cell-surface properties within a single monoclonal population. We demonstrate that this effect could also be interpreted as a field-scale manifestation of local-scale correlation between physical heterogeneity (hydraulic conductivity variability) and reaction heterogeneity (attachment rate coefficient variability). A field-scale model of bacterial transport developed for the South Oyster field research site located near Oyster, Virginia, and observations from field experiments performed at that site, are used as the basis for this study. Three-dimensional Monte Carlo simulations of bacterial transport were performed under four alternative scenarios: 1) homogeneous hydraulic conductivity (K) and attachment rate coefficient (Kf), 2) heterogeneous K, homogeneous Kf, 3) heterogeneous K and Kf with local correlation based on empirical and theoretical relationships, and 4) heterogeneous K and Kf without local correlation. The results of the 3D simulations were analyzed using 1D model approximations following conventional methods of field data analysis. An apparent decrease with transport distance of effective collision efficiency was observed only in the case where the local properties were both heterogeneous and correlated. This effect was observed despite the fact that the local collision efficiency was specified as a constant in the 3D model, and can therefore be interpreted as a scale effect associated with the local correlated heterogeneity as manifested at the field scale.

  1. Statistical Analysis of Hurst Exponents of Essential/Nonessential Genes in 33 Bacterial Genomes

    PubMed Central

    Liu, Xiao; Wang, Baojin; Xu, Luo

    2015-01-01

    Methods for identifying essential genes currently depend predominantly on biochemical experiments. However, there is demand for improved computational methods for determining gene essentiality. In this study, we used the Hurst exponent, a characteristic parameter to describe long-range correlation in DNA, and analyzed its distribution in 33 bacterial genomes. In most genomes (31 out of 33) the significance levels of the Hurst exponents of the essential genes were significantly higher than for the corresponding full-gene-set, whereas the significance levels of the Hurst exponents of the nonessential genes remained unchanged or increased only slightly. All of the Hurst exponents of essential genes followed a normal distribution, with one exception. We therefore propose that the distribution feature of Hurst exponents of essential genes can be used as a classification index for essential gene prediction in bacteria. For computer-aided design in the field of synthetic biology, this feature can build a restraint for pre- or post-design checking of bacterial essential genes. Moreover, considering the relationship between gene essentiality and evolution, the Hurst exponents could be used as a descriptive parameter related to evolutionary level, or be added to the annotation of each gene. PMID:26067107
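
    The abstract does not say how the Hurst exponent was estimated. One common estimator is rescaled-range (R/S) analysis, sketched below as an illustration only. The purine/pyrimidine encoding, the doubling window sizes, and the function names `dna_to_walk`, `rescaled_range` and `hurst_exponent` are assumptions made for this sketch, not details taken from the study.

```python
import math

def dna_to_walk(seq):
    """Encode purines (A/G) as +1 and pyrimidines (C/T) as -1 (assumed encoding).
    Characters other than A, C, G, T are skipped."""
    return [1.0 if b in "AG" else -1.0 for b in seq.upper() if b in "ACGT"]

def rescaled_range(window):
    """Return the R/S statistic of one window, or None if it has zero variance."""
    n = len(window)
    mean = sum(window) / n
    cum = lo = hi = 0.0
    for v in window:
        cum += v - mean          # cumulative deviation from the window mean
        lo = min(lo, cum)
        hi = max(hi, cum)
    sd = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    return (hi - lo) / sd if sd > 0 else None

def hurst_exponent(series, min_window=8):
    """Estimate H as the least-squares slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    w = min_window
    while w <= len(series) // 2:
        vals = []
        for start in range(0, len(series) - w + 1, w):
            rs = rescaled_range(series[start:start + w])
            if rs:
                vals.append(rs)
        if vals:
            xs.append(math.log(w))
            ys.append(math.log(sum(vals) / len(vals)))
        w *= 2                   # assumed schedule: double the window each step
    if len(xs) < 2:
        raise ValueError("series too short for R/S analysis")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
```

    Under this estimator, values near 0.5 indicate no long-range correlation and larger values indicate persistence. Comparing essential and nonessential genes would mean calling `hurst_exponent(dna_to_walk(gene))` per gene and comparing the two distributions.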

  2. Group-theoretic models of the inversion process in bacterial genomes.

    PubMed

    Egri-Nagy, Attila; Gebhardt, Volker; Tanaka, Mark M; Francis, Andrew R

    2014-07-01

    The variation in genome arrangements among bacterial taxa is largely due to the process of inversion. Recent studies indicate that not all inversions are equally probable, suggesting, for instance, that shorter inversions are more frequent than longer, and those that move the terminus of replication are less probable than those that do not. Current methods for establishing the inversion distance between two bacterial genomes are unable to incorporate such information. In this paper we suggest a group-theoretic framework that in principle can take these constraints into account. In particular, we show that by lifting the problem from circular permutations to the affine symmetric group, the inversion distance can be found in polynomial time for a model in which inversions are restricted to acting on two regions. This requires the proof of new results in group theory, and suggests a vein of new combinatorial problems concerning permutation groups on which group theorists will be needed to collaborate with biologists. We apply the new method to inferring distances and phylogenies for published Yersinia pestis data. PMID:23793228
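
    To make the notion of a restricted inversion concrete, the sketch below represents a circular genome as a tuple of signed region labels and finds the minimum number of short inversions between two arrangements by brute-force breadth-first search. This is not the authors' group-theoretic method: the abstract reports a polynomial-time algorithm, whereas this search is exponential and only practical for a handful of regions. The signed encoding, the `max_len` parameter, and the function names are assumptions made for illustration.

```python
from collections import deque

def invert(genome, start, length):
    """Invert `length` consecutive regions beginning at index `start` on a
    circular genome: their order is reversed and each orientation sign flips."""
    n = len(genome)
    idx = [(start + i) % n for i in range(length)]   # indices wrap around the circle
    flipped = [-genome[i] for i in reversed(idx)]
    out = list(genome)
    for i, v in zip(idx, flipped):
        out[i] = v
    return tuple(out)

def inversion_distance(src, dst, max_len=2):
    """Fewest inversions of 1..max_len regions turning src into dst
    (exhaustive BFS; returns None if dst is unreachable)."""
    src, dst = tuple(src), tuple(dst)
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        genome, dist = queue.popleft()
        if genome == dst:
            return dist
        for start in range(len(genome)):
            for length in range(1, max_len + 1):
                nxt = invert(genome, start, length)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None
```

    For example, `invert((1, 2, 3, 4), 1, 2)` gives `(1, -3, -2, 4)`, so those two arrangements are one inversion apart under this model.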

  3. Gain and Loss of Phototrophic Genes Revealed by Comparison of Two Citromicrobium Bacterial Genomes

    PubMed Central

    Zheng, Qiang; Zhang, Rui; Fogg, Paul C. M.; Beatty, J. Thomas; Wang, Yu; Jiao, Nianzhi

    2012-01-01

    Proteobacteria are thought to have diverged from a phototrophic ancestor, according to the scattered distribution of phototrophy throughout the proteobacterial clade, and so the occurrence of numerous closely related phototrophic and chemotrophic microorganisms may be the result of the loss of genes for phototrophy. A widespread form of bacterial phototrophy is based on the photochemical reaction center, encoded by puf and puh operons that typically are in a ‘photosynthesis gene cluster’ (abbreviated as the PGC) with pigment biosynthesis genes. Comparison of two closely related Citromicrobial genomes (98.1% sequence identity of complete 16S rRNA genes), Citromicrobium sp. JL354, which contains two copies of reaction center genes, and Citromicrobium strain JLT1363, which is chemotrophic, revealed evidence for the loss of phototrophic genes. However, evidence of horizontal gene transfer was found in these two bacterial genomes. An incomplete PGC (pufLMC-puhCBA) in strain JL354 was located within an integrating conjugative element, which indicates a potential mechanism for the horizontal transfer of genes for phototrophy. PMID:22558224

  4. Genome-Scale Studies of Aging: Challenges and Opportunities

    PubMed Central

    McCormick, Mark A; Kennedy, Brian K

    2012-01-01

    Whole-genome studies involving a phenotype of interest are increasingly prevalent, in part due to a dramatic increase in speed at which many high throughput technologies can be performed coupled to simultaneous decreases in cost. This type of genome-scale methodology has been applied to the phenotype of lifespan, as well as to whole-transcriptome changes during the aging process or in mutants affecting aging. The value of high throughput discovery-based science in this field is clearly evident, but will it yield a true systems-level understanding of the aging process? Here we review some of this work to date, focusing on recent findings and the unanswered puzzles to which they point. In this context, we also discuss recent technological advances and some of the likely future directions that they portend. PMID:23633910

  5. 13C metabolic flux analysis at a genome-scale.

    PubMed

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling-up mapping models to a genome-scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using as a basis the iAF1260 model upon eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon was used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between core and GSM model. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expanded by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux was essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM model the unused ATP decreased drastically with the lower bound matching the maintenance ATP requirement. A non

  6. Tracing the Spread of Clostridium difficile Ribotype 027 in Germany Based on Bacterial Genome Sequences

    PubMed Central

    Steglich, Matthias; Nitsche, Andreas; von Müller, Lutz; Herrmann, Mathias; Kohl, Thomas A.; Niemann, Stefan; Nübel, Ulrich

    2015-01-01

    We applied whole-genome sequencing to reconstruct the spatial and temporal dynamics underpinning the expansion of Clostridium difficile ribotype 027 in Germany. Based on re-sequencing of genomes from 57 clinical C. difficile isolates, which had been collected from hospitalized patients at 36 locations throughout Germany between 1990 and 2012, we demonstrate that C. difficile genomes have accumulated sequence variation sufficiently fast to document the pathogen's spread at a regional scale. We detected both previously described lineages of fluoroquinolone-resistant C. difficile ribotype 027, FQR1 and FQR2. Using Bayesian phylogeographic analyses, we show that fluoroquinolone-resistant C. difficile 027 was imported into Germany at least four times, that it had been widely disseminated across multiple federal states even before the first outbreak was noted in 2007, and that it has continued to spread since. PMID:26444881

  7. First genomic insights into members of a candidate bacterial phylum responsible for wastewater bulking

    PubMed Central

    Ohashi, Akiko; Parks, Donovan H.; Yamauchi, Toshihiro; Tyson, Gene W.

    2015-01-01

    Filamentous cells belonging to the candidate bacterial phylum KSB3 were previously identified as the causative agent of fatal filament overgrowth (bulking) in a high-rate industrial anaerobic wastewater treatment bioreactor. Here, we obtained near complete genomes from two KSB3 populations in the bioreactor, including the dominant bulking filament, using differential coverage binning of metagenomic data. Fluorescence in situ hybridization with 16S rRNA-targeted probes specific for the two populations confirmed that both are filamentous organisms. Genome-based metabolic reconstruction and microscopic observation of the KSB3 filaments in the presence of sugar gradients indicate that both filament types are Gram-negative, strictly anaerobic fermenters capable of non-flagellar based gliding motility, and have a strikingly large number of sensory and response regulator genes. We propose that the KSB3 filaments are highly sensitive to their surroundings and that cellular processes, including those causing bulking, are controlled by external stimuli. The obtained genomes lay the foundation for a more detailed understanding of environmental cues used by KSB3 filaments, which may lead to more robust treatment options to prevent bulking. PMID:25650158

  8. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    SciTech Connect

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-06-18

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  9. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    SciTech Connect

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-05-12

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  10. LAF: Logic Alignment Free and its application to bacterial genomes classification.

    PubMed

    Weitschek, Emanuel; Cunial, Fabio; Felici, Giovanni

    2015-01-01

    Alignment-free algorithms can be used to estimate the similarity of biological sequences and hence are often applied to the phylogenetic reconstruction of genomes. Most of these algorithms rely on comparing the frequency of all the distinct substrings of fixed length (k-mers) that occur in the analyzed sequences. In this paper, we present Logic Alignment Free (LAF), a method that combines alignment-free techniques and rule-based classification algorithms in order to assign biological samples to their taxa. This method searches for a minimal subset of k-mers whose relative frequencies are used to build classification models as disjunctive-normal-form logic formulas (if-then rules). We apply LAF successfully to the classification of bacterial genomes to their corresponding taxonomy. In particular, we succeed in obtaining reliable classification at different taxonomic levels by extracting a handful of rules, each one based on the frequency of just few k-mers. State of the art methods to adjust the frequency of k-mers to the character distribution of the underlying genomes have negligible impact on classification performance, suggesting that the signal of each class is strong and that LAF is effective in identifying it. PMID:26664519
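
    The two ingredients described above, relative k-mer frequencies and disjunctive-normal-form (if-then) rules over them, can be sketched as follows. This is an illustrative sketch, not the LAF implementation: the rule encoding as a list of clauses, the `>=`/`<` operators, and the names `kmer_frequencies` and `matches_dnf` are assumptions, and any thresholds used with it would be hypothetical.

```python
from collections import Counter

def kmer_frequencies(seq, k):
    """Relative frequency of every k-mer (overlapping substrings of length k) in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}

def matches_dnf(freqs, rule):
    """Evaluate a DNF rule: `rule` is a list of clauses joined by OR, and each
    clause is a list of (kmer, op, threshold) literals joined by AND.
    `op` is ">=" or "<"; k-mers absent from the sequence have frequency 0."""
    def holds(kmer, op, threshold):
        f = freqs.get(kmer, 0.0)
        return f >= threshold if op == ">=" else f < threshold
    return any(all(holds(*literal) for literal in clause) for clause in rule)
```

    A classifier in this style would hold one such rule per taxon and assign a genome to the taxon whose rule its k-mer frequencies satisfy; learning a minimal set of k-mers and thresholds is the part the paper contributes and is not shown here.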

  11. Insertion sequence-caused large-scale rearrangements in the genome of Escherichia coli

    PubMed Central

    Lee, Heewook; Doak, Thomas G.; Popodi, Ellen; Foster, Patricia L.; Tang, Haixu

    2016-01-01

    A majority of large-scale bacterial genome rearrangements involve mobile genetic elements such as insertion sequence (IS) elements. Here we report novel insertions and excisions of IS elements and recombination between homologous IS elements identified in a large collection of Escherichia coli mutation accumulation lines by analysis of whole genome shotgun sequencing data. Based on 857 identified events (758 IS insertions, 98 recombinations and 1 excision), we estimate that the rate of IS insertion is 3.5 × 10⁻⁴ insertions per genome per generation and the rate of IS homologous recombination is 4.5 × 10⁻⁵ recombinations per genome per generation. These events are mostly contributed by the IS elements IS1, IS2, IS5 and IS186. Spatial analysis of new insertions suggests that transposition is biased to proximal insertions, and the length spectrum of IS-caused deletions is largely explained by local hopping. For any of the ISs studied there is no region of the circular genome that is favored or disfavored for new insertions but there are notable hotspots for deletions. Some elements have preferences for non-coding sequence or for the beginning and end of coding regions, largely explained by target site motifs. Interestingly, transposition and deletion rates remain constant across the wild-type and 12 mutant E. coli lines, each deficient in a distinct DNA repair pathway. Finally, we characterized the target sites of four IS families, confirming previous results and characterizing a highly specific pattern at IS186 target-sites, 5′-GGGG(N6/N7)CCCC-3′. We also detected 48 long deletions not involving IS elements. PMID:27431326
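
    The IS186 target-site pattern can be searched for with a regular expression. The sketch below assumes that "N6/N7" means a spacer of six or seven arbitrary nucleotides, which is an interpretation of the notation rather than something the abstract spells out; the function name `find_is186_sites` is likewise hypothetical. Because the pattern is its own reverse complement, scanning one strand is enough to find candidate sites on both.

```python
import re

# GGGG, then 6 or 7 arbitrary bases (assumed reading of N6/N7), then CCCC.
# The lookahead lets candidate sites that overlap each other all be reported.
IS186_MOTIF = re.compile(r"(?=(GGGG[ACGT]{6,7}CCCC))")

def find_is186_sites(seq):
    """Return (0-based start, matched sequence) for each candidate target site."""
    return [(m.start(), m.group(1)) for m in IS186_MOTIF.finditer(seq.upper())]
```

    At most one match is reported per start position; when both spacer lengths would fit, the seven-base spacer is the one returned.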

  12. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    SciTech Connect

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  13. Genome Sequence of a Copper-Resistant Strain of Acidovorax citrulli Causing Bacterial Fruit Blotch of Melons

    PubMed Central

    Wang, Tielin; Yang, Yuwen

    2015-01-01

    Bacterial fruit blotch (BFB) of melons is a seed-borne disease caused by Acidovorax citrulli. We determined the draft genome of A. citrulli Tw6. The strain was isolated from a watermelon collected from Beijing, China. The A. citrulli Tw6 genome contains 5,080,614 bp and has a G+C content of 68.7 mol%. PMID:25908132

  14. EFFECT OF BACTERIAL SULFATE REDUCTION ON IRON-CORROSION SCALES

    EPA Science Inventory

    Iron-sulfur geochemistry is important in many natural and engineered environments including drinking water systems. In the anaerobic environment beneath scales of corroding iron drinking water distribution system pipes, sulfate reducing bacteria (SRB) produce sulfide from natura...

  15. Genome-scale constraint-based modeling of Geobacter metallireducens

    PubMed Central

    Sun, Jun; Sayyar, Bahareh; Butler, Jessica E; Pharkya, Priti; Fahland, Tom R; Famili, Iman; Schilling, Christophe H; Lovley, Derek R; Mahadevan, Radhakrishnan

    2009-01-01

    Background Geobacter metallireducens was the first organism that can be grown in pure culture to completely oxidize organic compounds with Fe(III) oxide serving as electron acceptor. Geobacter species, including G. sulfurreducens and G. metallireducens, are used for bioremediation and electricity generation from waste organic matter and renewable biomass. The constraint-based modeling approach enables the development of genome-scale in silico models that can predict the behavior of complex biological systems and their responses to the environments. Such a modeling approach was applied to provide physiological and ecological insights on the metabolism of G. metallireducens. Results The genome-scale metabolic model of G. metallireducens was constructed to include 747 genes and 697 reactions. Compared to the G. sulfurreducens model, the G. metallireducens metabolic model contains 118 unique reactions that reflect many of G. metallireducens' specific metabolic capabilities. Detailed examination of the G. metallireducens model suggests that its central metabolism contains several energy-inefficient reactions that are not present in the G. sulfurreducens model. Experimental biomass yield of G. metallireducens growing on pyruvate was lower than the predicted optimal biomass yield. Microarray data of G. metallireducens growing with benzoate and acetate indicated that genes encoding these energy-inefficient reactions were up-regulated by benzoate. These results suggested that the energy-inefficient reactions were likely turned off during G. metallireducens growth with acetate for optimal biomass yield, but were up-regulated during growth with complex electron donors such as benzoate for rapid energy generation. Furthermore, several computational modeling approaches were applied to accelerate G. metallireducens research. For example, growth of G. metallireducens with different electron donors and electron acceptors was studied using the genome-scale metabolic model, which

  16. Spatial Scales of Bacterial Diversity in Cold-Water Coral Reef Ecosystems

    PubMed Central

    Schöttner, Sandra; Wild, Christian; Hoffmann, Friederike; Boetius, Antje; Ramette, Alban

    2012-01-01

    Background Cold-water coral reef ecosystems are recognized as biodiversity hotspots in the deep sea, but insights into their associated bacterial communities are still limited. Deciphering principal patterns of bacterial community variation over multiple spatial scales may however prove critical for a better understanding of factors contributing to cold-water coral reef stability and functioning. Methodology/Principal Findings Bacterial community structure, as determined by Automated Ribosomal Intergenic Spacer Analysis (ARISA), was investigated with respect to (i) microbial habitat type and (ii) coral species and color, as well as the three spatial components (iii) geomorphologic reef zoning, (iv) reef boundary, and (v) reef location. Communities revealed fundamental differences between coral-generated (branch surface, mucus) and ambient microbial habitats (seawater, sediments). This habitat specificity appeared pivotal for determining bacterial community shifts over all other study levels investigated. Coral-derived surfaces showed species-specific patterns, differing significantly between Lophelia pertusa and Madrepora oculata, but not between L. pertusa color types. Within the reef center, no community distinction corresponded to geomorphologic reef zoning for both coral-generated and ambient microbial habitats. Beyond the reef center, however, bacterial communities varied considerably from local to regional scales, with marked shifts toward the reef periphery as well as between different in- and offshore reef sites, suggesting significant biogeographic imprinting but weak microbe-host specificity. Conclusions/Significance This study presents the first multi-scale survey of bacterial diversity in cold-water coral reefs, spanning a total of five observational levels including three spatial scales. It demonstrates that bacterial communities in cold-water coral reefs are structured by multiple factors acting at different spatial scales, which has fundamental

  17. Genomics of Bacterial and Archaeal Viruses: Dynamics within the Prokaryotic Virosphere

    PubMed Central

    Krupovic, Mart; Prangishvili, David; Hendrix, Roger W.; Bamford, Dennis H.

    2011-01-01

    Summary: Prokaryotes, bacteria and archaea, are the most abundant cellular organisms among those sharing the planet Earth with human beings (among others). However, numerous ecological studies have revealed that it is actually prokaryotic viruses that predominate on our planet and outnumber their hosts by at least an order of magnitude. An understanding of how this viral domain is organized and what are the mechanisms governing its evolution is therefore of great interest and importance. The vast majority of characterized prokaryotic viruses belong to the order Caudovirales, double-stranded DNA (dsDNA) bacteriophages with tails. Consequently, these viruses have been studied (and reviewed) extensively from both genomic and functional perspectives. However, albeit numerous, tailed phages represent only a minor fraction of the prokaryotic virus diversity. Therefore, the knowledge which has been generated for this viral system does not offer a comprehensive view of the prokaryotic virosphere. In this review, we discuss all families of bacterial and archaeal viruses that contain more than one characterized member and for which evolutionary conclusions can be attempted by use of comparative genomic analysis. We focus on the molecular mechanisms of their genome evolution as well as on the relationships between different viral groups and plasmids. It becomes clear that evolutionary mechanisms shaping the genomes of prokaryotic viruses vary between different families and depend on the type of the nucleic acid, characteristics of the virion structure, as well as the mode of the life cycle. We also point out that horizontal gene transfer is not equally prevalent in different virus families and is not uniformly unrestricted for diverse viral functions. PMID:22126996

  18. Bacterial origin of a diverse family of UDP-glycosyltransferase genes in the Tetranychus urticae genome.

    PubMed

    Ahn, Seung-Joon; Dermauw, Wannes; Wybouw, Nicky; Heckel, David G; Van Leeuwen, Thomas

    2014-07-01

    UDP-glycosyltransferases (UGTs) catalyze the conjugation of a variety of small lipophilic molecules with uridine diphosphate (UDP) sugars, altering them into more water-soluble metabolites. Thereby, UGTs play an important role in the detoxification of xenobiotics and in the regulation of endobiotics. Recently, the genome sequence was reported for the two-spotted spider mite, Tetranychus urticae, a polyphagous herbivore damaging a number of agricultural crops. Although various gene families implicated in xenobiotic metabolism have been documented in T. urticae, UGTs so far have not. We identified 80 UGT genes in the T. urticae genome, the largest number of UGT genes in a metazoan species reported so far. Phylogenetic analysis revealed that lineage-specific gene expansions increased the diversity of the T. urticae UGT repertoire. Genomic distribution, intron-exon structure and structural motifs in the T. urticae UGTs were also described. In addition, expression profiling after host-plant shifts and in acaricide resistant lines supported an important role for UGT genes in xenobiotic metabolism. Expanded searches of UGTs in other arachnid species (Subphylum Chelicerata), including a spider, a scorpion, two ticks and two predatory mites, unexpectedly revealed the complete absence of UGT genes. However, a centipede (Subphylum Myriapoda) and a water flea and a crayfish (Subphylum Crustacea) contain UGT genes in their genomes similar to insect UGTs, suggesting that the UGT gene family might have been lost early in the Chelicerata lineage and subsequently re-gained in the tetranychid mites. Sequence similarity of T. urticae UGTs and bacterial UGTs and their phylogenetic reconstruction suggest that spider mites acquired UGT genes from bacteria by horizontal gene transfer. Our findings show a unique evolutionary history of the T. urticae UGT gene family among other arthropods and provide important clues to its functions in relation to detoxification and thereby host

  19. Identification of the binding sites of regulatory proteins in bacterial genomes

    PubMed Central

    Li, Hao; Rhodius, Virgil; Gross, Carol; Siggia, Eric D.

    2002-01-01

    We present an algorithm that extracts the binding sites (represented by position-specific weight matrices) for many different transcription factors from the regulatory regions of a genome, without the need for delineating groups of coregulated genes. The algorithm uses the fact that many DNA-binding proteins in bacteria bind to a bipartite motif with two short segments more conserved than the intervening region. It identifies all statistically significant patterns of the form W1NxW2, where W1 and W2 are two short oligonucleotides separated by x arbitrary bases, and groups them into clusters of similar patterns. These clusters are then used to derive quantitative recognition profiles of putative regulatory proteins. For a given cluster, the algorithm finds the matching sequences plus the flanking regions in the genome and performs a multiple sequence alignment to derive position-specific weight matrices. We have analyzed the Escherichia coli genome with this algorithm and found ≈1,500 significant patterns, which give rise to ≈160 distinct position-specific weight matrices. A fraction of these matrices match the binding sites of one-third of the ≈60 characterized transcription factors with high statistical significance. Many of the remaining matrices are likely to describe binding sites and regulons of uncharacterized transcription factors. The significance of these matrices was evaluated by their specificity, the location of the predicted sites, and the biological functions of the corresponding regulons, allowing us to suggest putative regulatory functions. The algorithm is efficient for analyzing newly sequenced bacterial genomes for which little is known about transcriptional regulation. PMID:12181488
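    The enumeration step described in this abstract, collecting patterns of the form W1NxW2, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name, the word length, and the spacer range are assumptions chosen for illustration, and the statistical significance test, clustering, and alignment steps are not shown.

```python
from collections import Counter


def count_dimer_patterns(seq, word_len=3, spacers=range(0, 4)):
    """Count every W1-N{x}-W2 pattern in seq.

    W1 and W2 are words of length word_len (an assumed value), separated
    by x arbitrary bases, for each x in spacers (an assumed range).
    Returns a Counter keyed by (W1, x, W2).
    """
    seq = seq.upper()
    counts = Counter()
    for x in spacers:
        span = 2 * word_len + x  # total length covered by W1, spacer, W2
        for i in range(len(seq) - span + 1):
            w1 = seq[i:i + word_len]
            w2 = seq[i + word_len + x:i + span]
            counts[(w1, x, w2)] += 1
    return counts


if __name__ == "__main__":
    # Toy sequence, not real genomic data.
    patterns = count_dimer_patterns("ACGTTACG", word_len=2, spacers=[1])
    for key, n in sorted(patterns.items()):
        print(key, n)
```

    In a real analysis the observed counts would then be compared against an expected background frequency to decide which patterns are statistically significant; that comparison is deliberately left out here.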

  20. Genome scale metabolic modeling of the riboflavin overproducer Ashbya gossypii.

    PubMed

    Ledesma-Amaro, Rodrigo; Kerkhoven, Eduard J; Revuelta, José Luis; Nielsen, Jens

    2014-06-01

    Ashbya gossypii is a filamentous fungus that naturally overproduces riboflavin, or vitamin B2. Advances in genetic and metabolic engineering of A. gossypii have permitted the switch from industrial chemical synthesis to the current biotechnological production of this vitamin. Additionally, A. gossypii is a model organism with one of the smallest eukaryote genomes being phylogenetically close to Saccharomyces cerevisiae. It has therefore been used to study evolutionary aspects of bakers' yeast. We here reconstructed the first genome scale metabolic model of A. gossypii, iRL766. The model was validated by biomass growth, riboflavin production and substrate utilization predictions. Gene essentiality analysis of the A. gossypii model in comparison with the S. cerevisiae model demonstrated how the whole-genome duplication event that separates the two species has led to an even spread of paralogs among all metabolic pathways. Additionally, iRL766 was used to integrate transcriptomics data from two different growth stages of A. gossypii, comparing exponential growth to riboflavin production stages. Both reporter metabolite analysis and in silico identification of transcriptionally regulated enzymes demonstrated the important involvement of beta-oxidation and the glyoxylate cycle in riboflavin production. PMID:24374726

  1. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    PubMed

    Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools. PMID:25110940

  2. Predicting effects of structural stress in a genome-reduced model bacterial metabolism

    NASA Astrophysics Data System (ADS)

    Güell, Oriol; Sagués, Francesc; Serrano, M. Ángeles

    2012-08-01

    Mycoplasma pneumoniae is a human pathogen recently proposed as a genome-reduced model for bacterial systems biology. Here, we study the response of its metabolic network to different forms of structural stress, including removal of individual and pairs of reactions and knockout of genes and clusters of co-expressed genes. Our results reveal a network architecture as robust as that of other model bacteria regarding multiple failures, although less robust against individual reaction inactivation. Interestingly, metabolite motifs associated to reactions can predict the propagation of inactivation cascades and damage amplification effects arising in double knockouts. We also detect a significant correlation between gene essentiality and damages produced by single gene knockouts, and find that genes controlling high-damage reactions tend to be expressed independently of each other, a functional switch mechanism that, simultaneously, acts as a genetic firewall to protect metabolism. Prediction of failure propagation is crucial for metabolic engineering or disease treatment.

  3. Engineering the largest RNA virus genome as an infectious bacterial artificial chromosome

    PubMed Central

    Almazán, Fernando; González, José M.; Pénzes, Zoltan; Izeta, Ander; Calvo, Enrique; Plana-Durán, Juan; Enjuanes, Luis

    2000-01-01

    The construction of cDNA clones encoding large-size RNA molecules of biological interest, like coronavirus genomes, which are among the largest mature RNA molecules known to biology, has been hampered by the instability of those cDNAs in bacteria. Herein, we show that the application of two strategies, cloning of the cDNAs into a bacterial artificial chromosome and nuclear expression of RNAs that are typically produced within the cytoplasm, is useful for the engineering of large RNA molecules. A cDNA encoding an infectious coronavirus RNA genome has been cloned as a bacterial artificial chromosome. The rescued coronavirus conserved all of the genetic markers introduced throughout the sequence and showed a standard mRNA pattern and the antigenic characteristics expected for the synthetic virus. The cDNA was transcribed within the nucleus, and the RNA translocated to the cytoplasm. Interestingly, the recovered virus had essentially the same sequence as the original one, and no splicing was observed. The cDNA was derived from an attenuated isolate that replicates exclusively in the respiratory tract of swine. During the engineering of the infectious cDNA, the spike gene of the virus was replaced by the spike gene of an enteric isolate. The synthetic virus replicated abundantly in the enteric tract and was fully virulent, demonstrating that the tropism and virulence of the recovered coronavirus can be modified. This demonstration opens up the possibility of employing this infectious cDNA as a vector for vaccine development in human, porcine, canine, and feline species susceptible to group 1 coronaviruses. PMID:10805807

  4. Genome-scale thermodynamic analysis of Escherichia coli metabolism.

    PubMed

    Henry, Christopher S; Jankowski, Matthew D; Broadbelt, Linda J; Hatzimanikatis, Vassily

    2006-02-15

    Genome-scale metabolic models are an invaluable tool for analyzing metabolic systems as they provide a more complete picture of the processes of metabolism. We have constructed a genome-scale metabolic model of Escherichia coli based on the iJR904 model developed by the Palsson Laboratory at the University of California at San Diego. Group contribution methods were utilized to estimate the standard Gibbs free energy change of every reaction in the constructed model. Reactions in the model were classified based on the activity of the reactions during optimal growth on glucose in aerobic media. The most thermodynamically unfavorable reactions involved in the production of biomass in E. coli were identified as ATP phosphoribosyltransferase, ATP synthase, methylene-tetra-hydrofolate dehydrogenase, and tryptophanase. The effect of a knockout of these reactions on the production of biomass and the production of individual biomass precursors was analyzed. Changes in the distribution of fluxes in the cell after knockout of these unfavorable reactions were also studied. The methodologies and results discussed can be used to facilitate the refinement of the feasible ranges for cellular parameters such as species concentrations and reaction rate constants. PMID:16299075

  5. Current state of genome-scale modeling in filamentous fungi.

    PubMed

    Brandl, Julian; Andersen, Mikael R

    2015-06-01

    The group of filamentous fungi contains important species used in industrial biotechnology for acid, antibiotics and enzyme production. Their unique lifestyle turns these organisms into a valuable genetic reservoir of new natural products and biomass degrading enzymes that has not been used to full capacity. One of the major bottlenecks in the development of new strains into viable industrial hosts is the alteration of the metabolism towards optimal production. Genome-scale models promise a reduction in the time needed for metabolic engineering by predicting the most potent targets in silico before testing them in vivo. The increasing availability of high quality models and molecular biological tools for manipulating filamentous fungi renders the model-guided engineering of these fungal factories possible with comprehensive metabolic networks. A typical fungal model contains on average 1138 unique metabolic reactions and 1050 ORFs, making them a vast knowledge-base of fungal metabolism. In the present review we focus on the current state as well as potential future applications of genome-scale models in filamentous fungi. PMID:25700817

  6. Flux Coupling Analysis of Genome-Scale Metabolic Network Reconstructions

    PubMed Central

    Burgard, Anthony P.; Nikolaev, Evgeni V.; Schilling, Christophe H.; Maranas, Costas D.

    2004-01-01

    In this paper, we introduce the Flux Coupling Finder (FCF) framework for elucidating the topological and flux connectivity features of genome-scale metabolic networks. The framework is demonstrated on genome-scale metabolic reconstructions of Helicobacter pylori, Escherichia coli, and Saccharomyces cerevisiae. The analysis allows one to determine whether any two metabolic fluxes, v1 and v2, are (1) directionally coupled, if a non-zero flux for v1 implies a non-zero flux for v2 but not necessarily the reverse; (2) partially coupled, if a non-zero flux for v1 implies a non-zero, though variable, flux for v2 and vice versa; or (3) fully coupled, if a non-zero flux for v1 implies not only a non-zero but also a fixed flux for v2 and vice versa. Flux coupling analysis also enables the global identification of blocked reactions, which are all reactions incapable of carrying flux under a certain condition; equivalent knockouts, defined as the set of all possible reactions whose deletion forces the flux through a particular reaction to zero; and sets of affected reactions denoting all reactions whose fluxes are forced to zero if a particular reaction is deleted. The FCF approach thus provides a novel and versatile tool for aiding metabolic reconstructions and guiding genetic manipulations. PMID:14718379
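    The three coupling categories defined in this abstract can be expressed in terms of bounds on the flux ratio v1/v2. The Python sketch below is an illustrative reading of those definitions, not the FCF code itself: it assumes non-negative fluxes and assumes the minimum and maximum of v1/v2 have already been obtained elsewhere (for example by linear programming, which is not shown). The function name and tolerance are hypothetical.

```python
import math


def classify_coupling(r_min, r_max, tol=1e-9):
    """Classify two fluxes from bounds r_min <= v1/v2 <= r_max.

    Assumes non-negative fluxes and precomputed ratio bounds; r_max may be
    math.inf when the ratio is unbounded.
    """
    lower_is_zero = r_min <= tol         # v1 can be zero while v2 carries flux
    upper_is_inf = math.isinf(r_max)     # v2 can be zero while v1 carries flux
    if lower_is_zero and upper_is_inf:
        return "uncoupled"
    if lower_is_zero:
        # Finite upper bound: non-zero v1 implies non-zero v2, not the reverse.
        return "directionally coupled (v1 -> v2)"
    if upper_is_inf:
        # Positive lower bound: non-zero v2 implies non-zero v1, not the reverse.
        return "directionally coupled (v2 -> v1)"
    if abs(r_max - r_min) <= tol:
        return "fully coupled"           # ratio is fixed
    return "partially coupled"           # ratio is bounded but variable


if __name__ == "__main__":
    print(classify_coupling(0.0, 2.0))
    print(classify_coupling(0.5, 2.0))
    print(classify_coupling(1.5, 1.5))
```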

  7. Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

    SciTech Connect

    Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

    2006-03-01

    The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that are recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently-transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to investigate the generality of this pattern, we have used position weight matrices describing the -35 and -10 promoter boxes of E. coli to search for these motifs in 43 additional genomes belonging to most established bacterial phyla, after specific calibration of the matrices according to the base composition of the noncoding regions of each genome. We have found that all bacterial species analyzed contain similar promoter-like motifs, and that, in most cases, these motifs follow the same genomic distribution observed in E. coli. Differential densities between regulatory and nonregulatory regions are detectable in most bacterial genomes, with the exception of those that have experienced evolutionary extreme genome reduction. Thus, the phylogenetic distribution of this pattern mirrors that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is the outcome of a process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential

  8. Complete Genome Sequence of Lactobacillus rhamnosus Strain BPL5 (CECT 8800), a Probiotic for Treatment of Bacterial Vaginosis.

    PubMed

    Chenoll, Empar; Codoñer, Francisco M; Martinez-Blanch, Juan F; Ramón, Daniel; Genovés, Salvador; Menabrito, Marco

    2016-01-01

    Lactobacillus rhamnosus BPL5 (CECT 8800) is a probiotic strain suitable for the treatment of bacterial vaginosis. Here, we report its complete genome sequence deciphered by PacBio single-molecule real-time (SMRT) technology. Analysis of the sequence may provide insight into its functional activity. PMID:27103719

  9. Complete Genome Sequence of Lactobacillus rhamnosus Strain BPL5 (CECT 8800), a Probiotic for Treatment of Bacterial Vaginosis

    PubMed Central

    Codoñer, Francisco M.; Martinez-Blanch, Juan F.; Ramón, Daniel; Menabrito, Marco

    2016-01-01

    Lactobacillus rhamnosus BPL5 (CECT 8800) is a probiotic strain suitable for the treatment of bacterial vaginosis. Here, we report its complete genome sequence deciphered by PacBio single-molecule real-time (SMRT) technology. Analysis of the sequence may provide insight into its functional activity. PMID:27103719

  10. Complete Genome Sequence of Gluconacetobacter hansenii Strain NQ5 (ATCC 53582), an Efficient Producer of Bacterial Cellulose

    PubMed Central

    Pfeffer, Sarah; Mehta, Kalpa

    2016-01-01

    This study reports the release of the complete nucleotide sequence of Gluconacetobacter hansenii strain NQ5 (ATCC 53582). This strain was isolated by R. Malcolm Brown, Jr. in a sugar mill in North Queensland, Australia, and is an efficient producer of bacterial cellulose. The elucidation of the genome will contribute to the study of the molecular mechanisms necessary for cellulose biosynthesis. PMID:27516505

  11. Complete Genome Sequence of Gluconacetobacter hansenii Strain NQ5 (ATCC 53582), an Efficient Producer of Bacterial Cellulose.

    PubMed

    Pfeffer, Sarah; Mehta, Kalpa; Brown, R Malcolm

    2016-01-01

    This study reports the release of the complete nucleotide sequence of Gluconacetobacter hansenii strain NQ5 (ATCC 53582). This strain was isolated by R. Malcolm Brown, Jr. in a sugar mill in North Queensland, Australia, and is an efficient producer of bacterial cellulose. The elucidation of the genome will contribute to the study of the molecular mechanisms necessary for cellulose biosynthesis. PMID:27516505

  12. A peptide identification-free, genome sequence-independent shotgun proteomics workflow for strain-level bacterial differentiation

    PubMed Central

    Shao, Wenguang; Zhang, Min; Lam, Henry; Lau, Stanley C. K.

    2015-01-01

    Shotgun proteomics is an emerging tool for bacterial identification and differentiation. However, the identification of the mass spectra of peptides to genome-derived peptide sequences remains a key issue that limits the use of shotgun proteomics to bacteria with genome sequences available. In this proof-of-concept study, we report a novel bacterial fingerprinting method that enjoys the resolving power and accuracy of mass spectrometry without the burden of peptide identification (i.e. genome sequence-independent). This method uses a similarity-clustering algorithm to search for mass spectra that are derived from the same peptide and merge them into a unique consensus spectrum as the basis to generate proteomic fingerprints of bacterial isolates. In comparison to a traditional peptide identification-based shotgun proteomics workflow and a PCR-based DNA fingerprinting method targeting the repetitive extragenic palindromes elements in bacterial genomes, the novel method generated fingerprints that were richer in information and more discriminative in differentiating E. coli isolates by their animal sources. The novel method is readily deployable to any cultivable bacteria, and may be used for several fields of study such as environmental microbiology, applied microbiology, and clinical microbiology. PMID:26395646

  13. A peptide identification-free, genome sequence-independent shotgun proteomics workflow for strain-level bacterial differentiation.

    PubMed

    Shao, Wenguang; Zhang, Min; Lam, Henry; Lau, Stanley C K

    2015-01-01

    Shotgun proteomics is an emerging tool for bacterial identification and differentiation. However, the identification of the mass spectra of peptides to genome-derived peptide sequences remains a key issue that limits the use of shotgun proteomics to bacteria with genome sequences available. In this proof-of-concept study, we report a novel bacterial fingerprinting method that enjoys the resolving power and accuracy of mass spectrometry without the burden of peptide identification (i.e. genome sequence-independent). This method uses a similarity-clustering algorithm to search for mass spectra that are derived from the same peptide and merge them into a unique consensus spectrum as the basis to generate proteomic fingerprints of bacterial isolates. In comparison to a traditional peptide identification-based shotgun proteomics workflow and a PCR-based DNA fingerprinting method targeting the repetitive extragenic palindromes elements in bacterial genomes, the novel method generated fingerprints that were richer in information and more discriminative in differentiating E. coli isolates by their animal sources. The novel method is readily deployable to any cultivable bacteria, and may be used for several fields of study such as environmental microbiology, applied microbiology, and clinical microbiology. PMID:26395646

  14. The ClinSeq Project: Piloting large-scale genome sequencing for research in genomic medicine

    PubMed Central

    Biesecker, Leslie G.; Mullikin, James C.; Facio, Flavia M.; Turner, Clesson; Cherukuri, Praveen F.; Blakesley, Robert W.; Bouffard, Gerard G.; Chines, Peter S.; Cruz, Pedro; Hansen, Nancy F.; Teer, Jamie K.; Maskeri, Baishali; Young, Alice C.; Manolio, Teri A.; Wilson, Alexander F.; Finkel, Toren; Hwang, Paul; Arai, Andrew; Remaley, Alan T.; Sachdev, Vandana; Shamburek, Robert; Cannon, Richard O.; Green, Eric D.

    2009-01-01

    ClinSeq is a pilot project to investigate the use of whole-genome sequencing as a tool for clinical research. By piloting the acquisition of large amounts of DNA sequence data from individual human subjects, we are fostering the development of hypothesis-generating approaches for performing research in genomic medicine, including the exploration of issues related to the genetic architecture of disease, implementation of genomic technology, informed consent, disclosure of genetic information, and archiving, analyzing, and displaying sequence data. In the initial phase of ClinSeq, we are enrolling roughly 1000 participants; the evaluation of each includes obtaining a detailed family and medical history, as well as a clinical evaluation. The participants are being consented broadly for research on many traits and for whole-genome sequencing. Initially, Sanger-based sequencing of 300–400 genes thought to be relevant to atherosclerosis is being performed, with the resulting data analyzed for rare, high-penetrance variants associated with specific clinical traits. The participants are also being consented to allow the contact of family members for additional studies of sequence variants to explore their potential association with specific phenotypes. Here, we present the general considerations in designing ClinSeq, preliminary results based on the generation of an initial 826 Mb of sequence data, the findings for several genes that serve as positive controls for the project, and our views about the potential implications of ClinSeq. The early experiences with ClinSeq illustrate how large-scale medical sequencing can be a practical, productive, and critical component of research in genomic medicine. PMID:19602640

  15. Limitations to estimating bacterial cross-species transmission using genetic and genomic markers: inferences from simulation modeling

    USGS Publications Warehouse

    Benavides, Julio Andre; Cross, Paul C.; Luikart, Gordon; Creel, Scott

    2014-01-01

    Cross-species transmission (CST) of bacterial pathogens has major implications for human health, livestock, and wildlife management because it determines whether control actions in one species may have subsequent effects on other potential host species. The study of bacterial transmission has benefitted from methods measuring two types of genetic variation: variable number of tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs). However, it is unclear whether these data can distinguish between different epidemiological scenarios. We used a simulation model with two host species and known transmission rates (within and between species) to evaluate the utility of these markers for inferring CST. We found that CST estimates are biased for a wide range of parameters when based on VNTRs and a most parsimonious reconstructed phylogeny. However, estimations of CST rates lower than 5% can be achieved with relatively low bias using as low as 250 SNPs. CST estimates are sensitive to several parameters, including the number of mutations accumulated since introduction, stochasticity, the genetic difference of strains introduced, and the sampling effort. Our results suggest that, even with whole-genome sequences, unbiased estimates of CST will be difficult when sampling is limited, mutation rates are low, or for pathogens that were recently introduced.

  16. Application of Whole Genome Expression Analysis to Assess Bacterial Responses to Environmental Conditions

    NASA Astrophysics Data System (ADS)

    Vukanti, R. V.; Mintz, E. M.; Leff, L. G.

    2005-05-01

    Bacterial responses to environmental signals are multifactorial and are coupled to changes in gene expression. An understanding of bacterial responses to environmental conditions is possible using microarray expression analysis. In this study, the utility of microarrays for examining changes in gene expression in Escherichia coli under different environmental conditions was assessed. RNA was isolated, hybridized to Affymetrix E. coli Genome 2.0 chips and analyzed using Affymetrix GCOS and Genespring software. Major limiting factors were obtaining enough quality RNA (10^7-10^8 cells to get 10 μg RNA) and accounting for differences in growth rates under different conditions. Stabilization of RNA prior to isolation and taking extreme precautions while handling RNA were crucial. In addition, use of this method in ecological studies is limited by availability and cost of commercial arrays, choice of primers for cDNA synthesis, reproducibility, complexity of results generated, and need to validate findings. This method may be more widely applicable with the development of better approaches for RNA recovery from environmental samples and increased number of available strain-specific arrays. Diligent experimental design and verification of results with real-time PCR or northern blots are needed. Overall, there is a great potential for use of this technology to discover mechanisms underlying organisms' responses to environmental conditions.

  17. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals

    PubMed Central

    Damienikan, Aliaksandr U.

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541

  18. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals.

    PubMed

    Nikolaichik, Yevgeny; Damienikan, Aliaksandr U

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a 'gene by gene' approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn't fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541

  19. Distinct soil bacterial communities along a small-scale elevational gradient in alpine tundra

    PubMed Central

    Shen, Congcong; Ni, Yingying; Liang, Wenju; Wang, Jianjun; Chu, Haiyan

    2015-01-01

    The elevational diversity pattern for microorganisms has received great attention recently but is still understudied, and phylogenetic relatedness is rarely studied for microbial elevational distributions. Using a bar-coded pyrosequencing technique, we examined the biodiversity patterns for soil bacterial communities of tundra ecosystem along 2000–2500 m elevations on Changbai Mountain in China. Bacterial taxonomic richness displayed a linear decreasing trend with increasing elevation. Phylogenetic diversity and mean nearest taxon distance (MNTD) exhibited a unimodal pattern with elevation. Bacterial communities were more phylogenetically clustered than expected by chance at all elevations based on the standardized effect size of MNTD metric. The bacterial communities differed dramatically among elevations, and the community composition was significantly correlated with soil total carbon (TC), total nitrogen, C:N ratio, and dissolved organic carbon. Multiple ordinary least squares regression analysis showed that the observed biodiversity patterns strongly correlated with soil TC and C:N ratio. Taken together, this is the first time that a significant bacterial diversity pattern has been observed across a small-scale elevational gradient. Our results indicated that soil carbon and nitrogen contents were the critical environmental factors affecting bacterial elevational distribution in Changbai Mountain tundra. This suggested that ecological niche-based environmental filtering processes related to soil carbon and nitrogen contents could play a dominant role in structuring bacterial communities along the elevational gradient. PMID:26217308
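    The mean nearest taxon distance (MNTD) and its standardized effect size mentioned in this abstract can be sketched in a few lines of Python. This is an unweighted illustration under stated assumptions, not the authors' pipeline: the distance matrix is a toy example, the null values would normally come from many randomizations of the phylogeny or community (not shown), and the use of the sample standard deviation is an assumption.

```python
import statistics


def mntd(dist):
    """Mean nearest taxon distance for one community.

    dist is a symmetric matrix (list of lists) of pairwise phylogenetic
    distances among the taxa present; abundances are ignored here.
    """
    n = len(dist)
    if n < 2:
        raise ValueError("MNTD needs at least two taxa")
    nearest = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]
    return sum(nearest) / n


def standardized_effect_size(observed, null_values):
    """(observed - mean of null) / standard deviation of null.

    Negative values indicate taxa that are more closely related than the
    null expectation, i.e. phylogenetic clustering.
    """
    return (observed - statistics.mean(null_values)) / statistics.stdev(null_values)


if __name__ == "__main__":
    toy = [[0, 2, 6],
           [2, 0, 4],
           [6, 4, 0]]
    print(mntd(toy))
    print(standardized_effect_size(1.0, [2.0, 3.0, 4.0]))
```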

  20. Dynamic bacterial communities on reverse-osmosis membranes in a full-scale desalination plant.

    PubMed

    Manes, C-L de O; West, N; Rapenne, S; Lebaron, P

    2011-01-01

    To better understand biofouling of seawater reverse osmosis (SWRO) membranes, bacterial diversity was characterized in the intake water, in subsequently pretreated water and on SWRO membranes from a full-scale desalination plant (FSDP) during a 9 month period. 16S rRNA gene fingerprinting and sequencing revealed that bacterial communities in the water samples and on the SWRO membranes were very different. For the different sampling dates, the bacterial diversity of the active and the total bacterial fractions of the water samples remained relatively stable over the sampling period whereas the bacterial community structure on the four SWRO membrane samples was significantly different. The richness and evenness of the SWRO membrane bacterial communities increased with usage time, with the Shannon diversity index rising from 2.2 to 3.7. In the oldest SWRO membrane (330 days), no single operational taxonomic unit (OTU) dominated and the majority of the OTUs fell into the Alphaproteobacteria or the Planctomycetes. In striking contrast, a Betaproteobacteria OTU affiliated to the genus Ideonella was dominant and exclusively found in the membrane used for the shortest time (10 days). This suggests that bacteria belonging to this genus could be one of the primary colonizers of the SWRO membrane. Knowledge of the dominant bacterial species on SWRO membranes and their dynamics should help guide culture studies for physiological characterization of biofilm forming species. PMID:21108068
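
For readers unfamiliar with the Shannon diversity index and evenness cited in the abstract above, the following is a minimal sketch of how both are commonly computed from OTU counts. The counts are invented for illustration, and the use of the natural logarithm is an assumption, since the abstract does not state the base.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Pielou evenness J' = H' / ln(S), where S is the number of observed taxa."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0
    return shannon_index(counts) / math.log(richness)

# Hypothetical OTU counts, not data from the study.
dominated_community = [90, 5, 3, 2]   # one OTU dominates
even_community = [25, 25, 25, 25]     # no single OTU dominates

h_dominated = shannon_index(dominated_community)
h_even = shannon_index(even_community)
```

A community dominated by one OTU yields a lower index than an even community with the same number of OTUs, which is the direction of change the abstract describes between the youngest and oldest membranes.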

  1. Sodium Ion Cycle in Bacterial Pathogens: Evidence from Cross-Genome Comparisons

    PubMed Central

    Häse, Claudia C.; Fedorova, Natalie D.; Galperin, Michael Y.; Dibrov, Pavel A.

    2001-01-01

    Analysis of the bacterial genome sequences shows that many human and animal pathogens encode primary membrane Na+ pumps, Na+-transporting dicarboxylate decarboxylases or Na+-translocating NADH:ubiquinone oxidoreductase, and a number of Na+-dependent permeases. This indicates that these bacteria can utilize Na+ as a coupling ion instead of or in addition to the H+ cycle. This capability to use a Na+ cycle might be an important virulence factor for such pathogens as Vibrio cholerae, Neisseria meningitidis, Salmonella enterica serovar Typhi, and Yersinia pestis. In Treponema pallidum, Chlamydia trachomatis, and Chlamydia pneumoniae, the Na+ gradient may well be the only energy source for secondary transport. A survey of preliminary genome sequences of Porphyromonas gingivalis, Actinobacillus actinomycetemcomitans, and Treponema denticola indicates that these oral pathogens also rely on the Na+ cycle for at least part of their energy metabolism. The possible roles of the Na+ cycling in the energy metabolism and pathogenicity of these organisms are reviewed. The recent discovery of an effective natural antibiotic, korormicin, targeted against the Na+-translocating NADH:ubiquinone oxidoreductase, suggests a potential use of Na+ pumps as drug targets and/or vaccine candidates. The antimicrobial potential of other inhibitors of the Na+ cycle, such as monensin, Li+ and Ag+ ions, and amiloride derivatives, is discussed. PMID:11528000

  2. Spatial scales of bacterial community diversity at cold seeps (Eastern Mediterranean Sea).

    PubMed

    Pop Ristova, Petra; Wenzhöfer, Frank; Ramette, Alban; Felden, Janine; Boetius, Antje

    2015-06-01

    Cold seeps are highly productive, fragmented marine ecosystems that form at the seafloor around hydrocarbon emission pathways. The products of microbial utilization of methane and other hydrocarbons fuel rich chemosynthetic communities at these sites, with much higher respiration rates compared with the surrounding deep-sea floor. Yet little is known as to the richness, composition and spatial scaling of bacterial communities of cold seeps compared with non-seep communities. Here we assessed the bacterial diversity across nine different cold seeps in the Eastern Mediterranean deep-sea and surrounding seafloor areas. Community similarity analyses were carried out based on automated ribosomal intergenic spacer analysis (ARISA) fingerprinting and high-throughput 454 tag sequencing and were combined with in situ and ex situ geochemical analyses across spatial scales of a few tens of meters to hundreds of kilometers. Seep communities were dominated by Deltaproteobacteria, Epsilonproteobacteria and Gammaproteobacteria and shared, on average, 36% of bacterial types (ARISA OTUs (operational taxonomic units)) with communities from nearby non-seep deep-sea sediments. Bacterial communities of seeps were significantly different from those of non-seep sediments. Within cold seep regions on spatial scales of only tens to hundreds of meters, the bacterial communities differed considerably, sharing <50% of types at the ARISA OTU level. Their variations reflected differences in porewater sulfide concentrations from anaerobic degradation of hydrocarbons. This study shows that cold seep ecosystems contribute substantially to the microbial diversity of the deep-sea. PMID:25500510

  3. Spatial scales of bacterial community diversity at cold seeps (Eastern Mediterranean Sea)

    PubMed Central

    Pop Ristova, Petra; Wenzhöfer, Frank; Ramette, Alban; Felden, Janine; Boetius, Antje

    2015-01-01

    Cold seeps are highly productive, fragmented marine ecosystems that form at the seafloor around hydrocarbon emission pathways. The products of microbial utilization of methane and other hydrocarbons fuel rich chemosynthetic communities at these sites, with much higher respiration rates compared with the surrounding deep-sea floor. Yet little is known as to the richness, composition and spatial scaling of bacterial communities of cold seeps compared with non-seep communities. Here we assessed the bacterial diversity across nine different cold seeps in the Eastern Mediterranean deep-sea and surrounding seafloor areas. Community similarity analyses were carried out based on automated ribosomal intergenic spacer analysis (ARISA) fingerprinting and high-throughput 454 tag sequencing and were combined with in situ and ex situ geochemical analyses across spatial scales of a few tens of meters to hundreds of kilometers. Seep communities were dominated by Deltaproteobacteria, Epsilonproteobacteria and Gammaproteobacteria and shared, on average, 36% of bacterial types (ARISA OTUs (operational taxonomic units)) with communities from nearby non-seep deep-sea sediments. Bacterial communities of seeps were significantly different from those of non-seep sediments. Within cold seep regions on spatial scales of only tens to hundreds of meters, the bacterial communities differed considerably, sharing <50% of types at the ARISA OTU level. Their variations reflected differences in porewater sulfide concentrations from anaerobic degradation of hydrocarbons. This study shows that cold seep ecosystems contribute substantially to the microbial diversity of the deep-sea. PMID:25500510

  4. Repetitive genome elements in a European corn borer, Ostrinia nubilalis, bacterial artificial chromosome library were indicated by bacterial artificial chromosome end sequencing and development of sequence tag site markers: implications for lepidopteran genomic research.

    PubMed

    Coates, Brad S; Sumerford, Douglas V; Hellmich, Richard L; Lewis, Leslie C

    2009-01-01

    The European corn borer, Ostrinia nubilalis, is a serious pest of food, fiber, and biofuel crops in Europe, North America, and Asia and a model system for insect olfaction and speciation. A bacterial artificial chromosome library constructed for O. nubilalis contains 36 864 clones with an estimated average insert size of ≥120 kb and genome coverage of 8.8-fold. Screening OnB1 clones comprising approximately 2.76 genome equivalents determined the physical position of 24 sequence tag site markers, including markers linked to ecologically important and Bacillus thuringiensis toxin resistance traits. OnB1 bacterial artificial chromosome end sequence reads (GenBank dbGSS accessions ET217010 to ET217273) showed homology to annotated genes or expressed sequence tags and identified repetitive genome elements, O. nubilalis miniature subterminal inverted repeat transposable elements (OnMITE01 and OnMITE02), and ezi-like long interspersed nuclear elements. Mobility of OnMITE01 was demonstrated by the presence or absence in O. nubilalis of introns at two different loci. A (GTCT)n tetranucleotide repeat at the 5' ends of OnMITE01 and OnMITE02 is evidence for transposon-mediated movement of lepidopteran microsatellite loci. The number of repetitive elements in lepidopteran genomes will affect genome assembly and marker development. Single-locus sequence tag site markers described here have downstream application for integration within linkage maps and comparative genomic studies. PMID:19132072
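
The library statistics in the abstract above are tied together by simple arithmetic: fold coverage is the number of clones times the average insert size, divided by the genome size. The sketch below works that relation backwards to the genome size implied by the reported figures. The genome size is not stated in the abstract, so the value computed here is only an inference, and because the insert size is given as a lower bound the result is approximate. The function names are illustrative, and the locus-representation estimate uses the standard Clarke-Carbon formula rather than anything described in the record.

```python
def library_coverage(n_clones, insert_kb, genome_kb):
    """Fold coverage = total cloned DNA / genome size."""
    return n_clones * insert_kb / genome_kb

def implied_genome_kb(n_clones, insert_kb, coverage):
    """Genome size implied by a reported clone count, insert size and coverage."""
    return n_clones * insert_kb / coverage

def locus_probability(n_clones, insert_kb, genome_kb):
    """Clarke-Carbon estimate of the chance that a given locus is present in
    at least one clone: P = 1 - (1 - insert/genome)^N."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones

# Figures reported in the abstract (insert size taken at its lower bound).
N_CLONES = 36864
INSERT_KB = 120.0
COVERAGE = 8.8

genome_kb = implied_genome_kb(N_CLONES, INSERT_KB, COVERAGE)  # about 503,000 kb
p_locus = locus_probability(N_CLONES, INSERT_KB, genome_kb)
```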

  5. Bacterial Artificial Chromosomes: A Functional Genomics Tool for the Study of Positive-strand RNA Viruses

    PubMed Central

    Yun, Sang-Im; Song, Byung-Hak; Kim, Jin-Kyoung; Lee, Young-Min

    2015-01-01

    Reverse genetics, an approach to rescue infectious virus entirely from a cloned cDNA, has revolutionized the field of positive-strand RNA viruses, whose genomes have the same polarity as cellular mRNA. The cDNA-based reverse genetics system is a seminal method that enables direct manipulation of the viral genomic RNA, thereby generating recombinant viruses for molecular and genetic studies of both viral RNA elements and gene products in viral replication and pathogenesis. It also provides a valuable platform that allows the development of genetically defined vaccines and viral vectors for the delivery of foreign genes. For many positive-strand RNA viruses such as Japanese encephalitis virus (JEV), however, the cloned cDNAs are unstable, posing a major obstacle to the construction and propagation of the functional cDNA. Here, the present report describes the strategic considerations in creating and amplifying a genetically stable full-length infectious JEV cDNA as a bacterial artificial chromosome (BAC) using the following general experimental procedures: viral RNA isolation, cDNA synthesis, cDNA subcloning and modification, assembly of a full-length cDNA, cDNA linearization, in vitro RNA synthesis, and virus recovery. This protocol provides a general methodology applicable to cloning full-length cDNA for a range of positive-strand RNA viruses, particularly those with a genome of >10 kb in length, into a BAC vector, from which infectious RNAs can be transcribed in vitro with a bacteriophage RNA polymerase. PMID:26780115

  6. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report

    SciTech Connect

    Slezak, T; Borucki, M; Lam, M; Lenhoff, R; Vitalis, E

    2010-01-26

    Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.

  7. Bacterial Artificial Chromosomes: A Functional Genomics Tool for the Study of Positive-strand RNA Viruses.

    PubMed

    Yun, Sang-Im; Song, Byung-Hak; Kim, Jin-Kyoung; Lee, Young-Min

    2015-01-01

    Reverse genetics, an approach to rescue infectious virus entirely from a cloned cDNA, has revolutionized the field of positive-strand RNA viruses, whose genomes have the same polarity as cellular mRNA. The cDNA-based reverse genetics system is a seminal method that enables direct manipulation of the viral genomic RNA, thereby generating recombinant viruses for molecular and genetic studies of both viral RNA elements and gene products in viral replication and pathogenesis. It also provides a valuable platform that allows the development of genetically defined vaccines and viral vectors for the delivery of foreign genes. For many positive-strand RNA viruses such as Japanese encephalitis virus (JEV), however, the cloned cDNAs are unstable, posing a major obstacle to the construction and propagation of the functional cDNA. Here, the present report describes the strategic considerations in creating and amplifying a genetically stable full-length infectious JEV cDNA as a bacterial artificial chromosome (BAC) using the following general experimental procedures: viral RNA isolation, cDNA synthesis, cDNA subcloning and modification, assembly of a full-length cDNA, cDNA linearization, in vitro RNA synthesis, and virus recovery. This protocol provides a general methodology applicable to cloning full-length cDNA for a range of positive-strand RNA viruses, particularly those with a genome of >10 kb in length, into a BAC vector, from which infectious RNAs can be transcribed in vitro with a bacteriophage RNA polymerase. PMID:26780115

  8. Genome sequencing and systems biology analysis of a lipase-producing bacterial strain.

    PubMed

    Li, N; Li, D D; Zhang, Y Z; Yuan, Y Z; Geng, H; Xiong, L; Liu, D L

    2016-01-01

    Lipase-producing bacteria are naturally-occurring, industrially-relevant microorganisms that produce lipases, which can be used to synthesize biodiesel from waste oils. The efficiency of lipase expression varies between various microbial strains. Therefore, strains that can produce lipases with high efficiency must be screened, and the conditions of lipase metabolism and optimization of the production process in a given environment must be thoroughly studied. A high efficiency lipase-producing strain was isolated from the sediments of Jinsha River, identified by 16S rRNA sequence analysis as Serratia marcescens, and designated as HS-L5. A schematic diagram of the genome sequence was constructed by high-throughput genome sequencing. A series of genes related to lipid degradation were identified by functional gene annotation through sequence homology analysis. A genome-scale metabolic model of HS-ML5 was constructed using systems biology techniques. The model consisted of 1722 genes and 1567 metabolic reactions. The topological graph of the genome-scale metabolic model was compared to that of conventional metabolic pathways using a visualization software and KEGG database. The basic components and boundaries of the tributyrin degradation subnetwork were determined, and its flux balance analyzed using Matlab and COBRA Toolbox to simulate the effects of different conditions on the catalytic efficiency of lipases produced by HS-ML5. We proved that the catalytic activity of microbial lipases was closely related to the carbon metabolic pathway. As production and catalytic efficiency of lipases varied greatly with the environment, the catalytic efficiency and environmental adaptability of microbial lipases can be improved by proper control of the production conditions. PMID:27050954

  9. Evolutionary insights about bacterial GlxRS from whole genome analyses: is GluRS2 a chimera?

    PubMed Central

    2014-01-01

    Background Evolutionary histories of glutamyl-tRNA synthetase (GluRS) and glutaminyl-tRNA synthetase (GlnRS) in bacteria are convoluted. After the divergence of eubacteria and eukarya, bacterial GluRS glutamylated both tRNAGln and tRNAGlu until GlnRS appeared by horizontal gene transfer (HGT) from eukaryotes or a duplicate copy of GluRS (GluRS2) that only glutamylates tRNAGln appeared. The current understanding is based on limited sequence data and not always compatible with available experimental results. In particular, the origin of GluRS2 is poorly understood. Results A large database of bacterial GluRS, GlnRS, tRNAGln and the trimeric aminoacyl-tRNA-dependent amidotransferase (gatCAB), constructed from whole genomes by functionally annotating and classifying these enzymes according to their mutual presence and absence in the genome, was analyzed. Phylogenetic analyses showed that the catalytic and the anticodon-binding domains of functional GluRS2 (as in Helicobacter pylori) were independently acquired from evolutionarily distant hosts by HGT. Non-functional GluRS2 (as in Thermotoga maritima), on the other hand, was found to contain an anticodon-binding domain appended to a gene-duplicated catalytic domain. Several genomes were found to possess both GluRS2 and GlnRS, even though they share the common function of aminoacylating tRNAGln. GlnRS was widely distributed among bacterial phyla and although phylogenetic analyses confirmed the origin of most bacterial GlnRS to be through a single HGT from eukarya, many GlnRS sequences also appeared with evolutionarily distant phyla in phylogenetic tree. A GlnRS pseudogene could be identified in Sorangium cellulosum. Conclusions Our analysis broadens the current understanding of bacterial GlxRS evolution and highlights the idiosyncratic evolution of GluRS2. Specifically we show that: i) GluRS2 is a chimera of mismatching catalytic and anticodon-binding domains, ii) the appearance of GlnRS and GluRS2 in a single bacterial

  10. Metabolic modeling of endosymbiont genome reduction on a temporal scale.

    PubMed

    Yizhak, Keren; Tuller, Tamir; Papp, Balázs; Ruppin, Eytan

    2011-03-29

    A fundamental challenge in Systems Biology is whether a cell-scale metabolic model can predict patterns of genome evolution by realistically accounting for associated biochemical constraints. Here, we study the order in which genes are lost in an in silico evolutionary process, leading from the metabolic network of Escherichia coli to that of the endosymbiont Buchnera aphidicola. We examine how this order correlates with the order by which the genes were actually lost, as estimated from a phylogenetic reconstruction. By optimizing this correlation across the space of potential growth and biomass conditions, we compute an upper bound estimate on the model's prediction accuracy (R=0.54). The model's network-based predictive ability outperforms predictions obtained using genomic features of individual genes, reflecting the effect of selection imposed by metabolic stoichiometric constraints. Thus, while the timing of gene loss might be expected to be a completely stochastic evolutionary process, remarkably, we find that metabolic considerations, on their own, make a marked 40% contribution to determining when such losses occur. PMID:21451589

  11. Next-generation genome-scale models for metabolic engineering.

    PubMed

    King, Zachary A; Lloyd, Colton J; Feist, Adam M; Palsson, Bernhard O

    2015-12-01

    Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict optimal genetic modifications that improve the rate and yield of chemical production. A new generation of COBRA models and methods is now being developed, encompassing many biological processes and simulation strategies, and next-generation models enable new types of predictions. Here, three key examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering. PMID:25575024

  12. Identifying all moiety conservation laws in genome-scale metabolic networks.

    PubMed

    De Martino, Andrea; De Martino, Daniele; Mulet, Roberto; Pagnani, Andrea

    2014-01-01

    The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation. PMID:24988199
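
To make the notion of a conservation law concrete: a weight vector l over metabolites is conserved when l^T S = 0 for the stoichiometric matrix S, so that no reaction changes the weighted pool. The sketch below checks this condition on an invented four-metabolite toy network and enumerates 0/1 solutions by brute force. It is only an illustration of the definition under these assumptions; the exhaustive search is exponential in the number of metabolites and is not the efficient strategy the abstract refers to.

```python
from itertools import product

# Toy network (hypothetical). Rows are metabolites, columns are reactions.
#   R1: ATP + Glc -> ADP + G6P
#   R2: ADP -> ATP   (lumped regeneration; phosphate treated as external)
METABOLITES = ["ATP", "ADP", "Glc", "G6P"]
S = [
    [-1,  1],  # ATP
    [ 1, -1],  # ADP
    [-1,  0],  # Glc
    [ 1,  0],  # G6P
]

def is_conservation_law(stoich, weights):
    """Return True if weights^T * stoich is the zero vector."""
    n_reactions = len(stoich[0])
    return all(
        sum(weights[i] * stoich[i][j] for i in range(len(stoich))) == 0
        for j in range(n_reactions)
    )

def binary_conservation_laws(stoich):
    """Enumerate all non-zero 0/1 weight vectors that are conserved.
    Exponential in the number of metabolites, so suitable for toy cases only."""
    n = len(stoich)
    return [
        list(w) for w in product([0, 1], repeat=n)
        if any(w) and is_conservation_law(stoich, w)
    ]

laws = binary_conservation_laws(S)
```

In this toy case the adenylate pool (ATP + ADP) and the glucose moiety (Glc + G6P) are conserved, as is their sum, while ATP alone is not.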

  13. Identifying All Moiety Conservation Laws in Genome-Scale Metabolic Networks

    PubMed Central

    2014-01-01

    The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation. PMID:24988199

  14. Accomplishments in genome-scale in silico modeling for industrial and medical biotechnology

    PubMed Central

    Milne, Caroline B.; Kim, Pan-Jun; Eddy, James A.; Price, Nathan D.

    2011-01-01

    Driven by advancements in high-throughput biological technologies and the growing number of sequenced genomes, the construction of in silico models at the genome scale has provided powerful tools to investigate a vast array of biological systems and applications. Here, we review comprehensively the uses of such models in industrial and medical biotechnology, including biofuel generation, food production, and drug development. While the use of in silico models is still in its early stages for delivering to industry, significant initial successes have been achieved. For the cases presented here, genome-scale models predict engineering strategies to enhance properties of interest in an organism or to inhibit harmful mechanisms of pathogens or in disease. Going forward, genome-scale in silico models promise to extend their application and analysis scope to become a transformative tool in biotechnology. As such, genome-scale models can provide a basis for rational genome-scale engineering and synthetic biology. PMID:19946878

  15. Strategies used for genetically modifying bacterial genome: site-directed mutagenesis, gene inactivation, and gene over-expression.

    PubMed

    Xu, Jian-zhong; Zhang, Wei-guo

    2016-02-01

    With the availability of the whole genome sequence of Escherichia coli or Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineering bacteria according to requirement. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation linear DNA or recombinant suicide plasmid, allow a wide variety of DNA manipulation. However, the over-expression of the desired gene is generally executed via plasmid-mediation. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integrating into genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010

  16. Strategies used for genetically modifying bacterial genome: site-directed mutagenesis, gene inactivation, and gene over-expression

    PubMed Central

    Xu, Jian-zhong; Zhang, Wei-guo

    2016-01-01

    With the availability of the whole genome sequence of Escherichia coli or Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineering bacteria according to requirement. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation linear DNA or recombinant suicide plasmid, allow a wide variety of DNA manipulation. However, the over-expression of the desired gene is generally executed via plasmid-mediation. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integrating into genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010

  17. Broad spectrum detection and "barcoding" of water pollutants by a genome-wide bacterial sensor array.

    PubMed

    Elad, Tal; Belkin, Shimshon

    2013-07-01

    An approach for the rapid detection and classification of a broad spectrum of water pollutants, based on a genome-wide reporter bacterial live cell array, is proposed and demonstrated. An array of ca. 2000 Escherichia coli fluorescent transcriptional reporters was exposed to 25 toxic compounds as well as to unpolluted water, and its responses were recorded after 3 h. The 25 toxic compounds represented 5 pollutant classes: genotoxicants, metals, detergents, alcohols, and monoaromatic hydrocarbons. Identifying unique gene expression patterns, a nearest neighbour-based model detected pollutant presence and predicted class attribution with an estimated accuracy of 87%. Sensitivity and positive predictive values varied among classes, being higher for pollutant classes that were defined by mode of action than for those defined by structure only. Sensitivity for unpolluted water was 0.90 and the positive predictive value was 0.79. All pollutant classes induced the transcription of a statistically significant proportion of membrane associated genes; in addition, the sets of genes responsive to genotoxicants, detergents and alcohols were enriched with genes involved in DNA repair, iron utilization and the translation machinery, respectively. Following further development, a methodology of the type described herein may be suitable for integration in water monitoring schemes in conjunction with existing analytical and biological detection techniques. PMID:23726715
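
The per-class sensitivity and positive predictive value reported in the abstract above follow from standard definitions: sensitivity is TP / (TP + FN) and positive predictive value is TP / (TP + FP), computed separately for each class. The sketch below applies these definitions to paired true and predicted labels. The labels are invented for illustration and the function name is hypothetical; nothing here reproduces the study's classifier or data.

```python
from collections import Counter

def per_class_metrics(true_labels, predicted_labels):
    """Return {class: (sensitivity, positive predictive value)}.
    A metric is None when its denominator is zero."""
    tp, fn, fp = Counter(), Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            tp[truth] += 1
        else:
            fn[truth] += 1   # the true class was missed
            fp[guess] += 1   # the predicted class was wrongly assigned
    metrics = {}
    for cls in sorted(set(true_labels) | set(predicted_labels)):
        actual = tp[cls] + fn[cls]
        called = tp[cls] + fp[cls]
        sensitivity = tp[cls] / actual if actual else None
        ppv = tp[cls] / called if called else None
        metrics[cls] = (sensitivity, ppv)
    return metrics

# Hypothetical labels for five samples.
truth = ["metal", "metal", "water", "water", "alcohol"]
predicted = ["metal", "water", "water", "water", "alcohol"]
results = per_class_metrics(truth, predicted)
```

Here one metal sample is misclassified as unpolluted water, which lowers the sensitivity for metals and the positive predictive value for water while leaving the other values at 1.0.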

  18. Analysis of herpesvirus host specificity determinants using herpesvirus genomes as bacterial artificial chromosomes.

    PubMed

    Arii, Jun; Kato, Kentaro; Kawaguchi, Yasushi; Tohya, Yukinobu; Akashi, Hiroomi

    2009-08-01

    Almost all mammalian alphaherpesviruses can grow in cells derived from several types of animals in vitro. However, FHV-1 can only infect feline cell lines. For this reason, FHV-1 should be a good model to investigate species barriers to herpesviruses in vivo. To apply bacterial mutagenesis of FHV-1, we cloned the FHV-1 genome as a BAC. Using lambda and flp recombinations, we introduced a monomeric red fluorescence protein into the C-terminus of glycoprotein D. Although GFP in the constructed recombinant FHV-1, a transfectant of the bacmid of FHV-1 that possessed the GFP, acted in non-feline cell lines, the virus could not enter non-feline cell lines, demonstrating that the host specificity of FHV-1 was restricted in an early step of infection. The host range of canine herpesvirus is limited to dogs in vitro and in vivo; it cannot enter non-canine cell lines as a result of infection but the GFP is active by transfection, revealing the same result that the restriction step is at an early stage of infection. These results suggest the possibility of breaking species barriers of FHV-1 and CHV by modifying the gene(s) that act at the early stage of infection. PMID:19659927

  19. Genome-Wide Analysis of Alternative Splicing during Dendritic Cell Response to a Bacterial Challenge

    PubMed Central

    Rodrigues, Raquel; Grosso, Ana Rita; Moita, Luís

    2013-01-01

    The immune system relies on the plasticity of its components to produce appropriate responses to frequent environmental challenges. Dendritic cells (DCs) are critical initiators of innate immunity and orchestrate the later and more specific adaptive immunity. The generation of diversity in transcriptional programs is central for effective immune responses. Alternative splicing is widely considered a key generator of transcriptional and proteomic complexity, but its role has been rarely addressed systematically in immune cells. Here we used splicing-sensitive arrays to assess genome-wide gene- and exon-level expression profiles in human DCs in response to a bacterial challenge. We find widespread alternative splicing events and splicing factor transcriptional signatures induced by an E. coli challenge to human DCs. Alternative splicing acts in concert with transcriptional modulation, but these two mechanisms of gene regulation affect primarily distinct functional gene groups. Alternative splicing is likely to have an important role in DC immunobiology because it affects genes known to be involved in DC development, endocytosis, antigen presentation and cell cycle arrest. PMID:23613991

  20. Comparative fluorescence in situ hybridization mapping of a 431-kb Arabidopsis thaliana bacterial artificial chromosome contig reveals the role of chromosomal duplications in the expansion of the Brassica rapa genome.

    PubMed Central

    Jackson, S A; Cheng, Z; Wang, M L; Goodman, H M; Jiang, J

    2000-01-01

    Comparative genome studies are important contributors to our understanding of genome evolution. Most comparative genome studies in plants have been based on genetic mapping of homologous DNA loci in different genomes. Large-scale comparative physical mapping has been hindered by the lack of efficient and affordable techniques. We report here the adaptation of fluorescence in situ hybridization (FISH) techniques for comparative physical mapping between Arabidopsis thaliana and Brassica rapa. A set of six bacterial artificial chromosomes (BACs) representing a 431-kb contiguous region of chromosome 2 of A. thaliana was mapped on both chromosomes and DNA fibers of B. rapa. This DNA fragment has a single location in the A. thaliana genome, but hybridized to four to six B. rapa chromosomes, indicating multiple duplications in the B. rapa genome. The sizes of the fiber-FISH signals from the same BACs were not longer in B. rapa than those in A. thaliana, suggesting that this genomic region is duplicated but not expanded in the B. rapa genome. The comparative fiber-FISH mapping results support that chromosomal duplications, rather than regional expansion due to accumulation of repetitive sequences in the intergenic regions, played the major role in the evolution of the B. rapa genome. PMID:11014828

  1. Genome-scale computational analysis of DNA curvature and repeats in Arabidopsis and rice uncovers plant-specific genomic properties

    PubMed Central

    2011-01-01

    Background: Due to its overarching role in genome function, sequence-dependent DNA curvature continues to attract great attention. The DNA double helix is not a rigid cylinder, but presents both curvature and flexibility in different regions, depending on the sequence. More in-depth knowledge of the various orders of complexity of genomic DNA structure has allowed the design of sophisticated bioinformatics tools for its analysis and manipulation, which, in turn, have yielded a better understanding of the genome itself. Curved DNA is involved in many biologically important processes, such as transcription initiation and termination, recombination, DNA replication, and nucleosome positioning. CpG islands and tandem repeats also play significant roles in the dynamics and evolution of genomes. Results: In this study, we analyzed the relationship between these three structural features within rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana) genomes. A genome-scale prediction of curvature distribution in rice and Arabidopsis indicated that most of the chromosomes of both genomes have maximal chromosomal DNA curvature adjacent to the centromeric region. By analyzing tandem repeats across the genome, we found that frequencies of repeats are higher in regions adjacent to those with high curvature value. Further analysis of CpG islands shows a clear interdependence between curvature value, repeat frequencies and CpG islands. Each CpG island appears in a local minimal curvature region, and CpG islands usually do not appear in the centromere or regions with high repeat frequency. A statistical evaluation demonstrates the significance and non-randomness of these features. Conclusions: This study represents the first systematic genome-scale analysis of DNA curvature, CpG islands and tandem repeats at the DNA sequence level in plant genomes, and finds that not all of the chromosomes in plants follow the same rules common to other eukaryote organisms, suggesting that some

  2. Characterizing acetogenic metabolism using a genome-scale metabolic reconstruction of Clostridium ljungdahlii

    PubMed Central

    2013-01-01

    Background: The metabolic capabilities of acetogens to ferment a wide range of sugars, to grow autotrophically on H2/CO2, and more importantly on synthesis gas (H2/CO/CO2) make them very attractive candidates as production hosts for biofuels and biocommodities. Acetogenic metabolism is considered one of the earliest modes of bacterial metabolism. A thorough understanding of various factors governing the metabolism, in particular energy conservation mechanisms, is critical for metabolic engineering of acetogens for targeted production of desired chemicals. Results: Here, we present the genome-scale metabolic network of Clostridium ljungdahlii, the first such model for an acetogen. This genome-scale model (iHN637) consisting of 637 genes, 785 reactions, and 698 metabolites captures all the major central metabolic and biosynthetic pathways, in particular pathways involved in carbon fixation and energy conservation. A combination of metabolic modeling, with physiological and transcriptomic data provided insights into autotrophic metabolism as well as aided the characterization of a nitrate reduction pathway in C. ljungdahlii. Analysis of the iHN637 metabolic model revealed that flavin based electron bifurcation played a key role in energy conservation during autotrophic growth and helped identify genes for some of the critical steps in this mechanism. Conclusions: iHN637 represents a predictive model that recapitulates experimental data, and provides valuable insights into the metabolic response of C. ljungdahlii to genetic perturbations under various growth conditions. Thus, the model will be instrumental in guiding metabolic engineering of C. ljungdahlii for the industrial production of biocommodities and biofuels. PMID:24274140

  3. Characterizing acetogenic metabolism using a genome-scale metabolic reconstruction of Clostridium ljungdahlii

    SciTech Connect

    Nagarajan, H; Sahin, M; Nogales, J; Latif, H; Lovley, DR; Ebrahim, A; Zengler, K

    2013-11-25

    Background: The metabolic capabilities of acetogens to ferment a wide range of sugars, to grow autotrophically on H2/CO2, and more importantly on synthesis gas (H2/CO/CO2) make them very attractive candidates as production hosts for biofuels and biocommodities. Acetogenic metabolism is considered one of the earliest modes of bacterial metabolism. A thorough understanding of various factors governing the metabolism, in particular energy conservation mechanisms, is critical for metabolic engineering of acetogens for targeted production of desired chemicals. Results: Here, we present the genome-scale metabolic network of Clostridium ljungdahlii, the first such model for an acetogen. This genome-scale model (iHN637) consisting of 637 genes, 785 reactions, and 698 metabolites captures all the major central metabolic and biosynthetic pathways, in particular pathways involved in carbon fixation and energy conservation. A combination of metabolic modeling, with physiological and transcriptomic data provided insights into autotrophic metabolism as well as aided the characterization of a nitrate reduction pathway in C. ljungdahlii. Analysis of the iHN637 metabolic model revealed that flavin based electron bifurcation played a key role in energy conservation during autotrophic growth and helped identify genes for some of the critical steps in this mechanism. Conclusions: iHN637 represents a predictive model that recapitulates experimental data, and provides valuable insights into the metabolic response of C. ljungdahlii to genetic perturbations under various growth conditions. Thus, the model will be instrumental in guiding metabolic engineering of C. ljungdahlii for the industrial production of biocommodities and biofuels.

  4. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582

    NASA Astrophysics Data System (ADS)

    Florea, Michael; Reeve, Benjamin; Abbott, James; Freemont, Paul S.; Ellis, Tom

    2016-03-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increase cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in addition to the previously described cellulose synthase operon, ATCC 53582 contains two additional cellulose synthase operons and several previously undescribed genes associated with cellulose production. In parallel, we also develop optimized protocols and identify plasmid backbones suitable for transformation of ATCC 53582, albeit with low efficiencies. Together, these results provide important information for further studies into cellulose synthesis and for future studies aiming to genetically engineer G. hansenii ATCC 53582 for increased cellulose productivity.

  5. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582.

    PubMed

    Florea, Michael; Reeve, Benjamin; Abbott, James; Freemont, Paul S; Ellis, Tom

    2016-01-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increase cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in addition to the previously described cellulose synthase operon, ATCC 53582 contains two additional cellulose synthase operons and several previously undescribed genes associated with cellulose production. In parallel, we also develop optimized protocols and identify plasmid backbones suitable for transformation of ATCC 53582, albeit with low efficiencies. Together, these results provide important information for further studies into cellulose synthesis and for future studies aiming to genetically engineer G. hansenii ATCC 53582 for increased cellulose productivity. PMID:27010592

  6. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582

    PubMed Central

    Florea, Michael; Reeve, Benjamin; Abbott, James; Freemont, Paul S.; Ellis, Tom

    2016-01-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increase cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in addition to the previously described cellulose synthase operon, ATCC 53582 contains two additional cellulose synthase operons and several previously undescribed genes associated with cellulose production. In parallel, we also develop optimized protocols and identify plasmid backbones suitable for transformation of ATCC 53582, albeit with low efficiencies. Together, these results provide important information for further studies into cellulose synthesis and for future studies aiming to genetically engineer G. hansenii ATCC 53582 for increased cellulose productivity. PMID:27010592

  7. FN-Identify: Novel Restriction Enzymes-Based Method for Bacterial Identification in Absence of Genome Sequencing

    PubMed Central

    Ouda, Osama; El-Refy, Ali; El-Feky, Fawzy A.; Mosa, Kareem A.

    2015-01-01

    Sequencing and restriction analysis of genes such as 16S rRNA and HSP60 are widely used for molecular identification in microbial communities. Aided by rapid progress in bioinformatics, genome sequencing has become the method of choice for bacterial identification. However, genome sequencing technology is still out of reach in developing countries. In this paper, we propose FN-Identify, a sequencing-free method for bacterial identification. FN-Identify exploits the gene sequence data available in GenBank and other databases, together with two algorithms that we developed, CreateScheme and GeneIdentify, to create a restriction enzyme-based identification scheme. FN-Identify was tested on three diverse bacterial populations (members of the Lactobacillus, Pseudomonas, and Mycobacterium groups) in an in silico analysis using restriction enzymes and 16S rRNA gene sequences. Analysis of the restriction maps of the members of the three groups, using either fragment numbers alone or fragment numbers together with fragment sizes, successfully identified all members of the three groups with a minimum of four and a maximum of eight restriction enzymes. Our results demonstrate the utility and accuracy of the FN-Identify method and its two algorithms as an alternative that relies on standard microbiology laboratory techniques when genome sequencing is not available. PMID:26880910
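
    The idea of identifying a sequence from restriction fragment numbers alone can be illustrated with a minimal sketch. This is not the authors' CreateScheme or GeneIdentify algorithm; the function names, the choice of three enzymes, and the toy sequences below are all assumptions made for illustration. Only the recognition sites (EcoRI GAATTC, HindIII AAGCTT, BamHI GGATCC) are standard.

```python
import re

# Recognition sites of three well-known restriction enzymes.
ENZYMES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}


def fragment_count(sequence, site):
    """Number of fragments from a complete digest of a linear sequence."""
    # A lookahead pattern also counts overlapping occurrences of the site.
    cuts = len(re.findall(f"(?={site})", sequence.upper()))
    return cuts + 1


def fingerprint(sequence, enzymes=ENZYMES):
    """Tuple of fragment counts, one per enzyme, in sorted enzyme-name order."""
    return tuple(fragment_count(sequence, enzymes[name]) for name in sorted(enzymes))


def identify(unknown, references, enzymes=ENZYMES):
    """Names of reference sequences whose fingerprint matches the unknown."""
    target = fingerprint(unknown, enzymes)
    return [name for name, seq in references.items()
            if fingerprint(seq, enzymes) == target]


# Toy sequences, invented for this sketch (not real 16S rRNA genes).
references = {
    "taxon_A": "ATGAATTCCGGATCCTTAAGCTTGG",
    "taxon_B": "ATGAATTCCGAATTCTTAAGCTTGG",
    "taxon_C": "ATGCCCGGGTTTAAACCCGGGTTTA",
}

if __name__ == "__main__":
    for name, seq in references.items():
        print(name, fingerprint(seq))
    print(identify("CCGAATTCAAGAATTCAAGCTTCC", references))
```

    In this sketch, adding enzymes to the dictionary lengthens the fingerprint, which is one way to picture why a larger enzyme set can separate more taxa. Fragment sizes, which the abstract mentions as an optional second criterion, are not modelled here.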

  8. Temporal scaling of bacterial taxa is influenced by both stochastic and deterministic ecological factors.

    PubMed

    van der Gast, Christopher J; Ager, Duane; Lilley, Andrew K

    2008-06-01

    Microorganisms operate at a range of spatial and temporal scales acting as key drivers of ecosystem properties. Therefore, many key questions in microbial ecology require the consideration of both spatial and temporal scales. Spatial scaling, in particular the species-area relationship (SAR), has a long history in ecology and has recently been addressed in microbial ecology. However, the temporal analogue of the SAR, the species-time relationship, has received far less attention even in the science of general ecology. Here we focus upon the role of temporal scaling in microbial ecological patterns by coupling molecular characterization of bacterial communities in discrete island (bioreactor) systems with a macroecological approach. Our findings showed that the temporal scaling exponent (slope), and therefore taxa turnover of the bacterial taxa-time relationship decreased as selective pressure (industrial wastewater concentration) increased. Also, as the concentration of industrial wastewater increased across the bioreactors, we observed a gradual switch from stochastic community assembly to more deterministic (niche)-based considerations. The identification of broad-scale statistical patterns is particularly relevant to microbial ecology, as it is frequently difficult to identify individual species or their functions. In this study, we identify wide-reaching statistical patterns of diversity and show that they are shaped by the prevalent underlying ecological factors. PMID:18205822
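
    The temporal scaling exponent mentioned above is usually estimated as the slope of a log-log regression, assuming the common power-law form of the species-time relationship, S = c * T^w. The sketch below is a generic least-squares fit under that assumption, not the authors' analysis; the function name and the synthetic data are hypothetical.

```python
import math


def fit_power_law(times, richness):
    """Least-squares fit of log S = log c + w * log T; returns (c, w)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(s) for s in richness]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx                      # temporal scaling exponent (slope)
    c = math.exp(mean_y - w * mean_x)  # intercept on the original scale
    return c, w


# Synthetic data generated from S = 10 * T^0.5, purely for illustration.
days = [1, 4, 9, 16, 25]
taxa = [10 * d ** 0.5 for d in days]

if __name__ == "__main__":
    c, w = fit_power_law(days, taxa)
    print(f"c = {c:.3f}, w = {w:.3f}")
```

    Under this form, a lower fitted w corresponds to slower accumulation of new taxa over time, which is how the abstract links the exponent to taxa turnover.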

  9. Genome Sequence of an Environmental Isolate of the Bacterial Pathogen Legionella pneumophila

    PubMed Central

    Ma, Jian; He, Yongqun

    2013-01-01

    We report here the genomic sequence of Legionella pneumophila strain LPE509 from the water distribution system of a hospital in Shanghai, China. This is the first complete genome sequence of an environmental L. pneumophila isolate. Genomic analyses identified approximately 600 genes unique to LPE509 compared to those of the 7 available L. pneumophila genomes. PMID:23792742

  10. The evolution of genome-scale models of cancer metabolism

    PubMed Central

    Lewis, Nathan E.; Abdel-Haleem, Alyaa M.

    2013-01-01

    The importance of metabolism in cancer is becoming increasingly apparent with the identification of metabolic enzyme mutations and the growing awareness of the influence of metabolism on signaling, epigenetic markers, and transcription. However, the complexity of these processes has challenged our ability to make sense of the metabolic changes in cancer. Fortunately, constraint-based modeling, a systems biology approach, now enables one to study the entirety of cancer metabolism and simulate basic phenotypes. With the newness of this field, there has been a rapid evolution of both the scope of these models and their applications. Here we review the various constraint-based models built for cancer metabolism and how their predictions are shedding new light on basic cancer phenotypes, elucidating pathway differences between tumors, and discovering putative anti-cancer targets. As the field continues to evolve, the scope of these genome-scale cancer models must expand beyond central metabolism to address questions related to the diverse processes contributing to tumor development and metastasis. PMID:24027532

  11. Genome Scale Reconstruction of a Salmonella Metabolic Model

    PubMed Central

    AbuOun, Manal; Suthers, Patrick F.; Jones, Gareth I.; Carter, Ben R.; Saunders, Mark P.; Maranas, Costas D.; Woodward, Martin J.; Anjum, Muna F.

    2009-01-01

    Salmonella are closely related to commensal Escherichia coli but have gained virulence factors enabling them to behave as enteric pathogens. Less well studied are the similarities and differences that exist between the metabolic properties of these organisms that may contribute toward niche adaptation of Salmonella pathogens. To address this, we have constructed a genome scale Salmonella metabolic model (iMA945). The model comprises 945 open reading frames or genes, 1964 reactions, and 1036 metabolites. There was significant overlap with genes present in E. coli MG1655 model iAF1260. In silico growth predictions were simulated using the model on different carbon, nitrogen, phosphorus, and sulfur sources. These were compared with substrate utilization data gathered from high throughput phenotyping microarrays, revealing good agreement. Of the compounds tested, the majority were utilizable by both Salmonella and E. coli. Nevertheless, a number of differences were identified both between Salmonella and E. coli and also within the Salmonella strains included. These differences provide valuable insight into differences between a commensal and a closely related pathogen and within different pathogenic strains, opening new avenues for future explorations. PMID:19690172
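
    Comparing in silico growth predictions with phenotype array results comes down to tallying agreement over the substrates tested by both. The sketch below shows one simple way to do that; it is not the authors' procedure, and the function name, substrate list, and growth calls are invented for illustration rather than taken from iMA945 or the study's data.

```python
def agreement(predicted, observed):
    """Tally growth / no-growth agreement over substrates present in both inputs.

    Both arguments map substrate name -> bool (True means growth).
    Returns a dict with confusion counts and overall accuracy.
    """
    shared = sorted(set(predicted) & set(observed))
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for substrate in shared:
        p, o = predicted[substrate], observed[substrate]
        if p and o:
            counts["TP"] += 1   # model and experiment both show growth
        elif not p and not o:
            counts["TN"] += 1   # both show no growth
        elif p and not o:
            counts["FP"] += 1   # model predicts growth, experiment does not
        else:
            counts["FN"] += 1   # experiment shows growth, model does not
    total = len(shared)
    counts["accuracy"] = (counts["TP"] + counts["TN"]) / total if total else float("nan")
    return counts


# Hypothetical calls for a handful of carbon sources (illustrative values only).
model_calls = {"glucose": True, "citrate": True, "lactose": False, "xylose": True}
array_calls = {"glucose": True, "citrate": False, "lactose": False, "xylose": True}

if __name__ == "__main__":
    print(agreement(model_calls, array_calls))
```

    Substrates where the two disagree (the FP and FN entries) are the natural starting points for refining a reconstruction or re-checking an assay.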

  12. Genome-wide survey of codons under diversifying selection in a highly recombining bacterial species, Helicobacter pylori

    PubMed Central

    Yahara, Koji; Furuta, Yoshikazu; Morimoto, Shinpei; Kikutake, Chie; Komukai, Sho; Matelska, Dorota; Dunin-Horkawicz, Stanisław; Bujnicki, Janusz M.; Uchiyama, Ikuo; Kobayashi, Ichizo

    2016-01-01

    Selection has been a central issue in biology in eukaryotes as well as prokaryotes. Inference of selection in recombining bacterial species, compared with clonal ones, has been a challenge. It is not known how codons under diversifying selection are distributed along the chromosome or among functional categories or how frequently such codons are subject to mutual homologous recombination. Here, we explored these questions by analysing genes present in >90% among 29 genomes of Helicobacter pylori, one of the bacterial species with the highest mutation and recombination rates. By a method for recombining sequences, we identified codons under diversifying selection (dN/dS > 1), which were widely distributed and accounted for ∼0.2% of all the codons of the genome. The codons were enriched in genes of host interaction/cell surface and genome maintenance (DNA replication, recombination, repair, and restriction modification system). The encoded amino acid residues were sometimes found adjacent to critical catalytic/binding residues in protein structures. Furthermore, by estimating the intensity of homologous recombination at a single nucleotide level, we found that these codons appear to be more frequently subject to recombination. We expect that the present study provides a new approach to population genomics of selection in recombining prokaryotes. PMID:26961370

  13. Construction of bacterial artificial chromosome libraries from the parasitic nematode Brugia malayi and physical mapping of the genome of its Wolbachia endosymbiont.

    PubMed

    Foster, Jeremy M; Kumar, Sanjay; Ganatra, Mehul B; Kamal, Ibrahim H; Ware, Jennifer; Ingram, Jessica; Pope-Chappell, Jesse; Guiliano, David; Whitton, Claire; Daub, Jennifer; Blaxter, Mark L; Slatko, Barton E

    2004-05-01

    The parasitic nematode, Brugia malayi, causes lymphatic filariasis in humans, which in severe cases leads to the condition known as elephantiasis. The parasite contains an endosymbiotic alpha-proteobacterium of the genus Wolbachia that is required for normal worm development and fecundity and is also implicated in the pathology associated with infections by these filarial nematodes. Bacterial artificial chromosome libraries were constructed from B. malayi DNA and provide over 11-fold coverage of the nematode genome. Wolbachia genomic fragments were simultaneously cloned into the libraries giving over 5-fold coverage of the 1.1 Mb bacterial genome. A physical framework for the Wolbachia genome was developed by construction of a plasmid library enriched for Wolbachia DNA as a source of sequences to hybridise to high-density bacterial artificial chromosome colony filters. Bacterial artificial chromosome end sequencing provided additional Wolbachia probe sequences to facilitate assembly of a contig that spanned the entire genome. The Wolbachia sequences provided a marker approximately every 10 kb. Four rare-cutting restriction endonucleases were used to restriction map the genome to a resolution of approximately 60 kb and demonstrate concordance between the bacterial artificial chromosome clones and native Wolbachia genomic DNA. Comparison of Wolbachia sequences to public databases using BLAST algorithms under stringent conditions allowed confident prediction of 69 Wolbachia peptide functions and two rRNA genes. Comparison to closely related complete genomes revealed that while most sequences had orthologs in the genome of the Wolbachia endosymbiont from Drosophila melanogaster, there was no evidence for long-range synteny. Rather, there were a few cases of short-range conservation of gene order extending over regions of less than 10 kb. The molecular scaffold produced for the genome of the Wolbachia from B. malayi forms the basis of a genomic sequencing effort for

  14. Comparative genomics of non-pseudomonal bacterial species colonising paediatric cystic fibrosis patients

    PubMed Central

    Ormerod, Kate L.; George, Narelle M.; Fraser, James A.; Wainwright, Claire

    2015-01-01

    The genetic disorder cystic fibrosis is a life-limiting condition affecting ∼70,000 people worldwide. Targeted, early treatment of the dominant infecting species, Pseudomonas aeruginosa, has improved patient outcomes; however, there is concern that other species are now stepping in to take its place. In addition, the necessarily long-term antibiotic therapy received by these patients may be providing a suitable environment for the emergence of antibiotic resistance. To investigate these issues, we employed whole-genome sequencing of 28 non-Pseudomonas bacterial strains isolated from three paediatric patients. We did not find any trend of increasing antibiotic resistance (either by mutation or lateral gene transfer) in these isolates in comparison with other examples of the same species. In addition, each isolate contained a virulence gene repertoire that was similar to other examples of the relevant species. These results support the view that the impaired clearance of the CF lung does not demand extensive virulence for survival in this habitat. By analysing serial isolates of the same species we uncovered several examples of strain persistence. The same strain of Staphylococcus aureus persisted for nearly a year, despite administration of antibiotics to which it was shown to be sensitive. This is consistent with previous studies showing antibiotic therapy to be inadequate in cystic fibrosis patients, which may also explain the lack of increasing antibiotic resistance over time. Serial isolates of two naturally multi-drug resistant organisms, Achromobacter xylosoxidans and Stenotrophomonas maltophilia, revealed that while all S. maltophilia strains were unique, A. xylosoxidans persisted for nearly five years, making this a species of particular concern. The data generated by this study will assist in developing an understanding of the non-Pseudomonas species associated with cystic fibrosis. PMID:26401445

  15. Comparative genomics of non-pseudomonal bacterial species colonising paediatric cystic fibrosis patients.

    PubMed

    Ormerod, Kate L; George, Narelle M; Fraser, James A; Wainwright, Claire; Hugenholtz, Philip

    2015-01-01

    The genetic disorder cystic fibrosis is a life-limiting condition affecting ∼70,000 people worldwide. Targeted, early treatment of the dominant infecting species, Pseudomonas aeruginosa, has improved patient outcomes; however, there is concern that other species are now stepping in to take its place. In addition, the necessarily long-term antibiotic therapy received by these patients may be providing a suitable environment for the emergence of antibiotic resistance. To investigate these issues, we employed whole-genome sequencing of 28 non-Pseudomonas bacterial strains isolated from three paediatric patients. We did not find any trend of increasing antibiotic resistance (either by mutation or lateral gene transfer) in these isolates in comparison with other examples of the same species. In addition, each isolate contained a virulence gene repertoire that was similar to other examples of the relevant species. These results support the view that the impaired clearance of the CF lung does not demand extensive virulence for survival in this habitat. By analysing serial isolates of the same species we uncovered several examples of strain persistence. The same strain of Staphylococcus aureus persisted for nearly a year, despite administration of antibiotics to which it was shown to be sensitive. This is consistent with previous studies showing antibiotic therapy to be inadequate in cystic fibrosis patients, which may also explain the lack of increasing antibiotic resistance over time. Serial isolates of two naturally multi-drug resistant organisms, Achromobacter xylosoxidans and Stenotrophomonas maltophilia, revealed that while all S. maltophilia strains were unique, A. xylosoxidans persisted for nearly five years, making this a species of particular concern. The data generated by this study will assist in developing an understanding of the non-Pseudomonas species associated with cystic fibrosis. PMID:26401445

  16. Changes in bacterial community structure in a full-scale membrane bioreactor for municipal wastewater treatment.

    PubMed

    Hashimoto, Kurumi; Tsutsui, Hirofumi; Takada, Kazuki; Hamada, Hiroshi; Sakai, Kousuke; Inoue, Daisuke; Sei, Kazunari; Soda, Satoshi; Yamashita, Kyoko; Tsuji, Koji; Hashimoto, Toshikazu; Ike, Michihiko

    2016-07-01

    This study investigated changes in the structure and metabolic capabilities of the bacterial community in a full-scale membrane bioreactor (MBR) treating municipal wastewater. Microbial monitoring was also conducted for a parallel-running conventional activated sludge (CAS) process treating the same influent. The mixed-liquor suspended solid concentration in the MBR reached a steady state on day 73 after start-up. The MBR then maintained higher rates of removal of organic compounds and nitrogen than the CAS process did. Terminal restriction fragment length polymorphism analysis revealed that the bacterial community structure in the MBR was similar to that in the CAS process at start-up, but became very different from that in the CAS process in the steady state. The bacterial community structure of the MBR continued to change dynamically even after 20 months of steady-state operation, while that of the CAS process remained stable. By contrast, Biolog assay revealed that the carbon source utilization potential of the MBR resembled that of the CAS process as a whole, although it declined transiently. Overall, the results indicate that the bacterial community of the MBR has flexibility in terms of its phylogenetic structure and metabolic activity to maintain the high wastewater treatment capability. PMID:26811223

  17. Millennial-scale ocean acidification and late Quaternary decline of cryptic bacterial crusts in tropical reefs.

    PubMed

    Riding, R; Liang, L; Braga, J C

    2014-09-01

    Ocean acidification by atmospheric carbon dioxide has increased almost continuously since the last glacial maximum (LGM), 21,000 years ago. It is expected to impair tropical reef development, but effects on reefs at the present day and in the recent past have proved difficult to evaluate. We present evidence that acidification has already significantly reduced the formation of calcified bacterial crusts in tropical reefs. Unlike major reef builders such as coralline algae and corals that more closely control their calcification, bacterial calcification is very sensitive to ambient changes in carbonate chemistry. Bacterial crusts in reef cavities have declined in thickness over the past 14,000 years, with the largest reduction occurring 12,000-10,000 years ago. We interpret this as an early effect of deglacial ocean acidification on reef calcification and infer that similar crusts were likely to have been thicker when seawater carbonate saturation was increased during earlier glacial intervals, and thinner during interglacials. These changes in crust thickness could have substantially affected reef development over glacial cycles, as rigid crusts significantly strengthen framework and their reduction would have increased the susceptibility of reefs to biological and physical erosion. Bacterial crust decline reveals previously unrecognized millennial-scale acidification effects on tropical reefs. This directs attention to the role of crusts in reef formation and the ability of bioinduced calcification to reflect changes in seawater chemistry. It also provides a long-term context for assessing anticipated anthropogenic effects. PMID:25040070

  18. Comparative Genomics Analysis and Phenotypic Characterization of Shewanella putrefaciens W3-18-1: Anaerobic Respiration, Bacterial Microcompartments, and Lateral Flagella

    SciTech Connect

    Qiu, D.; Tu, Q.; He, Zhili; Zhou, Jizhong

    2010-05-17

    Respiratory versatility and psychrophily are the hallmarks of Shewanella. The ability to utilize a wide range of electron acceptors for respiration is due to the large number of c-type cytochrome genes present in the genome of Shewanella strains. More recently the dissimilatory metal reduction of Shewanella species has been extensively and intensively studied for potential applications in the bioremediation of radioactive wastes of groundwater and subsurface environments. Multiple Shewanella genome sequences are now available in the public databases (Fredrickson et al., 2008). Most of the sequenced Shewanella strains were isolated from marine environments and this genus was believed to be of marine origin (Hau and Gralnick, 2007). However, the well-characterized model strain, S. oneidensis MR-1, was isolated from the freshwater lake sediment of Lake Oneida, New York (Myers and Nealson, 1988) and similar bacteria have also been isolated from other freshwater environments (Venkateswaran et al., 1999). Here we comparatively analyzed the genome sequence and physiological characteristics of S. putrefaciens W3-18-1 and S. oneidensis MR-1, isolated from the marine and freshwater lake sediments, respectively. The anaerobic respirations, carbon source utilization, and cell motility have been experimentally investigated. Large scale horizontal gene transfers have been revealed and the genetic divergence between these two strains was considered to be critical to the bacterial adaptation to specific habitats, freshwater or marine sediments.

  19. Comparative genome-scale modelling of Staphylococcus aureus strains identifies strain-specific metabolic capabilities linked to pathogenicity

    PubMed Central

    Bosi, Emanuele; Monk, Jonathan M.; Aziz, Ramy K.; Fondi, Marco; Nizet, Victor; Palsson, Bernhard Ø.

    2016-01-01

    Staphylococcus aureus is a preeminent bacterial pathogen capable of colonizing diverse ecological niches within its human host. We describe here the pangenome of S. aureus based on analysis of genome sequences from 64 strains of S. aureus spanning a range of ecological niches, host types, and antibiotic resistance profiles. Based on this set, S. aureus is expected to have an open pangenome composed of 7,411 genes and a core genome composed of 1,441 genes. Metabolism was highly conserved in this core genome; however, differences were identified in amino acid and nucleotide biosynthesis pathways between the strains. Genome-scale models (GEMs) of metabolism were constructed for the 64 strains of S. aureus. These GEMs enabled a systems approach to characterizing the core metabolic and panmetabolic capabilities of the S. aureus species. All models were predicted to be auxotrophic for the vitamins niacin (vitamin B3) and thiamin (vitamin B1), whereas strain-specific auxotrophies were predicted for riboflavin (vitamin B2), guanosine, leucine, methionine, and cysteine, among others. GEMs were used to systematically analyze growth capabilities in more than 300 different growth-supporting environments. The results identified metabolic capabilities linked to pathogenic traits and virulence acquisitions. Such traits can be used to differentiate strains responsible for mild vs. severe infections and preference for hosts (e.g., animals vs. humans). Genome-scale analysis of multiple strains of a species can thus be used to identify metabolic determinants of virulence and increase our understanding of why certain strains of this deadly pathogen have spread rapidly throughout the world. PMID:27286824

  20. Comparative genome-scale modelling of Staphylococcus aureus strains identifies strain-specific metabolic capabilities linked to pathogenicity.

    PubMed

    Bosi, Emanuele; Monk, Jonathan M; Aziz, Ramy K; Fondi, Marco; Nizet, Victor; Palsson, Bernhard Ø

    2016-06-28

    Staphylococcus aureus is a preeminent bacterial pathogen capable of colonizing diverse ecological niches within its human host. We describe here the pangenome of S. aureus based on analysis of genome sequences from 64 strains of S. aureus spanning a range of ecological niches, host types, and antibiotic resistance profiles. Based on this set, S. aureus is expected to have an open pangenome composed of 7,411 genes and a core genome composed of 1,441 genes. Metabolism was highly conserved in this core genome; however, differences were identified in amino acid and nucleotide biosynthesis pathways between the strains. Genome-scale models (GEMs) of metabolism were constructed for the 64 strains of S. aureus. These GEMs enabled a systems approach to characterizing the core metabolic and panmetabolic capabilities of the S. aureus species. All models were predicted to be auxotrophic for the vitamins niacin (vitamin B3) and thiamin (vitamin B1), whereas strain-specific auxotrophies were predicted for riboflavin (vitamin B2), guanosine, leucine, methionine, and cysteine, among others. GEMs were used to systematically analyze growth capabilities in more than 300 different growth-supporting environments. The results identified metabolic capabilities linked to pathogenic traits and virulence acquisitions. Such traits can be used to differentiate strains responsible for mild vs. severe infections and preference for hosts (e.g., animals vs. humans). Genome-scale analysis of multiple strains of a species can thus be used to identify metabolic determinants of virulence and increase our understanding of why certain strains of this deadly pathogen have spread rapidly throughout the world. PMID:27286824

  1. Draft Genome Sequence of Criibacterium bergeronii gen. nov., sp. nov., Strain CCRI-22567T, Isolated from a Vaginal Sample from a Woman with Bacterial Vaginosis

    PubMed Central

    Maheux, Andrée F.; Bérubé, Ève; Boudreau, Dominique K.; Raymond, Frédéric; Corbeil, Jacques; Roy, Paul H.

    2016-01-01

    Criibacterium bergeronii gen. nov., sp. nov., CCRI-22567 is the type strain of the new genus Criibacterium. The strain was isolated from a woman with bacterial vaginosis. The genome assembly comprised 2,384,460 bp, with 34.4% G+C content. This is the first genome announcement of a strain belonging to the genus Criibacterium. PMID:27587833

  2. Draft Genome Sequence of Criibacterium bergeronii gen. nov., sp. nov., Strain CCRI-22567T, Isolated from a Vaginal Sample from a Woman with Bacterial Vaginosis.

    PubMed

    Maheux, Andrée F; Bérubé, Ève; Boudreau, Dominique K; Raymond, Frédéric; Corbeil, Jacques; Roy, Paul H; Boissinot, Maurice; Omar, Rabeea F

    2016-01-01

    Criibacterium bergeronii gen. nov., sp. nov., CCRI-22567 is the type strain of the new genus Criibacterium. The strain was isolated from a woman with bacterial vaginosis. The genome assembly comprised 2,384,460 bp, with 34.4% G+C content. This is the first genome announcement of a strain belonging to the genus Criibacterium. PMID:27587833

  3. A systematic comparison of genome-scale clustering algorithms

    PubMed Central

    2012-01-01

    Background A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparison of these methods to evaluate their relative effectiveness provides guidance to algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices that have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae. Methods For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard score of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method. Results Clusters produced by each method were evaluated based upon the positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods. 
Conclusions Validation of clusters against known gene classifications demonstrates that, for these data, graph-based techniques outperform conventional clustering approaches, suggesting that further
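
The scoring scheme outlined in the Methods of this abstract can be illustrated with a minimal Python sketch. It assumes clusters and annotation sets are plain sets of gene identifiers; the function names, the toy gene IDs, and the handling of fewer than five clusters are illustrative assumptions, not the authors' implementation. Binning by cluster size is left out because the abstract gives no bin boundaries.

```python
from statistics import mean

def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two gene sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_annotation_score(cluster, annotation_sets):
    """Highest Jaccard score of one cluster against every annotation set."""
    return max((jaccard(cluster, genes) for genes in annotation_sets.values()),
               default=0.0)

def top_n_average(clusters, annotation_sets, top_n=5):
    """Average the best-match scores of the top_n scoring clusters.

    If fewer than top_n clusters exist, all of them are averaged
    (an assumption; the abstract does not cover this case).
    """
    scores = sorted((best_annotation_score(c, annotation_sets) for c in clusters),
                    reverse=True)
    return mean(scores[:top_n]) if scores else 0.0

# Hypothetical toy data: two annotation sets and three clusters.
annotations = {"GO:A": {"g1", "g2", "g3", "g4"}, "KEGG:B": {"g5", "g6"}}
clusters = [{"g1", "g2", "g3"}, {"g5", "g6", "g7"}, {"g8"}]
print(top_n_average(clusters, annotations))
```

In the study this average is computed separately for small, medium, and large clusters; applying `top_n_average` to each size bin in turn would mirror that step.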

  4. Combining p-values in large-scale genomics experiments.

    PubMed

    Zaykin, Dmitri V; Zhivotovsky, Lev A; Czika, Wendy; Shao, Susan; Wolfinger, Russell D

    2007-01-01

    In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher's and Lancaster's combination methods use an inverse gamma transformation. We identify the relation of the shape parameter of that distribution to the implicit threshold value; p-values below that threshold are favored by the inverse gamma method (GM). We explore this feature to improve power over Fisher's method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods is at the expense of a weaker claim made upon rejection of the null hypothesis - that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p-values that is smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP) that combines the first K-ordered p-values, and the truncated product method (TPM) that combines p-values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p-values, while the claim of the RTP is, like GM, more appropriately about all L tests. GM gives somewhat higher power than TPM, RTP, Fisher, and Simes methods across a range of simulations. PMID:17879330
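
To make the combination statistics named in this abstract concrete, here is a minimal Python sketch. It assumes independent tests with uniformly distributed null p-values in (0, 1]; the function names, the default threshold of 0.05, and the Monte Carlo null are illustrative choices rather than the authors' procedure. Fisher's combined p-value uses the closed-form chi-square tail for an even number of degrees of freedom.

```python
import math
import random

def fisher_combined_pvalue(pvalues):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2L df under the null."""
    n_tests = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # half of the Fisher statistic
    # Chi-square survival function for 2L df: exp(-x/2) * sum_{k<L} (x/2)^k / k!
    term, total = 1.0, 1.0
    for k in range(1, n_tests):
        term *= half_stat / k
        total += term
    return math.exp(-half_stat) * total

def truncated_product(pvalues, tau=0.05):
    """TPM statistic: product of the p-values at or below the threshold tau."""
    return math.prod(p for p in pvalues if p <= tau)  # empty product is 1

def rank_truncated_product(pvalues, k):
    """RTP statistic: product of the k smallest p-values."""
    return math.prod(sorted(pvalues)[:k])

def monte_carlo_pvalue(statistic, pvalues, n_sim=10_000, seed=0):
    """Estimate the null probability of a product statistic this small or smaller."""
    rng = random.Random(seed)
    observed = statistic(pvalues)
    n_tests = len(pvalues)
    hits = sum(
        statistic([rng.random() for _ in range(n_tests)]) <= observed
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)

pvals = [0.001, 0.02, 0.3, 0.6, 0.04]  # hypothetical p-values
print(fisher_combined_pvalue(pvals))
print(monte_carlo_pvalue(lambda ps: truncated_product(ps, tau=0.05), pvals))
print(monte_carlo_pvalue(lambda ps: rank_truncated_product(ps, k=2), pvals))
```

With thousands of tests the raw products can underflow, so a practical version would sum logarithms instead of multiplying p-values.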

  5. Combining p-values in large scale genomics experiments

    PubMed Central

    Zaykin, Dmitri V.; Zhivotovsky, Lev A.; Czika, Wendy; Shao, Susan; Wolfinger, Russell D.

    2008-01-01

    Summary In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher’s and Lancaster’s combination methods use an inverse gamma transformation. We identify the relation of the shape parameter of that distribution to the implicit threshold value; p-values below that threshold are favored by the inverse gamma method (GM). We explore this feature to improve power over Fisher’s method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods is at the expense of a weaker claim made upon rejection of the null hypothesis – that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p-values that is smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP) that combines the first K ordered p-values, and the truncated product method (TPM) that combines p-values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p-values, while the claim of the RTP is, like GM, more appropriately about all L tests. GM gives somewhat higher power than TPM, RTP, Fisher, and Simes methods across a range of simulations. PMID:17879330

  6. Life history determines biogeographical patterns of soil bacterial communities over multiple spatial scales.

    PubMed

    Bissett, A; Richardson, A E; Baker, G; Wakelin, S; Thrall, P H

    2010-10-01

    The extent to which the distribution of soil bacteria is controlled by local environment vs. spatial factors (e.g. dispersal, colonization limitation, evolutionary events) is poorly understood and widely debated. Our understanding of biogeographic controls in microbial communities is likely hampered by the enormous environmental variability encountered across spatial scales and the broad diversity of microbial life histories. Here, we constrained environmental factors (soil chemistry, climate, above-ground plant community) to investigate the specific influence of space, by fitting all other variables first, on bacterial communities in soils over distances from m to 10² km. We found strong evidence for a spatial component to bacterial community structure that varies with scale and organism life history (dispersal and survival ability). Geographic distance had no influence over community structure for organisms known to have survival stages, but the converse was true for organisms thought to be less hardy. Community function (substrate utilization) was also shown to be highly correlated with community structure, but not to abiotic factors, suggesting nonstochastic determinants of community structure are important. Our results support the view that bacterial soil communities are constrained by both edaphic factors and geographic distance and further show that the relative importance of such constraints depends critically on the taxonomic resolution used to evaluate spatio-temporal patterns of microbial diversity, as well as life history of the groups being investigated, much as is the case for macro-organisms. PMID:25241408

  7. Construction of a bacterial artificial chromosome library from the spikemoss Selaginella moellendorffii: a new resource for plant comparative genomics

    PubMed Central

    Wang, Wenming; Tanurdzic, Milos; Luo, Meizhong; Sisneros, Nicholas; Kim, Hye Ran; Weng, Jing-Ke; Kudrna, Dave; Mueller, Christopher; Arumuganathan, K; Carlson, John; Chapple, Clint; de Pamphilis, Claude; Mandoli, Dina; Tomkins, Jeff; Wing, Rod A; Banks, Jo Ann

    2005-01-01

    Background The lycophytes are an ancient lineage of vascular plants that diverged from the seed plant lineage about 400 Myr ago. Although the lycophytes occupy an important phylogenetic position for understanding the evolution of plants and their genomes, no genomic resources exist for this group of plants. Results Here we describe the construction of a large-insert bacterial artificial chromosome (BAC) library from the lycophyte Selaginella moellendorffii. Based on cell flow cytometry, this species has the smallest genome size among the different lycophytes tested, including Huperzia lucidula, Diphaiastrum digita, Isoetes engelmanii and S. kraussiana. The arrayed BAC library consists of 9126 clones; the average insert size is estimated to be 122 kb. Inserts of chloroplast origin account for 2.3% of the clones. The BAC library contains an estimated ten genome-equivalents based on DNA hybridizations using five single-copy and two duplicated S. moellendorffii genes as probes. Conclusion The S. moellendorffii BAC library, the first to be constructed from a lycophyte, will be useful to the scientific community as a resource for comparative plant genomics and evolution. PMID:15955246
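
The library figures quoted in this abstract can be combined in a short back-of-envelope calculation, sketched here in Python. It assumes every clone carries an average-sized insert and that all non-chloroplast inserts are nuclear. The implied genome size is only what these numbers suggest when taken together; the abstract does not report it, and its coverage estimate comes from hybridization rather than from this arithmetic.

```python
# Figures taken from the abstract.
n_clones = 9126
avg_insert_kb = 122
chloroplast_fraction = 0.023
genome_equivalents = 10

# Total cloned DNA, assuming every clone holds an average-sized insert.
total_insert_kb = n_clones * avg_insert_kb

# Assumption: everything that is not chloroplast-derived is nuclear DNA.
nuclear_insert_kb = total_insert_kb * (1 - chloroplast_fraction)

# Genome size implied if the nuclear inserts cover the genome ten times.
implied_genome_mb = nuclear_insert_kb / genome_equivalents / 1000

print(f"total insert DNA: {total_insert_kb / 1e6:.2f} Gb")
print(f"nuclear insert DNA: {nuclear_insert_kb / 1e6:.2f} Gb")
print(f"implied genome size: {implied_genome_mb:.0f} Mb")
```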

  8. Purification and partial genome characterization of the bacterial endosymbiont Blattabacterium cuenoti from the fat bodies of cockroaches

    PubMed Central

    Tokuda, Gaku; Lo, Nathan; Takase, Aya; Yamada, Akinori; Hayashi, Yoshinobu; Watanabe, Hirofumi

    2008-01-01

    Background Symbiotic relationships between intracellular bacteria and eukaryotes are widespread in nature. Genome sequencing of the bacterial partner has provided a number of key insights into the basis of these symbioses. A challenging aspect of sequencing symbiont genomes is separating the bacteria from the host tissues. In the present study, we describe a simple method of endosymbiont purification from a complex environment, using Blattabacterium cuenoti inhabiting cockroaches as a model system. Findings B. cuenoti cells were successfully purified from the fat bodies of the cockroach Panesthia angustipennis by a combination of slow- and fast-speed centrifugal fractionations, nylon-membrane filtration, and centrifugation with Percoll solutions. We performed pulsed-field electrophoresis, diagnostic PCR and random sequencing of the shotgun library. These experiments confirmed minimal contamination of host and mitochondrial DNA. The genome size and the G+C content of B. cuenoti were inferred to be 650 kb and 32.1 ± 7.6%, respectively. Conclusion The present study showed successful purification and characterization of the genome of B. cuenoti. Our methodology should be applicable for future symbiont genome sequencing projects. An advantage of the present purification method is that each step is easily performed with ordinary microtubes and a microcentrifuge, and without DNase treatment. PMID:19025664

  9. Deciphering Cyanide-Degrading Potential of Bacterial Community Associated with the Coking Wastewater Treatment Plant with a Novel Draft Genome.

    PubMed

    Wang, Zhiping; Liu, Lili; Guo, Feng; Zhang, Tong

    2015-10-01

    Biotreatment processes fed with coking wastewater often encounter insufficient removal of pollutants, such as ammonia, phenols, and polycyclic aromatic hydrocarbons (PAHs), especially for cyanides. However, only a limited number of bacterial species in pure cultures have been confirmed to metabolize cyanides, which hinders the improvement of these processes. In this study, a microbial community of activated sludge enriched in a coking wastewater treatment plant was analyzed using 454 pyrosequencing and Illumina sequencing to characterize the potential cyanide-degrading bacteria. According to the classification of these pyro-tags, targeting V3/V4 regions of 16S rRNA gene, half of them were assigned to the family Xanthomonadaceae, implying that Xanthomonadaceae bacteria are well-adapted to coking wastewater. A nearly complete draft genome of the dominant bacterium was reconstructed from metagenome of this community to explore cyanide metabolism based on analysis of the genome. The assembled 16S rRNA gene from this draft genome showed that this bacterium was a novel species of Thermomonas within Xanthomonadaceae, which was further verified by comparative genomics. The annotation using KEGG and Pfam identified genes related to cyanide metabolism, including genes responsible for the iron-harvesting system, cyanide-insensitive terminal oxidase, cyanide hydrolase/nitrilase, and thiosulfate:cyanide transferase. Phylogenetic analysis showed that these genes had homologs in previously identified genomes of bacteria within Xanthomonadaceae and even presented similar gene cassettes, thus implying an inherent cyanide-decomposing potential. The findings of this study expand our knowledge about the bacterial degradation of cyanide compounds and will be helpful in the remediation of cyanides contamination. PMID:25910603

  10. High-throughput generation, optimization and analysis of genome-scale metabolic models.

    SciTech Connect

    Henry, C. S.; DeJongh, M.; Best, A. A.; Frybarger, P. M.; Linsay, B.; Stevens, R. L.

    2010-09-01

    Genome-scale metabolic models have proven to be valuable for predicting organism phenotypes from genotypes. Yet efforts to develop new models are failing to keep pace with genome sequencing. To address this problem, we introduce the Model SEED, a web-based resource for high-throughput generation, optimization and analysis of genome-scale metabolic models. The Model SEED integrates existing methods and introduces techniques to automate nearly every step of this process, taking approximately 48 h to reconstruct a metabolic model from an assembled genome sequence. We apply this resource to generate 130 genome-scale metabolic models representing a taxonomically diverse set of bacteria. Twenty-two of the models were validated against available gene essentiality and Biolog data, with the average model accuracy determined to be 66% before optimization and 87% after optimization.

  11. A genome-scale proteomic screen identifies a role for DnaK in chaperoning of polar autotransporters in Shigella.

    PubMed

    Janakiraman, Anuradha; Fixen, Kathryn R; Gray, Andrew N; Niki, Hironori; Goldberg, Marcia B

    2009-10-01

    Autotransporters are outer membrane proteins that are widely distributed among gram-negative bacteria. Like other autotransporters, the Shigella autotransporter IcsA, which is required for actin assembly during infection, is secreted at the bacterial pole. In the bacterial cytoplasm, IcsA localizes to poles and potential cell division sites independent of the cell division protein FtsZ. To identify bacterial proteins involved in the targeting of IcsA to the pole in the bacterial cytoplasm, we screened a genome-scale library of Escherichia coli proteins tagged with green fluorescent protein (GFP) for those that displayed a localization pattern similar to that of IcsA-GFP in cells that lack functional FtsZ using a strain carrying a temperature-sensitive ftsZ allele. For each protein that mimicked the localization of IcsA-GFP, we tested whether IcsA localization was dependent on the presence of the protein. Although these approaches did not identify a polar receptor for IcsA, the cytoplasmic chaperone DnaK both mimicked IcsA localization at elevated temperatures as a GFP fusion and was required for the localization of IcsA to the pole in the cytoplasm of E. coli. DnaK was also required for IcsA secretion at the pole in Shigella flexneri. The localization of DnaK-GFP to poles and potential cell division sites was dependent on elevated growth temperature and independent of the presence of IcsA or functional FtsZ; native DnaK was found to be enhanced at midcell and the poles. A second Shigella autotransporter, SepA, also required DnaK for secretion, consistent with a role of DnaK more generally in the chaperoning of autotransporter proteins in the bacterial cytoplasm. PMID:19684128

  12. Predicting genome-scale Arabidopsis-Pseudomonas syringae interactome using domain and interolog-based approaches

    PubMed Central

    2014-01-01

    Background Every year pathogenic organisms cause billions of dollars' worth of damage to crops and livestock. In agriculture, the study of plant-microbe interactions demands special attention in order to develop management strategies for the destructive pathogen-induced diseases that cause huge crop losses every year worldwide. Pseudomonas syringae is a major bacterial leaf pathogen that causes diseases in a wide range of plant species. Among its various strains, pathovar tomato strain DC3000 (PstDC3000) is reported to infect the plant host Arabidopsis thaliana and thus has been accepted as a model system for experimental characterization of the molecular dynamics of plant-pathogen interactions. Protein-protein interactions (PPIs) play a critical role in initiating pathogenesis and maintaining infection. Understanding the PPI network between a host and pathogen is a critical step for studying the molecular basis of pathogenesis. Large-scale experimental studies of PPIs are scarce, and high-throughput experimental results show a high false-positive rate. Hence, there is a need for developing efficient computational models to predict the interaction between host and pathogen at genome scale, and find novel candidate effectors and/or their targets. Results In this study, we used two computational approaches, the interolog and the domain-based methods, to predict the interactions between Arabidopsis and PstDC3000 at genome scale. The interolog method relies on protein sequence similarity to conduct the PPI prediction. A Pseudomonas protein and an Arabidopsis protein are predicted to interact with each other if an experimentally verified interaction exists between their respective homologous proteins in another organism. The domain-based method uses domain interaction information, which is derived from known protein 3D structures, to infer the potential PPIs. If a Pseudomonas and an Arabidopsis protein contain an interacting domain pair, one can expect the two
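
The interolog rule described in this abstract can be sketched in a few lines of Python. It assumes the homolog assignments (for example from a sequence-similarity search) and the template interactions have already been collected; all identifiers and the function name are hypothetical and do not reflect the authors' pipeline or thresholds.

```python
def predict_interologs(pathogen_homologs, host_homologs, known_interactions):
    """Predict pathogen-host protein pairs by the interolog rule.

    pathogen_homologs / host_homologs map each protein to the set of its
    homologs in template organisms; known_interactions lists experimentally
    verified interacting pairs among those template proteins.
    """
    known = {frozenset(pair) for pair in known_interactions}  # order-independent
    predictions = set()
    for p_prot, p_homs in pathogen_homologs.items():
        for h_prot, h_homs in host_homologs.items():
            # Predict an interaction if any homolog pair is a known interaction.
            if any(frozenset((a, b)) in known for a in p_homs for b in h_homs):
                predictions.add((p_prot, h_prot))
    return predictions

# Hypothetical identifiers for illustration only.
pathogen = {"Pst_1": {"T1"}, "Pst_2": {"T3"}}
host = {"At_1": {"T2"}, "At_2": {"T4"}}
templates = [("T2", "T1")]
print(predict_interologs(pathogen, host, templates))
```

The domain-based method mentioned alongside it has the same shape: replace the homolog sets with the domains each protein contains and the template interactions with known interacting domain pairs.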

  13. Genomic Survey of Pathogenicity Determinants and VNTR Markers in the Cassava Bacterial Pathogen Xanthomonas axonopodis pv. Manihotis Strain CIO151

    PubMed Central

    Arrieta-Ortiz, Mario L.; Rodríguez-R, Luis M.; Pérez-Quintero, Álvaro L.; Poulin, Lucie; Díaz, Ana C.; Arias Rojas, Nathalia; Trujillo, Cesar; Restrepo Benavides, Mariana; Bart, Rebecca; Boch, Jens; Boureau, Tristan; Darrasse, Armelle; David, Perrine; Dugé de Bernonville, Thomas; Fontanilla, Paula; Gagnevin, Lionel; Guérin, Fabien; Jacques, Marie-Agnès; Lauber, Emmanuelle; Lefeuvre, Pierre; Medina, Cesar; Medina, Edgar; Montenegro, Nathaly; Muñoz Bodnar, Alejandra; Noël, Laurent D.; Ortiz Quiñones, Juan F.; Osorio, Daniela; Pardo, Carolina; Patil, Prabhu B.; Poussier, Stéphane; Pruvost, Olivier; Robène-Soustrade, Isabelle; Ryan, Robert P.; Tabima, Javier; Urrego Morales, Oscar G.; Vernière, Christian; Carrere, Sébastien; Verdier, Valérie; Szurek, Boris; Restrepo, Silvia; López, Camilo

    2013-01-01

    Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. 
A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi-locus VNTR analysis

  14. Genomic survey of pathogenicity determinants and VNTR markers in the cassava bacterial pathogen Xanthomonas axonopodis pv. Manihotis strain CIO151.

    PubMed

    Arrieta-Ortiz, Mario L; Rodríguez-R, Luis M; Pérez-Quintero, Álvaro L; Poulin, Lucie; Díaz, Ana C; Arias Rojas, Nathalia; Trujillo, Cesar; Restrepo Benavides, Mariana; Bart, Rebecca; Boch, Jens; Boureau, Tristan; Darrasse, Armelle; David, Perrine; Dugé de Bernonville, Thomas; Fontanilla, Paula; Gagnevin, Lionel; Guérin, Fabien; Jacques, Marie-Agnès; Lauber, Emmanuelle; Lefeuvre, Pierre; Medina, Cesar; Medina, Edgar; Montenegro, Nathaly; Muñoz Bodnar, Alejandra; Noël, Laurent D; Ortiz Quiñones, Juan F; Osorio, Daniela; Pardo, Carolina; Patil, Prabhu B; Poussier, Stéphane; Pruvost, Olivier; Robène-Soustrade, Isabelle; Ryan, Robert P; Tabima, Javier; Urrego Morales, Oscar G; Vernière, Christian; Carrere, Sébastien; Verdier, Valérie; Szurek, Boris; Restrepo, Silvia; López, Camilo; Koebnik, Ralf; Bernal, Adriana

    2013-01-01

    Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. 
A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi-locus VNTR analysis

  15. An empirical strategy for characterizing bacterial proteomes across species in the absence of genomic sequences

    SciTech Connect

    Turse, Joshua E.; Marshall, Matthew J.; Fredrickson, Jim K.; Lipton, Mary S.; Callister, Stephen J.

    2010-11-12

    Current methods in proteomics are dependent on the availability of sequenced genomes to identify proteins. However, genomic sequences are not always available for bacteria or microbial communities, even with high-throughput sequencing technology becoming more readily available. Nevertheless, the homology that exists between related bacteria makes possible the extraction of meaningful biological information from an organism’s or community’s proteome using the genomic sequence of a near neighbor. Here, a cross-organism search strategy was used to look at the amount of proteomics information obtainable with relative genetic distance from a near neighbor organism and to identify proteins in the proteome of minimally characterized environmental isolates. We conclude that closely related organisms with sequenced genomes can be used to characterize proteomes of organisms with unsequenced genomes. In general, a cross-organism search strategy is a first step toward using sequenced genomes to evaluate the proteomes of environmental bacteria and microbial communities that have no sequenced genome.

  16. Shape-based alignment of genomic landscapes in multi-scale resolution

    PubMed Central

    Ashida, Hiroki; Asai, Kiyoshi; Hamada, Michiaki

    2012-01-01

    Due to dramatic advances in DNA technology, quantitative measures of annotation data can now be obtained in continuous coordinates across the entire genome, allowing various heterogeneous ‘genomic landscapes’ to emerge. Although much effort has been devoted to comparing DNA sequences, not much attention has been given to comparing these large quantities of data comprehensively. In this article, we introduce a method for rapidly detecting local regions that show high correlations between genomic landscapes. We overcame the size problem for genome-wide data by converting the data into series of symbols and then carrying out sequence alignment. We also decomposed the oscillation of the landscape data into different frequency bands before analysis, since the real genomic landscape is a mixture of embedded and confounded biological processes working at different scales in the cell nucleus. To verify the usefulness and generality of our method, we applied our approach to well investigated landscapes from the human genome, including several histone modifications. Furthermore, by applying our method to over 20 genomic landscapes in human and 12 in mouse, we found that DNA replication timing and the density of Alu insertions are highly correlated genome-wide in both species, even though the Alu elements have amplified independently in the two genomes. To our knowledge, this is the first method to align genomic landscapes at multiple scales according to their shape. PMID:22561376

  17. The influence of large scale genomics and the changing role of ex situ collections

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The development of large-scale genomics resources in non-model organisms promises to have a fundamental impact on the utilization of genetic resources. Technical innovation in high-throughput sequencing has reduced the cost to a point where genome-wide SNP development is feasible across a range of ...

  18. Bacterial α2-macroglobulins: colonization factors acquired by horizontal gene transfer from the metazoan genome?

    PubMed Central

    Budd, Aidan; Blandin, Stephanie; Levashina, Elena A; Gibson, Toby J

    2004-01-01

    Background Invasive bacteria are known to have captured and adapted eukaryotic host genes. They also readily acquire colonizing genes from other bacteria by horizontal gene transfer. Closely related species such as Helicobacter pylori and Helicobacter hepaticus, which exploit different host tissues, share almost none of their colonization genes. The protease inhibitor α2-macroglobulin provides a major metazoan defense against invasive bacteria, trapping attacking proteases required by parasites for successful invasion. Results Database searches with metazoan α2-macroglobulin sequences revealed homologous sequences in bacterial proteomes. The bacterial α2-macroglobulin phylogenetic distribution is patchy and violates the vertical descent model. Bacterial α2-macroglobulin genes are found in diverse clades, including purple bacteria (proteobacteria), fusobacteria, spirochetes, bacteroidetes, deinococcids, cyanobacteria, planctomycetes and thermotogae. Most bacterial species with bacterial α2-macroglobulin genes exploit higher eukaryotes (multicellular plants and animals) as hosts. Both pathogenically invasive and saprophytically colonizing species possess bacterial α2-macroglobulins, indicating that bacterial α2-macroglobulin is a colonization rather than a virulence factor. Conclusions Metazoan α2-macroglobulins inhibit proteases of pathogens. The bacterial homologs may function in reverse to block host antimicrobial defenses. α2-macroglobulin was probably acquired one or more times from metazoan hosts and has then spread widely through other colonizing bacterial species by more than 10 independent horizontal gene transfers. yfhM-like bacterial α2-macroglobulin genes are often found tightly linked with pbpC, encoding an atypical peptidoglycan transglycosylase, PBP1C, that does not function in vegetative peptidoglycan synthesis. We suggest that YfhM and PBP1C are coupled together as a periplasmic defense and repair system. Bacterial α2-macroglobulins might

  19. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Tier 1 Report

    SciTech Connect

    Slezak, T; Borucki, M; Lenhoff, R; Vitalis, E

    2009-09-29

    The Lawrence Livermore National Lab Bioinformatics group has recently taken on a role in DTRA's Transformation Medical Technologies Initiative (TMTI). The high-level goal of TMTI is to accelerate the development of broad-spectrum countermeasures. To achieve those goals, TMTI has a near term need to obtain more sequence information across a large range of pathogens, near neighbors, and across a broad geographical and host range. Our role in this project is to research available sequence data for the organisms of interest and identify critical microbial sequence and knowledge gaps that need to be filled to meet TMTI objectives. This effort includes: (1) assessing current genomic sequence for each agent including phylogenetic and geographical diversity, host range, date of isolation range, virulence, sequence availability of key near neighbors, and other characteristics; (2) identifying Subject Matter Experts (SME's) and potential holders of isolate collections, contacting appropriate SME's with known expertise and isolate collections to obtain information on isolate availability and specific recommendations; (3) identifying sequence as well as knowledge gaps (eg virulence, host range, and antibiotic resistance determinants); (4) providing specific recommendations as to the most valuable strains to be placed on the DTRA sequencing queue. We acknowledge that criteria for prioritization of isolates for sequencing falls into two categories aligning with priority queues 1 and 2 as described in the summary. (Priority queue 0 relates to DTRA operational isolates whose availability is not predictable in advance.) 1. Selection of isolates that appear to have likelihood to provide information on virulence and antibiotic resistance. This will include sequence of known virulent strains. Particularly valuable would be virulent strains that have genetically similar yet avirulent, or non human transmissible, counterparts that can be used for comparison to help identify key

  20. High-throughput genomic sequencing of cassava bacterial blight strains identifies conserved effectors to target for durable resistance.

    PubMed

    Bart, Rebecca; Cohn, Megan; Kassen, Andrew; McCallum, Emily J; Shybut, Mikel; Petriello, Annalise; Krasileva, Ksenia; Dahlbeck, Douglas; Medina, Cesar; Alicai, Titus; Kumar, Lava; Moreira, Leandro M; Rodrigues Neto, Júlio; Verdier, Valerie; Santana, María Angélica; Kositcharoenkul, Nuttima; Vanderschuren, Hervé; Gruissem, Wilhelm; Bernal, Adriana; Staskawicz, Brian J

    2012-07-10

    Cassava bacterial blight (CBB), incited by Xanthomonas axonopodis pv. manihotis (Xam), is the most important bacterial disease of cassava, a staple food source for millions of people in developing countries. Here we present a widely applicable strategy for elucidating the virulence components of a pathogen population. We report Illumina-based draft genomes for 65 Xam strains and deduce the phylogenetic relatedness of Xam across the areas where cassava is grown. Using an extensive database of effector proteins from animal and plant pathogens, we identify the effector repertoire for each sequenced strain and use a comparative sequence analysis to deduce the least polymorphic of the conserved effectors. These highly conserved effectors have been maintained over 11 countries, three continents, and 70 y of evolution and as such represent ideal targets for developing resistance strategies. PMID:22699502

  1. The Genomic Sequence of the Oral Pathobiont Strain NI1060 Reveals Unique Strategies for Bacterial Competition and Pathogenicity

    PubMed Central

    Jiao, Yizu; Hasegawa, Mizuho; Moon, Henry; Núñez, Gabriel; Inohara, Naohiro; Raes, Jeroen

    2016-01-01

    Strain NI1060 is an oral bacterium responsible for periodontitis in a murine ligature-induced disease model. To better understand its pathogenicity, we have determined the complete sequence of its 2,553,982 bp genome. Although closely related to Pasteurella pneumotropica, a pneumonia-associated rodent commensal based on its 16S rRNA, the NI1060 genomic content suggests that they are different species thriving on different energy sources via alternative metabolic pathways. Genomic and phylogenetic analyses showed that strain NI1060 is distinct from the genera currently described in the family Pasteurellaceae, and is likely to represent a novel species. In addition, we found putative virulence genes involved in lipooligosaccharide synthesis, adhesins and bacteriotoxic proteins. These genes are potentially important for host adaption and for the induction of dysbiosis through bacterial competition and pathogenicity. Importantly, strain NI1060 strongly stimulates Nod1, an innate immune receptor, but is defective in two peptidoglycan recycling genes due to a frameshift mutation. The in-depth analysis of its genome thus provides critical insights for the development of NI1060 as a prime model system for infectious disease. PMID:27409077

  2. The Genomic Sequence of the Oral Pathobiont Strain NI1060 Reveals Unique Strategies for Bacterial Competition and Pathogenicity.

    PubMed

    Darzi, Youssef; Jiao, Yizu; Hasegawa, Mizuho; Moon, Henry; Núñez, Gabriel; Inohara, Naohiro; Raes, Jeroen

    2016-01-01

    Strain NI1060 is an oral bacterium responsible for periodontitis in a murine ligature-induced disease model. To better understand its pathogenicity, we have determined the complete sequence of its 2,553,982 bp genome. Although closely related to Pasteurella pneumotropica, a pneumonia-associated rodent commensal based on its 16S rRNA, the NI1060 genomic content suggests that they are different species thriving on different energy sources via alternative metabolic pathways. Genomic and phylogenetic analyses showed that strain NI1060 is distinct from the genera currently described in the family Pasteurellaceae, and is likely to represent a novel species. In addition, we found putative virulence genes involved in lipooligosaccharide synthesis, adhesins and bacteriotoxic proteins. These genes are potentially important for host adaption and for the induction of dysbiosis through bacterial competition and pathogenicity. Importantly, strain NI1060 strongly stimulates Nod1, an innate immune receptor, but is defective in two peptidoglycan recycling genes due to a frameshift mutation. The in-depth analysis of its genome thus provides critical insights for the development of NI1060 as a prime model system for infectious disease. PMID:27409077

  3. Profiling DNA Methylomes from Microarray to Genome-Scale Sequencing

    PubMed Central

    Huang, Yi-Wen; Huang, Tim H.-M.; Wang, Li-Shu

    2010-01-01

    DNA cytosine methylation is a central epigenetic modification which plays critical roles in cellular processes including genome regulation, development and disease. Here, we review current and emerging microarray and next-generation sequencing-based technologies that enhance our knowledge of DNA methylation profiling. Each methodology has its own limitations and unique applications, and combinations of several modalities may help build the entire methylome. With advances in next-generation sequencing technologies, it is now possible to globally map DNA cytosine methylation at single-base resolution, providing new insights into the regulation and dynamics of DNA methylation in genomes. PMID:20218736

  4. Profiling DNA methylomes from microarray to genome-scale sequencing.

    PubMed

    Huang, Yi-Wei; Huang, Tim H-M; Wang, Li-Shu

    2010-04-01

    DNA cytosine methylation is a central epigenetic modification which plays critical roles in cellular processes including genome regulation, development and disease. Here, we review current and emerging microarray and next-generation sequencing-based technologies that enhance our knowledge of DNA methylation profiling. Each methodology has its own limitations and unique applications, and combinations of several modalities may help build the entire methylome. With advances in next-generation sequencing technologies, it is now possible to globally map DNA cytosine methylation at single-base resolution, providing new insights into the regulation and dynamics of DNA methylation in genomes. PMID:20218736

  5. Kernel methods for large-scale genomic data analysis

    PubMed Central

    Xing, Eric P.; Schaid, Daniel J.

    2015-01-01

    Machine learning, particularly kernel methods, has been demonstrated as a promising new tool to tackle the challenges imposed by today’s explosive data growth in genomics. Kernel methods provide a practical and principled approach to learning how a large number of genetic variants are associated with complex phenotypes, to help reveal the complexity in the relationship between the genetic markers and the outcome of interest. In this review, we highlight the potential key role they will have in modern genomic data processing, especially with regard to integration with classical methods for gene prioritizing, prediction and data fusion. PMID:25053743

  6. Metabolic stasis in an ancient symbiosis: genome-scale metabolic networks from two Blattabacterium cuenoti strains, primary endosymbionts of cockroaches

    PubMed Central

    2012-01-01

    Background Cockroaches are terrestrial insects that strikingly eliminate waste nitrogen as ammonia instead of uric acid. Blattabacterium cuenoti (Mercier 1906) strains Bge and Pam are the obligate primary endosymbionts of the cockroaches Blattella germanica and Periplaneta americana, respectively. The genomes of both bacterial endosymbionts have recently been sequenced, making possible a genome-scale constraint-based reconstruction of their metabolic networks. The mathematical expression of a metabolic network and the subsequent quantitative studies of phenotypic features by Flux Balance Analysis (FBA) represent an efficient functional approach to these uncultivable bacteria. Results We report the metabolic models of Blattabacterium strains Bge (iCG238) and Pam (iCG230), comprising 296 and 289 biochemical reactions, associated with 238 and 230 genes, and 364 and 358 metabolites, respectively. Both models reflect both the striking similarities and the singularities of these microorganisms. FBA was used to analyze the properties, potential and limits of the models, assuming some environmental constraints such as aerobic conditions and the net production of ammonia from these bacterial systems, as has been experimentally observed. In addition, in silico simulations with the iCG238 model have enabled a set of carbon and nitrogen sources to be defined, which would also support a viable phenotype in terms of biomass production in the strain Pam, which lacks the first three steps of the tricarboxylic acid cycle. FBA reveals a metabolic condition that renders these enzymatic steps dispensable, thus offering a possible evolutionary explanation for their elimination. We also confirm, by computational simulations, the fragility of the metabolic networks and their host dependence. Conclusions The minimized Blattabacterium metabolic networks are surprisingly similar in strains Bge and Pam, after 140 million years of evolution of these endosymbionts in separate cockroach

  7. Dynamics of bacterial communities before and after distribution in a full-scale drinking water network.

    PubMed

    El-Chakhtoura, Joline; Prest, Emmanuelle; Saikaly, Pascal; van Loosdrecht, Mark; Hammes, Frederik; Vrouwenvelder, Hans

    2015-05-01

    Understanding the biological stability of drinking water distribution systems is imperative in the framework of process control and risk management. The objective of this research was to examine the dynamics of the bacterial community during drinking water distribution at high temporal resolution. Water samples (156 in total) were collected over short time-scales (minutes/hours/days) from the outlet of a treatment plant and a location in its corresponding distribution network. The drinking water is treated by biofiltration and disinfectant residuals are absent during distribution. The community was analyzed by 16S rRNA gene pyrosequencing and flow cytometry as well as conventional, culture-based methods. Despite a random dramatic event (detected with pyrosequencing and flow cytometry but not with plate counts), the bacterial community profile at the two locations did not vary significantly over time. A diverse core microbiome was shared between the two locations (58-65% of the taxa and 86-91% of the sequences) and found to be dependent on the treatment strategy. The bacterial community structure changed during distribution, with greater richness detected in the network and phyla such as Acidobacteria and Gemmatimonadetes becoming abundant. The rare taxa displayed the highest dynamicity, causing the major change during water distribution. This change did not have hygienic implications and is contingent on the sensitivity of the applied methods. The concept of biological stability therefore needs to be revised. Biostability is generally desired in drinking water guidelines but may be difficult to achieve in large-scale complex distribution systems that are inherently dynamic. PMID:25732558

  8. Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

    SciTech Connect

    Tartakovsky, Guzel D.; Tartakovsky, Alexandre M.; Scheibe, Timothy D.; Fang, Yilin; Mahadevan, Radhakrishnan; Lovley, Derek R.

    2013-09-07

    Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacteria Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport is governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reactions rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model under

  9. Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

    NASA Astrophysics Data System (ADS)

    Scheibe, T. D.; Tartakovsky, G.; Tartakovsky, A. M.; Fang, Y.; Mahadevan, R.; Lovley, D. R.

    2012-12-01

    Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacteria Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport is governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reactions rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model
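
    For readers unfamiliar with the Monod-type kinetics mentioned in this abstract, the standard single-substrate form can be sketched as follows. This is a generic textbook illustration, not the fitted model from the study; the function names and all parameter values (`mu_max`, `k_s`, `yield_coeff`, the time step) are hypothetical.

```python
def monod_rate(substrate, mu_max, k_s):
    """Specific growth rate under Monod kinetics: mu_max * S / (K_s + S)."""
    return mu_max * substrate / (k_s + substrate)


def euler_step(biomass, substrate, mu_max, k_s, yield_coeff, dt):
    """Advance biomass and substrate by one explicit Euler step.

    Growth: dX/dt = mu * X. Consumption: dS/dt = -(mu * X) / Y,
    where Y is the biomass yield per unit substrate (assumed constant).
    """
    mu = monod_rate(substrate, mu_max, k_s)
    growth = mu * biomass * dt
    new_biomass = biomass + growth
    new_substrate = max(0.0, substrate - growth / yield_coeff)
    return new_biomass, new_substrate


# Hypothetical values chosen only to show the saturating behavior.
half_saturation = monod_rate(2.0, mu_max=1.0, k_s=2.0)    # S equal to K_s
near_saturation = monod_rate(1e9, mu_max=1.0, k_s=2.0)    # excess substrate
x1, s1 = euler_step(1.0, 2.0, mu_max=1.0, k_s=2.0, yield_coeff=0.5, dt=0.1)
```

    When the substrate is in large excess the rate approaches `mu_max`, which is the regime in which the abstract reports that a fitted Monod model could match the genome-scale simulation.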

  10. Draft Genome Sequence of Nocardia jinanensis, an Opportunistic Bacterial Pathogen That Causes Cellulitis

    PubMed Central

    Chakrabortti, Alolika; Li, Jinming

    2016-01-01

    The draft genome sequence of Nocardia jinanensis, an opportunistic pathogen that can cause skin infections, reveals genes that may contribute to the lifestyle and pathogenicity of N. jinanensis. The genome also reveals the biosynthetic capacity of N. jinanensis in producing mycolic acids, siderophores, and other polyketide and nonribosomal peptide-derived secondary metabolites. PMID:27445366

  11. FDA Bioinformatics Tool for Microbial Genomics Research on Molecular Characterization of Bacterial Foodborne Pathogens Using Microarrays

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Advances in microbial genomics and bioinformatics are offering greater insights into the emergence and spread of foodborne pathogens in outbreak scenarios. The Food and Drug Administration (FDA) has developed the genomics tool ArrayTrack™, which provides extensive functionalities to man...

  12. Attenuated Virulence and Genomic Reductive Evolution in the Entomopathogenic Bacterial Symbiont Species, Xenorhabdus poinarii

    PubMed Central

    Ogier, Jean-Claude; Pagès, Sylvie; Bisch, Gaëlle; Chiapello, Hélène; Médigue, Claudine; Rouy, Zoé; Teyssier, Corinne; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie

    2014-01-01

    Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Unlike other Xenorhabdus species, Xenorhabdus poinarii is avirulent when injected into insects in the absence of its nematode host. We sequenced the genome of the X. poinarii strain G6 and the closely related but virulent X. doucetiae strain FRM16. G6 had a smaller genome (500–700 kb smaller) than virulent Xenorhabdus strains and lacked genes encoding potential virulence factors (hemolysins, type 5 secretion systems, enzymes involved in the synthesis of secondary metabolites, and toxin–antitoxin systems). The genomes of all the X. poinarii strains analyzed here had a similar small size. We did not observe the accumulation of pseudogenes, insertion sequences or decrease in coding density usually seen as a sign of genomic erosion driven by genetic drift in host-adapted bacteria. Instead, genome reduction of X. poinarii seems to have been mediated by the excision of genomic blocks from the flexible genome, as reported for the genomes of attenuated free pathogenic bacteria and some facultative mutualistic bacteria growing exclusively within hosts. This evolutionary pathway probably reflects the adaptation of X. poinarii to a specific host. PMID:24904010

  13. Symmetry and scale orient Min protein patterns in shaped bacterial sculptures

    PubMed Central

    Wu, Fabai; van Schie, Bas G.C.; Keymer, Juan E.; Dekker, Cees

    2016-01-01

    The boundary of a cell defines the shape and scale for its subcellular organisation. However, the effects of the cell’s spatial boundaries as well as the geometry sensing and scale adaptation of intracellular molecular networks remain largely unexplored. Here, we show that living bacterial cells can be ‘sculpted’ into defined shapes, such as squares and rectangles, which are used to explore the spatial adaptation of Min proteins that oscillate pole-to-pole in rod-shaped Escherichia coli to assist cell division. In a wide geometric parameter space, ranging from 2 × 1 × 1 to 11 × 6 × 1 μm³, Min proteins exhibit versatile oscillation patterns, sustaining rotational, longitudinal, diagonal, stripe, and even transversal modes. These patterns are found to directly capture the symmetry and scale of the cell boundary, and the Min concentration gradients scale in adaptation to the cell size within a characteristic length range of 3–6 μm. Numerical simulations reveal that local microscopic Turing kinetics of Min proteins can yield global symmetry selection, gradient scaling, and an adaptive range, when and only when facilitated by the three-dimensional confinement of the cell boundary. These findings cannot be explained by previous geometry-sensing models based on the longest distance, membrane area or curvature, and reveal that spatial boundaries can facilitate simple molecular interactions to result in far more versatile functions than previously understood. PMID:26098227

  14. Symmetry and scale orient Min protein patterns in shaped bacterial sculptures

    NASA Astrophysics Data System (ADS)

    Wu, Fabai; van Schie, Bas G. C.; Keymer, Juan E.; Dekker, Cees

    2015-08-01

    The boundary of a cell defines the shape and scale of its subcellular organization. However, the effects of the cell's spatial boundaries as well as the geometry sensing and scale adaptation of intracellular molecular networks remain largely unexplored. Here, we show that living bacterial cells can be ‘sculpted’ into defined shapes, such as squares and rectangles, which are used to explore the spatial adaptation of Min proteins that oscillate pole-to-pole in rod-shaped Escherichia coli to assist cell division. In a wide geometric parameter space, ranging from 2 × 1 × 1 to 11 × 6 × 1 μm³, Min proteins exhibit versatile oscillation patterns, sustaining rotational, longitudinal, diagonal, stripe and even transversal modes. These patterns are found to directly capture the symmetry and scale of the cell boundary, and the Min concentration gradients scale with the cell size within a characteristic length range of 3-6 μm. Numerical simulations reveal that local microscopic Turing kinetics of Min proteins can yield global symmetry selection, gradient scaling and an adaptive range, when and only when facilitated by the three-dimensional confinement of the cell boundary. These findings cannot be explained by previous geometry-sensing models based on the longest distance, membrane area or curvature, and reveal that spatial boundaries can facilitate simple molecular interactions to result in far more versatile functions than previously understood.

  15. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts.

    PubMed

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G; Schroeder, Steven; Scheffler, Brian; Duke, Mary V; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  16. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts

    PubMed Central

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M.; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G.; Schroeder, Steven; Scheffler, Brian; Duke, Mary V.; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L.; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C.

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  17. Genomic DNA fingerprint analysis of biotype 1 Gardnerella vaginalis from patients with and without bacterial vaginosis.

    PubMed Central

    Wu, S R; Hillier, S L; Nath, K

    1996-01-01

    Of the 20 biotype 1 Gardnerella vaginalis isolates analyzed, 10 from patients with bacterial vaginosis and 10 from patients without bacterial vaginosis, none shared the same DNA fingerprint. However, a 1.18-kb HindIII fragment was common among 18 of the 20 biotype 1 isolates in a restriction fragment length polymorphism analysis with a 7.9-kb G. vaginalis DNA probe. PMID:8748302

  18. The Heterotrophic Bacterial Response During the Meso-scale Southern Ocean Iron Experiment (SOFeX)

    NASA Astrophysics Data System (ADS)

    Oliver, J. L.; Barber, R. T.; Ducklow, H. W.

    2002-12-01

    Previous meso-scale iron enrichments have demonstrated the stimulatory effect of iron on primary productivity and the accelerated flow of carbon into the surface ocean foodweb. In stratified waters, heterotrophic activity can work against carbon export by remineralizing POC and/or DOC back to CO2, effectively slowing the biological pump. To assess the response of heterotrophic activity to iron enrichment, we measured heterotrophic bacterial production and abundance during the Southern Ocean Iron Experiment (SOFeX). Heterotrophic bacterial processes primarily affect the latter of the two carbon export mechanisms, removal of DOC to the deep ocean. Heterotrophic bacterial production (BP), measured via tritiated thymidine (³H-TdR) and leucine (³H-Leu) incorporation, increased ~40% over the 18-d observation period in iron fertilized waters south of the Polar Front (South Patch). Also, South Patch BP was 61% higher than in the surrounding unfertilized waters. Abundance, measured by flow cytometry (FCM) and acridine orange direct counts (AODC), also increased in the South Patch from 3 to 5 × 10⁸ cells liter⁻¹, a 70% increase. Bacterial biomass increased from ~3.6 to 6.3 μg C liter⁻¹, a clear indication that production rates exceeded removal rates (bactivory, viral lysis) over the course of 18 days. Biomass within the fertilized patch was 11% higher than in surrounding unfertilized waters reflecting a similar trend. This pattern is in contrast to SOIREE where no accumulation of biomass was observed. High DNA-containing (HDNA) cells detected by FCM also increased over time in iron fertilized waters from 20% to 46% relative to the total population suggesting an active subpopulation of cells that were growing faster than the removal rates. In iron fertilized waters north of the Polar Front (North Patch), BP and abundance were ~90% and 80% higher, respectively, than in unfertilized waters. Our results suggest an active bacterial population that responded to iron fertilization

  19. Origins and Recombination of the Bacterial-Sized Multichromosomal Mitochondrial Genome of Cucumber

    PubMed Central

    Alverson, Andrew J; Rice, Danny W; Dickinson, Stephanie; Barry, Kerrie; Palmer, Jeffrey D

    2011-01-01

    Members of the flowering plant family Cucurbitaceae harbor the largest known mitochondrial genomes. Here, we report the 1685-kb mitochondrial genome of cucumber (Cucumis sativus). We help solve a 30-year mystery about the origins of its large size by showing that it mainly reflects the proliferation of dispersed repeats, expansions of existing introns, and the acquisition of sequences from diverse sources, including the cucumber nuclear and chloroplast genomes, viruses, and bacteria. The cucumber genome has a novel structure for plant mitochondria, mapping as three entirely or largely autonomous circular chromosomes (lengths 1556, 84, and 45 kb) that vary in relative abundance over a twofold range. These properties suggest that the three chromosomes replicate independently of one another. The two smaller chromosomes are devoid of known functional genes but nonetheless contain diagnostic mitochondrial features. Paired-end sequencing conflicts reveal differences in recombination dynamics among chromosomes, for which an explanatory model is developed, as well as a large pool of low-frequency genome conformations, many of which may result from asymmetric recombination across intermediate-sized and sometimes highly divergent repeats. These findings highlight the promise of genome sequencing for elucidating the recombinational dynamics of plant mitochondrial genomes. PMID:21742987

  20. On the road to synthetic life: the minimal cell and genome-scale engineering.

    PubMed

    Juhas, Mario

    2016-06-01

    Synthetic biology employs rational engineering principles to build biological systems from the libraries of standard, well characterized biological parts. Biological systems designed and built by synthetic biologists fulfill a plethora of useful purposes, ranging from better healthcare and energy production to biomanufacturing. Recent advancements in the synthesis, assembly and "booting-up" of synthetic genomes and in low and high-throughput genome engineering have paved the way for engineering on the genome-wide scale. One of the key goals of genome engineering is the construction of minimal genomes consisting solely of essential genes (genes indispensable for survival of living organisms). Besides serving as a toolbox to understand the universal principles of life, the cell encoded by minimal genome could be used to build a stringently controlled "cell factory" with a desired phenotype. This review provides an update on recent advances in the genome-scale engineering with particular emphasis on the engineering of minimal genomes. Furthermore, it presents an ongoing discussion to the scientific community for better suitability of minimal or robust cells for industrial applications. PMID:25578717

  1. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions.

    PubMed

    Burton, Joshua N; Adey, Andrew; Patwardhan, Rupali P; Qiu, Ruolan; Kitzman, Jacob O; Shendure, Jay

    2013-12-01

    Genomes assembled de novo from short reads are highly fragmented relative to the finished chromosomes of Homo sapiens and key model organisms generated by the Human Genome Project. To address this problem, we need scalable, cost-effective methods to obtain assemblies with chromosome-scale contiguity. Here we show that genome-wide chromatin interaction data sets, such as those generated by Hi-C, are a rich source of long-range information for assigning, ordering and orienting genomic sequences to chromosomes, including across centromeres. To exploit this finding, we developed an algorithm that uses Hi-C data for ultra-long-range scaffolding of de novo genome assemblies. We demonstrate the approach by combining shotgun fragment and short jump mate-pair sequences with Hi-C data to generate chromosome-scale de novo assemblies of the human, mouse and Drosophila genomes, achieving--for the human genome--98% accuracy in assigning scaffolds to chromosome groups and 99% accuracy in ordering and orienting scaffolds within chromosome groups. Hi-C data can also be used to validate chromosomal translocations in cancer genomes. PMID:24185095

  2. Genome-Scale Metabolic Modeling in the Simulation of Field-Scale Uranium Bioremediation

    NASA Astrophysics Data System (ADS)

    Yabusaki, S.; Wilkins, M.; Fang, Y.; Williams, K. H.; Waichler, S.; Long, P. E.

    2015-12-01

    Coupled variably saturated flow and biogeochemical reactive transport modeling is used to improve understanding of the processes, properties, and conditions controlling uranium bio-immobilization in a field experiment where uranium-contaminated groundwater was amended with acetate and bicarbonate. The acetate stimulates indigenous microorganisms that catalyze metal reduction, including the conversion of aqueous U(VI) to solid-phase U(IV), which effectively removes uranium from solution. The initiation of the bicarbonate amendment prior to biostimulation was designed to promote U(VI) desorption that would increase the aqueous U(VI) available for bioreduction. The three-dimensional simulations were able to largely reproduce the timing and magnitude of the physical, chemical and biological responses to the acetate and bicarbonate amendment in the context of changing water table elevation and gradient. A time series of groundwater proteomic samples exhibited correlations between the most abundant Geobacter metallireducens proteins and the genome-scale metabolic model-predicted fluxes of intra-cellular reactions associated with each of those proteins. The desorption of U(VI) induced by the bicarbonate amendment led to initially higher rates of bioreduction compared to locations with minimal bicarbonate exposure. After bicarbonate amendment ceased, bioreduction continued at these locations whereas U(VI) sorption was the dominant removal mechanism at the bicarbonate-impacted sites.

  3. Genomic revelations of a mutualism: the pea aphid and its obligate bacterial symbiont.

    PubMed

    Shigenobu, Shuji; Wilson, Alex C C

    2011-04-01

    The symbiosis of the pea aphid Acyrthosiphon pisum with the bacterium Buchnera aphidicola APS represents the best-studied insect obligate symbiosis. Here we present a refined picture of this symbiosis by linking pre-genomic observations to new genomic data that includes the complete genomes of the eukaryotic and prokaryotic symbiotic partners. In doing so, we address four issues central to understanding the patterns and processes operating at the A. pisum/Buchnera APS interface. These four issues include: (1) lateral gene transfer, (2) host immunity, (3) symbiotic metabolism, and (4) regulation. PMID:21390549

  4. Strain Dependent Genetic Networks for Antibiotic-Sensitivity in a Bacterial Pathogen with a Large Pan-Genome.

    PubMed

    van Opijnen, Tim; Dedrick, Sandra; Bento, José

    2016-09-01

    The interaction between an antibiotic and bacterium is not merely restricted to the drug and its direct target, rather antibiotic induced stress seems to resonate through the bacterium, creating selective pressures that drive the emergence of adaptive mutations not only in the direct target, but in genes involved in many different fundamental processes as well. Surprisingly, it has been shown that adaptive mutations do not necessarily have the same effect in all species, indicating that the genetic background influences how phenotypes are manifested. However, to what extent the genetic background affects the manner in which a bacterium experiences antibiotic stress, and how this stress is processed is unclear. Here we employ the genome-wide tool Tn-Seq to construct daptomycin-sensitivity profiles for two strains of the bacterial pathogen Streptococcus pneumoniae. Remarkably, over half of the genes that are important for dealing with antibiotic-induced stress in one strain are dispensable in another. By confirming over 100 genotype-phenotype relationships, probing potassium-loss, employing genetic interaction mapping as well as temporal gene-expression experiments we reveal genome-wide conditionally important/essential genes, we discover roles for genes with unknown function, and uncover parts of the antibiotic's mode-of-action. Moreover, by mapping the underlying genomic network for two query genes we encounter little conservation in network connectivity between strains as well as profound differences in regulatory relationships. Our approach uniquely enables genome-wide fitness comparisons across strains, facilitating the discovery that antibiotic responses are complex events that can vary widely between strains, which suggests that in some cases the emergence of resistance could be strain specific and at least for species with a large pan-genome less predictable. PMID:27607357

  5. Cloning human herpes virus 6A genome into bacterial artificial chromosomes and study of DNA replication intermediates

    PubMed Central

    Borenstein, Ronen; Frenkel, Niza

    2009-01-01

    Cloning of large viral genomes into bacterial artificial chromosomes (BACs) facilitates analyses of viral functions and molecular mutagenesis. Previous derivations of viral BACs involved laborious recombinations within infected cells. We describe a single-step production of viral BACs by direct cloning of unit length genomes, derived from circular or head-to-tail concatemeric DNA replication intermediates. The BAC cloning is independent of intracellular recombinations and DNA packaging constraints. We introduced the 160-kb human herpes virus 6A (HHV-6A) genome into BACs by digesting the viral DNA replicative intermediates with the SfiI enzyme, which cleaves the viral genome at a single site. The recombinant BACs also contained the puromycin selection gene, GFP, and LoxP sites flanking the BAC sequences. The HHV-6A-BAC vectors were retained stably in puromycin selected 293T cells. In the presence of irradiated helper virus, most likely supplying proteins that enhance gene expression, they expressed early and late genes in SupT1 T cells. The method is especially attractive for viruses that replicate inefficiently and for viruses propagated in suspension cells. We have used the fact that the BAC cloning “freezes” the viral DNA replication intermediates to analyze their structure. The results revealed that HHV-6A-BACs contained a single direct repeat (DR) rather than a DR-DR sequence, predicted to arise by circularization of parental genomes with a DR at each terminus. HHV-6A DNA molecules prepared from the infected cells also contained DNA molecules with a single DR. Such forms were not previously described for HHV-6 DNA. PMID:19858479

  6. Direct-to-consumer genomics on the scales of autonomy.

    PubMed

    Vayena, Effy

    2015-04-01

    Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the 'harm' arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers' independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions. PMID:24797610

  7. Mapping copy number variation by population-scale genome sequencing.

    PubMed

    Mills, Ryan E; Walter, Klaudia; Stewart, Chip; Handsaker, Robert E; Chen, Ken; Alkan, Can; Abyzov, Alexej; Yoon, Seungtai Chris; Ye, Kai; Cheetham, R Keira; Chinwalla, Asif; Conrad, Donald F; Fu, Yutao; Grubert, Fabian; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Iakoucheva, Lilia M; Iqbal, Zamin; Kang, Shuli; Kidd, Jeffrey M; Konkel, Miriam K; Korn, Joshua; Khurana, Ekta; Kural, Deniz; Lam, Hugo Y K; Leng, Jing; Li, Ruiqiang; Li, Yingrui; Lin, Chang-Yun; Luo, Ruibang; Mu, Xinmeng Jasmine; Nemesh, James; Peckham, Heather E; Rausch, Tobias; Scally, Aylwyn; Shi, Xinghua; Stromberg, Michael P; Stütz, Adrian M; Urban, Alexander Eckehart; Walker, Jerilyn A; Wu, Jiantao; Zhang, Yujun; Zhang, Zhengdong D; Batzer, Mark A; Ding, Li; Marth, Gabor T; McVean, Gil; Sebat, Jonathan; Snyder, Michael; Wang, Jun; Ye, Kenny; Eichler, Evan E; Gerstein, Mark B; Hurles, Matthew E; Lee, Charles; McCarroll, Steven A; Korbel, Jan O

    2011-02-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serves as a resource for sequencing-based association studies. PMID:21293372

  8. Draft Genome Sequence of Erwinia tracheiphila, an Economically Important Bacterial Pathogen of Cucurbits.

    PubMed

    Shapiro, Lori R; Scully, Erin D; Roberts, Dana; Straub, Timothy J; Geib, Scott M; Park, Jihye; Stephenson, Andrew G; Salaau Rojas, Erika; Liu, Quin; Beattie, Gwyn; Gleason, Mark; De Moraes, Consuelo M; Mescher, Mark C; Fleischer, Shelby G; Kolter, Roberto; Pierce, Naomi; Zhaxybayeva, Olga

    2015-01-01

    Erwinia tracheiphila is one of the most economically important pathogens of cucumbers, melons, squashes, pumpkins, and gourds in the northeastern and midwestern United States, yet its molecular pathology remains uninvestigated. Here, we report the first draft genome sequence of an E. tracheiphila strain isolated from an infected wild gourd (Cucurbita pepo subsp. texana) plant. The genome assembly consists of 7 contigs and includes a putative plasmid and at least 20 phage and prophage elements. PMID:26044415

  9. Genome Sequence of Acidovorax citrulli Group 1 Strain pslb65 Causing Bacterial Fruit Blotch of Melons

    PubMed Central

    Wang, Tielin; Sun, Baixin; Yang, Yuwen

    2015-01-01

    Acidovorax citrulli is typed into two groups, mainly based on the host. We determined the draft genome of A. citrulli group 1 strain pslb65. The strain was isolated from melon collected from Xinjiang province, China. The A. citrulli pslb65 genome contains 4,903,443 bp and has a G+C content of 68.8 mol%. PMID:25908136

  10. Draft Genome Sequence of Erwinia tracheiphila, an Economically Important Bacterial Pathogen of Cucurbits

    PubMed Central

    Scully, Erin D.; Roberts, Dana; Straub, Timothy J.; Geib, Scott M.; Park, Jihye; Stephenson, Andrew G.; Salaau Rojas, Erika; Liu, Quin; Beattie, Gwyn; Gleason, Mark; De Moraes, Consuelo M.; Mescher, Mark C.; Fleischer, Shelby G.; Kolter, Roberto; Pierce, Naomi; Zhaxybayeva, Olga

    2015-01-01

    Erwinia tracheiphila is one of the most economically important pathogens of cucumbers, melons, squashes, pumpkins, and gourds in the northeastern and midwestern United States, yet its molecular pathology remains uninvestigated. Here, we report the first draft genome sequence of an E. tracheiphila strain isolated from an infected wild gourd (Cucurbita pepo subsp. texana) plant. The genome assembly consists of 7 contigs and includes a putative plasmid and at least 20 phage and prophage elements. PMID:26044415

  11. Genomic epidemiology and global diversity of the emerging bacterial pathogen Elizabethkingia anophelis.

    PubMed

    Breurec, Sebastien; Criscuolo, Alexis; Diancourt, Laure; Rendueles, Olaya; Vandenbogaert, Mathias; Passet, Virginie; Caro, Valérie; Rocha, Eduardo P C; Touchon, Marie; Brisse, Sylvain

    2016-01-01

    Elizabethkingia anophelis is an emerging pathogen involved in human infections and outbreaks in distinct world regions. We investigated the phylogenetic relationships and pathogenesis-associated genomic features of two neonatal meningitis isolates isolated 5 years apart from one hospital in Central African Republic and compared them with Elizabethkingia from other regions and sources. Average nucleotide identity firmly confirmed that E. anophelis, E. meningoseptica and E. miricola represent demarcated genomic species. A core genome multilocus sequence typing scheme, broadly applicable to Elizabethkingia species, was developed and made publicly available (http://bigsdb.pasteur.fr/elizabethkingia). Phylogenetic analysis revealed distinct E. anophelis sublineages and demonstrated high genetic relatedness between the African isolates, compatible with persistence of the strain in the hospital environment. CRISPR spacer variation between the African isolates was mirrored by the presence of a large mobile genetic element. The pan-genome of E. anophelis comprised 6,880 gene families, underlining genomic heterogeneity of this species. African isolates carried unique resistance genes acquired by horizontal transfer. We demonstrated the presence of extensive variation of the capsular polysaccharide synthesis gene cluster in E. anophelis. Our results demonstrate the dynamic evolution of this emerging pathogen and the power of genomic approaches for Elizabethkingia identification, population biology and epidemiology. PMID:27461509

  12. Comparative genomics of the bacterial genus Streptococcus illuminates evolutionary implications of species groups.

    PubMed

    Gao, Xiao-Yang; Zhi, Xiao-Yang; Li, Hong-Wei; Klenk, Hans-Peter; Li, Wen-Jun

    2014-01-01

    Members of the genus Streptococcus within the phylum Firmicutes are among the most diverse and significant zoonotic pathogens. This genus has gone through considerable taxonomic revision due to increasing improvements of chemotaxonomic approaches, DNA hybridization and 16S rRNA gene sequencing. It is proposed to place the majority of streptococci into "species groups". However, the evolutionary implications of species groups are not clear presently. We use comparative genomic approaches to yield a better understanding of the evolution of Streptococcus through genome dynamics, population structure, phylogenies and virulence factor distribution of species groups. Genome dynamics analyses indicate that the pan-genome size increases with the addition of newly sequenced strains, while the core genome size decreases with sequential addition at the genus level and species group level. Population structure analysis reveals two distinct lineages, one including Pyogenic, Bovis, Mutans and Salivarius groups, and the other including Mitis, Anginosus and Unknown groups. Phylogenetic dendrograms show that species within the same species group cluster together, and infer two main clades in accordance with population structure analysis. Distribution of streptococcal virulence factors has no obvious patterns among the species groups; however, the evolution of some common virulence factors is congruous with the evolution of species groups, according to phylogenetic inference. We suggest that the proposed streptococcal species groups are reasonable from the viewpoints of comparative genomics; evolution of the genus is congruent with the individual evolutionary trajectories of different species groups. PMID:24977706

  13. Comparative Genomics of the Bacterial Genus Streptococcus Illuminates Evolutionary Implications of Species Groups

    PubMed Central

    Gao, Xiao-Yang; Zhi, Xiao-Yang; Li, Hong-Wei; Klenk, Hans-Peter; Li, Wen-Jun

    2014-01-01

    Members of the genus Streptococcus within the phylum Firmicutes are among the most diverse and significant zoonotic pathogens. This genus has gone through considerable taxonomic revision due to increasing improvements of chemotaxonomic approaches, DNA hybridization and 16S rRNA gene sequencing. It is proposed to place the majority of streptococci into “species groups”. However, the evolutionary implications of species groups are not clear presently. We use comparative genomic approaches to yield a better understanding of the evolution of Streptococcus through genome dynamics, population structure, phylogenies and virulence factor distribution of species groups. Genome dynamics analyses indicate that the pan-genome size increases with the addition of newly sequenced strains, while the core genome size decreases with sequential addition at the genus level and species group level. Population structure analysis reveals two distinct lineages, one including Pyogenic, Bovis, Mutans and Salivarius groups, and the other including Mitis, Anginosus and Unknown groups. Phylogenetic dendrograms show that species within the same species group cluster together, and infer two main clades in accordance with population structure analysis. Distribution of streptococcal virulence factors has no obvious patterns among the species groups; however, the evolution of some common virulence factors is congruous with the evolution of species groups, according to phylogenetic inference. We suggest that the proposed streptococcal species groups are reasonable from the viewpoints of comparative genomics; evolution of the genus is congruent with the individual evolutionary trajectories of different species groups. PMID:24977706

  14. Genomic epidemiology and global diversity of the emerging bacterial pathogen Elizabethkingia anophelis

    PubMed Central

    Breurec, Sebastien; Criscuolo, Alexis; Diancourt, Laure; Rendueles, Olaya; Vandenbogaert, Mathias; Passet, Virginie; Caro, Valérie; Rocha, Eduardo P. C.; Touchon, Marie; Brisse, Sylvain

    2016-01-01

    Elizabethkingia anophelis is an emerging pathogen involved in human infections and outbreaks in distinct world regions. We investigated the phylogenetic relationships and pathogenesis-associated genomic features of two neonatal meningitis isolates isolated 5 years apart from one hospital in Central African Republic and compared them with Elizabethkingia from other regions and sources. Average nucleotide identity firmly confirmed that E. anophelis, E. meningoseptica and E. miricola represent demarcated genomic species. A core genome multilocus sequence typing scheme, broadly applicable to Elizabethkingia species, was developed and made publicly available (http://bigsdb.pasteur.fr/elizabethkingia). Phylogenetic analysis revealed distinct E. anophelis sublineages and demonstrated high genetic relatedness between the African isolates, compatible with persistence of the strain in the hospital environment. CRISPR spacer variation between the African isolates was mirrored by the presence of a large mobile genetic element. The pan-genome of E. anophelis comprised 6,880 gene families, underlining genomic heterogeneity of this species. African isolates carried unique resistance genes acquired by horizontal transfer. We demonstrated the presence of extensive variation of the capsular polysaccharide synthesis gene cluster in E. anophelis. Our results demonstrate the dynamic evolution of this emerging pathogen and the power of genomic approaches for Elizabethkingia identification, population biology and epidemiology. PMID:27461509

  15. Survey of chimeric IStron elements in bacterial genomes: multiple molecular symbioses between group I intron ribozymes and DNA transposons

    PubMed Central

    Tourasse, Nicolas J.; Stabell, Fredrik B.; Kolstø, Anne-Brit

    2014-01-01

    IStrons are chimeric genetic elements composed of a group I intron associated with an insertion sequence (IS). The group I intron is a catalytic RNA providing the IStron with self-splicing ability, which renders IStron insertions harmless to the host genome. The IS element is a DNA transposon conferring mobility, and thus allowing the IStron to spread in genomes. IStrons are therefore a striking example of a molecular symbiosis between unrelated genetic elements endowed with different functions. In this study, we have conducted the first comprehensive survey of IStrons in sequenced genomes that provides insights into the distribution, diversity, origin and evolution of IStrons. We show that IStrons have a restricted phylogenetic distribution limited to two bacterial phyla, the Firmicutes and the Fusobacteria. Nevertheless, diverse IStrons representing two major groups targeting different insertion site motifs were identified. This taken with the finding that while the intron components of all IStrons belong to the same structural class, they are fused to different IS families, indicates that multiple intron–IS symbioses have occurred during evolution. In addition, introns and IS elements related to those that were at the origin of IStrons were also identified. PMID:25324310

  16. Analysis of five complete genome sequences for members of the class Peribacteria in the recently recognized Peregrinibacteria bacterial phylum

    DOE PAGES

    Anantharaman, Karthik; Brown, Christopher T.; Burstein, David; Castelle, Cindy J.; Probst, Alexander J.; Thomas, Brian C.; Williams, Kenneth H.; Banfield, Jillian F.

    2016-01-28

    Five closely related populations of bacteria from the Candidate Phylum (CP) Peregrinibacteria, part of the bacterial Candidate Phyla Radiation (CPR), were sampled from filtered groundwater obtained from an aquifer adjacent to the Colorado River near the town of Rifle, CO, USA. Here, we present the first complete genome sequences for organisms from this phylum. These bacteria have small genomes and, unlike most organisms from other lineages in the CPR, have the capacity for nucleotide synthesis. They invest significantly in biosynthesis of cell wall and cell envelope components, including peptidoglycan, isoprenoids via the mevalonate pathway, and a variety of amino sugars including perosamine and rhamnose. The genomes encode an intriguing set of large extracellular proteins, some of which are very cysteine-rich and may function in attachment, possibly to other cells. Strain variation in these proteins is an important source of genotypic variety. Overall, the cell envelope features, combined with the lack of biosynthesis capacities for many required cofactors, fatty acids, and most amino acids point to a symbiotic lifestyle. Furthermore, phylogenetic analyses indicate that these bacteria likely represent a new class within the Peregrinibacteria phylum, although they ultimately may be recognized as members of a separate phylum. In conclusion, we propose the provisional taxonomic assignment as ‘Candidatus Peribacter riflensis’, Genus Peribacter, Family Peribacteraceae, Order Peribacterales, Class Peribacteria in the phylum Peregrinibacteria.

  17. Amoebal Endosymbiont Neochlamydia Genome Sequence Illuminates the Bacterial Role in the Defense of the Host Amoebae against Legionella pneumophila

    PubMed Central

    Ishida, Kasumi; Sekizuka, Tsuyoshi; Hayashida, Kyoko; Matsuo, Junji; Takeuchi, Fumihiko; Kuroda, Makoto; Nakamura, Shinji; Yamazaki, Tomohiro; Yoshida, Mitsutaka; Takahashi, Kaori; Nagai, Hiroki; Sugimoto, Chihiro; Yamaguchi, Hiroyuki

    2014-01-01

    Previous work has shown that the obligate intracellular amoebal endosymbiont Neochlamydia S13, an environmental chlamydia strain, has an amoebal infection rate of 100%, but does not cause amoebal lysis and lacks transferability to other host amoebae. The underlying mechanism for these observations remains unknown. In this study, we found that the host amoeba could completely evade Legionella infection. The draft genome sequence of Neochlamydia S13 revealed several defects in essential metabolic pathways, as well as unique molecules with leucine-rich repeats (LRRs) and ankyrin domains, responsible for protein-protein interaction. Neochlamydia S13 lacked an intact tricarboxylic acid cycle and had an incomplete respiratory chain. ADP/ATP translocases, ATP-binding cassette transporters, and secretion systems (types II and III) were well conserved, but no type IV secretion system was found. The number of outer membrane proteins (OmcB, PomS, 76-kDa protein, and OmpW) was limited. Interestingly, genes predicting unique proteins with LRRs (30 genes) or ankyrin domains (one gene) were identified. Furthermore, 33 transposases were found, possibly explaining the drastic genome modification. Taken together, the genomic features of Neochlamydia S13 explain the intimate interaction with the host amoeba to compensate for bacterial metabolic defects, and illuminate the role of the endosymbiont in the defense of the host amoebae against Legionella infection. PMID:24747986

  18. Analysis of five complete genome sequences for members of the class Peribacteria in the recently recognized Peregrinibacteria bacterial phylum

    PubMed Central

    Anantharaman, Karthik; Burstein, David; Castelle, Cindy J.; Probst, Alexander J.; Thomas, Brian C.; Williams, Kenneth H.

    2016-01-01

    Five closely related populations of bacteria from the Candidate Phylum (CP) Peregrinibacteria, part of the bacterial Candidate Phyla Radiation (CPR), were sampled from filtered groundwater obtained from an aquifer adjacent to the Colorado River near the town of Rifle, CO, USA. Here, we present the first complete genome sequences for organisms from this phylum. These bacteria have small genomes and, unlike most organisms from other lineages in the CPR, have the capacity for nucleotide synthesis. They invest significantly in biosynthesis of cell wall and cell envelope components, including peptidoglycan, isoprenoids via the mevalonate pathway, and a variety of amino sugars including perosamine and rhamnose. The genomes encode an intriguing set of large extracellular proteins, some of which are very cysteine-rich and may function in attachment, possibly to other cells. Strain variation in these proteins is an important source of genotypic variety. Overall, the cell envelope features, combined with the lack of biosynthesis capacities for many required cofactors, fatty acids, and most amino acids point to a symbiotic lifestyle. Phylogenetic analyses indicate that these bacteria likely represent a new class within the Peregrinibacteria phylum, although they ultimately may be recognized as members of a separate phylum. We propose the provisional taxonomic assignment as ‘Candidatus Peribacter riflensis’, Genus Peribacter, Family Peribacteraceae, Order Peribacterales, Class Peribacteria in the phylum Peregrinibacteria. PMID:26844018

  19. Analysis of five complete genome sequences for members of the class Peribacteria in the recently recognized Peregrinibacteria bacterial phylum.

    PubMed

    Anantharaman, Karthik; Brown, Christopher T; Burstein, David; Castelle, Cindy J; Probst, Alexander J; Thomas, Brian C; Williams, Kenneth H; Banfield, Jillian F

    2016-01-01

    Five closely related populations of bacteria from the Candidate Phylum (CP) Peregrinibacteria, part of the bacterial Candidate Phyla Radiation (CPR), were sampled from filtered groundwater obtained from an aquifer adjacent to the Colorado River near the town of Rifle, CO, USA. Here, we present the first complete genome sequences for organisms from this phylum. These bacteria have small genomes and, unlike most organisms from other lineages in the CPR, have the capacity for nucleotide synthesis. They invest significantly in biosynthesis of cell wall and cell envelope components, including peptidoglycan, isoprenoids via the mevalonate pathway, and a variety of amino sugars including perosamine and rhamnose. The genomes encode an intriguing set of large extracellular proteins, some of which are very cysteine-rich and may function in attachment, possibly to other cells. Strain variation in these proteins is an important source of genotypic variety. Overall, the cell envelope features, combined with the lack of biosynthesis capacities for many required cofactors, fatty acids, and most amino acids point to a symbiotic lifestyle. Phylogenetic analyses indicate that these bacteria likely represent a new class within the Peregrinibacteria phylum, although they ultimately may be recognized as members of a separate phylum. We propose the provisional taxonomic assignment as 'Candidatus Peribacter riflensis', Genus Peribacter, Family Peribacteraceae, Order Peribacterales, Class Peribacteria in the phylum Peregrinibacteria. PMID:26844018

  20. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium

    PubMed Central

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in the two-step fermentation of vitamin C. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequence of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomic analysis. B. thuringiensis Bc601, which responds more strongly to the external environment, was found to have more proteins related to two-component systems, the sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. B. endophyticus Hbe603, which has a greater capacity for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. These differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms in the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis helps us to further understand their cooperative mechanism with K. vulgare and facilitates optimization of the bacterial consortium. PMID:27353048

  1. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium.

    PubMed

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in the two-step fermentation of vitamin C. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequence of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomic analysis. B. thuringiensis Bc601, which responds more strongly to the external environment, was found to have more proteins related to two-component systems, the sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. B. endophyticus Hbe603, which has a greater capacity for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. These differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms in the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis helps us to further understand their cooperative mechanism with K. vulgare and facilitates optimization of the bacterial consortium. PMID:27353048

  2. A simple model for DNA bridging proteins and bacterial or human genomes: bridging-induced attraction and genome compaction

    NASA Astrophysics Data System (ADS)

    Johnson, J.; Brackley, C. A.; Cook, P. R.; Marenduzzo, D.

    2015-02-01

    We present computer simulations of the phase behaviour of an ensemble of proteins interacting with a polymer, mimicking non-specific binding to a piece of bacterial DNA or eukaryotic chromatin. The proteins can simultaneously bind to the polymer in two or more places to create protein bridges. Despite the lack of any explicit interaction between the proteins or between DNA segments, our simulations confirm previous results showing that when the protein-polymer interaction is sufficiently strong, the proteins come together to form clusters. Furthermore, a sufficiently large concentration of bridging proteins leads to the compaction of the swollen polymer into a globular phase. Here we characterise both the formation of protein clusters and the polymer collapse as a function of protein concentration, protein-polymer affinity and fibre flexibility.

  3. Large-scale genomic comparison using two-dimensional DNA gels

    SciTech Connect

    Sidman, C.L.; Shaffer, D.J.

    1994-09-01

    Two-dimensional electrophoresis (2DE) of DNA fragments, in which separation occurs first by size and then by sequence variation, is a method enabling large-scale comparison of complex genomes. Combining 2DE with probing for various classes of repetitive genomic elements allows rapid and efficient comparison of thousands of fragments and millions of basepairs of DNA distributed across most genomic regions. This approach is demonstrated here by analyzing the extent of genomic relatedness of different inbred strains of mice. Such strains are shown to differ from each other by approximately 0.2-1% of their nucleotides, above which level reproductive speciation occurs. The 2DE method of assessing the overall relationship between two genomes represents an appropriate tool for analyzing members of a single species, but is too sensitive for use in interspecies comparisons. 51 refs., 4 figs., 1 tab.

  4. Phylogeny-driven target selection for large-scale genome-sequencing (and other) projects

    PubMed Central

    Göker, Markus; Klenk, Hans-Peter

    2013-01-01

    Despite the steadily decreasing costs of genome sequencing, prioritizing organisms for sequencing remains important in large-scale projects. Phylogeny-based selection is of interest to identify those organisms whose genomes can be expected to differ most from those that have already been sequenced. Here, we describe a method that infers a phylogenetic scoring independent of which set of organisms has previously been targeted, which is computationally simple and easy to apply in practice. The scoring itself, as well as pre- and post-processing of the data, is illustrated using two real-world examples in which the method has already been applied for selecting targets for genome sequencing. These projects are the JGI CSP Genomic Encyclopedia of Bacteria and Archaea phase I, targeting 1,000 type strains, and, on a smaller-scale, the phylogenomics of the Roseobacter clade. Potential artifacts of the method are discussed and compared to a selection approach based on the taxonomic classification. PMID:23991265

  5. Ensembl Genomes 2016: more genomes, more complexity.

    PubMed

    Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

    2016-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. PMID:26578574

  6. Ensembl Genomes 2016: more genomes, more complexity

    PubMed Central

    Kersey, Paul Julian; Allen, James E.; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J.; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J.; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K.; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D.; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M.; Howe, Kevin L.; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M.

    2016-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. PMID:26578574

  7. Bacterial community structure and variation in a full-scale seawater desalination plant for drinking water production.

    PubMed

    Belila, A; El-Chakhtoura, J; Otaibi, N; Muyzer, G; Gonzalez-Gil, G; Saikaly, P E; van Loosdrecht, M C M; Vrouwenvelder, J S

    2016-05-01

    Microbial processes inevitably play a role in membrane-based desalination plants, mainly recognized as membrane biofouling. We assessed the bacterial community structure and diversity during different treatment steps in a full-scale seawater desalination plant producing 40,000 m³/d of drinking water. Water samples were taken over the full treatment train consisting of chlorination, spruce media and cartridge filters, de-chlorination, first and second pass reverse osmosis (RO) membranes and final chlorine dosage for drinking water distribution. The water samples were analyzed for water quality parameters (total bacterial cell number, total organic carbon, conductivity, pH, etc.) and microbial community composition by 16S rRNA gene pyrosequencing. The planktonic microbial community was dominated by Proteobacteria (48.6%) followed by Bacteroidetes (15%), Firmicutes (9.3%) and Cyanobacteria (4.9%). During the pretreatment step, the spruce media filter did not impact the bacterial community composition dominated by Proteobacteria. In contrast, the RO and final chlorination treatment steps reduced the Proteobacterial relative abundance in the produced water where Firmicutes constituted the most dominant bacterial group. Shannon and Chao1 diversity indices showed that bacterial species richness and diversity decreased during the seawater desalination process. The two-stage RO filtration strongly reduced the water conductivity (>99%), TOC concentration (98.5%) and total bacterial cell number (>99%), albeit some bacterial DNA was found in the water after RO filtration. About 0.25% of the total bacterial operational taxonomic units (OTUs) were present in all stages of the desalination plant: the seawater, the RO permeates and the chlorinated drinking water, suggesting that these bacterial strains can survive in different environments such as high/low salt concentration and with/without residual disinfectant. These bacterial strains were not caused by contamination during
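
    The Shannon and Chao1 indices mentioned in this record have standard textbook definitions, and a minimal sketch of both may help readers interpret the reported decrease in diversity. The OTU counts below are hypothetical illustration values, not data from the study, and the function names are assumptions made for this example only.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one nonzero count is required")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness estimate from singleton (F1) and doubleton (F2) counts.

    Uses S_obs + F1^2 / (2 * F2); when there are no doubletons it falls back
    to the bias-corrected form S_obs + F1 * (F1 - 1) / 2.
    """
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 == 0:
        return observed + f1 * (f1 - 1) / 2.0
    return observed + (f1 * f1) / (2.0 * f2)


# Hypothetical OTU read counts for two sampling points (illustration only).
seawater_counts = [50, 30, 10, 5, 2, 2, 1, 1, 1]
permeate_counts = [90, 8, 1, 1]

for label, counts in (("seawater", seawater_counts), ("RO permeate", permeate_counts)):
    print(f"{label}: H' = {shannon_index(counts):.3f}, Chao1 = {chao1(counts):.1f}")
```

    Shannon reflects both richness and evenness, while Chao1 estimates total richness by extrapolating from rare OTUs, so the two can respond differently to the same treatment step.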

  8. Comparative Genomics between Two Xenorhabdus bovienii Strains Highlights Differential Evolutionary Scenarios within an Entomopathogenic Bacterial Species

    PubMed Central

    Bisch, Gaëlle; Ogier, Jean-Claude; Médigue, Claudine; Rouy, Zoé; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie

    2016-01-01

    Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Within Xenorhabdus bovienii species, the X. bovienii CS03 strain (Xb CS03) is nonvirulent when directly injected into lepidopteran insects, and displays a low virulence when associated with its Steinernema symbiont. The genome of Xb CS03 was sequenced and compared with the genome of a virulent strain, X. bovienii SS-2004 (Xb SS-2004). The genome size and content widely differed between the two strains. Indeed, Xb CS03 had a large genome containing several specific loci involved in the inhibition of competitors, including a few NRPS-PKS loci (nonribosomal peptide synthetases and polyketide synthases) producing antimicrobial molecules. Consistently, Xb CS03 had a greater antimicrobial activity than Xb SS-2004. The Xb CS03 strain contained more pseudogenes than Xb SS-2004. Decay of genes involved in the host invasion and exploitation (toxins, invasins, or extracellular enzymes) was particularly important in Xb CS03. This may provide an explanation for the nonvirulence of the strain when injected into an insect host. We suggest that Xb CS03 and Xb SS-2004 followed divergent evolutionary scenarios to cope with their peculiar life cycle. The fitness strategy of Xb CS03 would involve competitor inhibition, whereas Xb SS-2004 would quickly and efficiently kill the insect host. Hence, Xenorhabdus strains would have widely divergent host exploitation strategies, which impact their genome structure. PMID:26769959

  9. Adaptation in Toxic Environments: Arsenic Genomic Islands in the Bacterial Genus Thiomonas

    PubMed Central

    Freel, Kelle C.; Krueger, Martin C.; Farasin, Julien; Brochier-Armanet, Céline; Barbe, Valérie; Andrès, Jeremy; Cholley, Pierre-Etienne; Dillies, Marie-Agnès; Jagla, Bernd; Koechler, Sandrine; Leva, Yann; Magdelenat, Ghislaine; Plewniak, Frédéric; Proux, Caroline; Coppée, Jean-Yves; Bertin, Philippe N.; Heipieper, Hermann J.; Arsène-Ploetze, Florence

    2015-01-01

    Acid mine drainage (AMD) is a highly toxic environment for most living organisms due to the presence of many lethal elements including arsenic (As). Thiomonas (Tm.) bacteria are found ubiquitously in AMD and can withstand these extreme conditions, in part because they are able to oxidize arsenite. In order to further improve our knowledge concerning the adaptive capacities of these bacteria, we sequenced and assembled the genome of six isolates derived from the Carnoulès AMD, and compared them to the genomes of Tm. arsenitoxydans 3As (isolated from the same site) and Tm. intermedia K12 (isolated from a sewage pipe). A detailed analysis of the Tm. sp. CB2 genome revealed various rearrangements had occurred in comparison to what was observed in 3As and K12 and over 20 genomic islands (GEIs) were found in each of these three genomes. We performed a detailed comparison of the two arsenic-related islands found in CB2, carrying the genes required for arsenite oxidation and As resistance, with those found in K12, 3As, and five other Thiomonas strains also isolated from Carnoulès (CB1, CB3, CB6, ACO3 and ACO7). Our results suggest that these arsenic-related islands have evolved differentially in these closely related Thiomonas strains, leading to divergent capacities to survive in As rich environments. PMID:26422469

  10. Comparative Genomics between Two Xenorhabdus bovienii Strains Highlights Differential Evolutionary Scenarios within an Entomopathogenic Bacterial Species.

    PubMed

    Bisch, Gaëlle; Ogier, Jean-Claude; Médigue, Claudine; Rouy, Zoé; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie

    2016-01-01

    Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Within Xenorhabdus bovienii species, the X. bovienii CS03 strain (Xb CS03) is nonvirulent when directly injected into lepidopteran insects, and displays a low virulence when associated with its Steinernema symbiont. The genome of Xb CS03 was sequenced and compared with the genome of a virulent strain, X. bovienii SS-2004 (Xb SS-2004). The genome size and content widely differed between the two strains. Indeed, Xb CS03 had a large genome containing several specific loci involved in the inhibition of competitors, including a few NRPS-PKS loci (nonribosomal peptide synthetases and polyketide synthases) producing antimicrobial molecules. Consistently, Xb CS03 had a greater antimicrobial activity than Xb SS-2004. The Xb CS03 strain contained more pseudogenes than Xb SS-2004. Decay of genes involved in the host invasion and exploitation (toxins, invasins, or extracellular enzymes) was particularly important in Xb CS03. This may provide an explanation for the nonvirulence of the strain when injected into an insect host. We suggest that Xb CS03 and Xb SS-2004 followed divergent evolutionary scenarios to cope with their peculiar life cycle. The fitness strategy of Xb CS03 would involve competitor inhibition, whereas Xb SS-2004 would quickly and efficiently kill the insect host. Hence, Xenorhabdus strains would have widely divergent host exploitation strategies, which impact their genome structure. PMID:26769959

  11. Implications of Genome-Based Discrimination between Clostridium botulinum Group I and Clostridium sporogenes Strains for Bacterial Taxonomy

    PubMed Central

    Weigand, Michael R.; Pena-Gonzalez, Angela; Shirey, Timothy B.; Broeker, Robin G.; Ishaq, Maliha K.; Konstantinidis, Konstantinos T.

    2015-01-01

    Taxonomic classification of Clostridium botulinum is based on the production of botulinum neurotoxin (BoNT), while closely related, nontoxic organisms are classified as Clostridium sporogenes. However, this taxonomic organization does not accurately mirror phylogenetic relationships between these species. A phylogenetic reconstruction using 2,016 orthologous genes shared among strains of C. botulinum group I and C. sporogenes clearly separated these two species into discrete clades which showed ∼93% average nucleotide identity (ANI) between them. Clustering of strains based on the presence of variable orthologs revealed 143 C. sporogenes clade-specific genetic signatures, a subset of which were further evaluated for their ability to correctly classify a panel of presumptive C. sporogenes strains by PCR. Genome sequencing of several C. sporogenes strains lacking these signatures confirmed that they clustered with C. botulinum strains in a core genome phylogenetic tree. Our analysis also identified C. botulinum strains that contained C. sporogenes clade-specific signatures and phylogenetically clustered with C. sporogenes strains. The genome sequences of two bont/B2-containing strains belonging to the C. sporogenes clade contained regions with similarity to a bont-bearing plasmid (pCLD), while two different strains belonging to the C. botulinum clade carried bont/B2 on the chromosome. These results indicate that bont/B2 was likely acquired by C. sporogenes strains through horizontal gene transfer. The genome-based classification of these species used to identify candidate genes for the development of rapid assays for molecular identification may be applicable to additional bacterial species that are challenging with respect to their classification. PMID:26048939
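
    For readers unfamiliar with average nucleotide identity (ANI), the following is a simplified sketch of the idea: the percentage of identical bases across aligned regions shared by two genomes. It assumes the fragments have already been aligned (real ANI tools build these alignments themselves and apply additional filters), uses a length-weighted average, and runs on made-up sequences; the function names are hypothetical and this is not the pipeline used in the study.

```python
def fragment_identity(seq_a, seq_b):
    """Return (matches, compared) for two aligned sequences, skipping gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore columns where either genome has a gap
        compared += 1
        if base_a == base_b:
            matches += 1
    return matches, compared


def average_nucleotide_identity(aligned_pairs):
    """Length-weighted percent identity over all aligned fragment pairs."""
    total_matches = 0
    total_compared = 0
    for seq_a, seq_b in aligned_pairs:
        matches, compared = fragment_identity(seq_a, seq_b)
        total_matches += matches
        total_compared += compared
    if total_compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * total_matches / total_compared


# Toy aligned fragments (assumed input, for illustration only).
toy_pairs = [
    ("ACGTACGTAC", "ACGTACGTAA"),
    ("GG-TTACCGA", "GGATTACCGT"),
]
print(f"ANI = {average_nucleotide_identity(toy_pairs):.2f}%")
```

    An ANI of roughly 95% is often used as a rule-of-thumb species boundary, which is why a value near 93% between the two clades supports treating them as separate species.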

  12. Differential Genome Evolution Between Companion Symbionts in an Insect-Bacterial Symbiosis

    PubMed Central

    McCutcheon, John P.; MacDonald, Bradon R.; Romanovicz, Dwight; Moran, Nancy A.

    2014-01-01

    Obligate symbioses with bacteria allow insects to feed on otherwise unsuitable diets. Some symbionts have extremely reduced genomes and have lost many genes considered to be essential in other bacteria. To understand how symbiont genome degeneration proceeds, we compared the genomes of symbionts in two leafhopper species, Homalodisca vitripennis (glassy-winged sharpshooter [GWSS]) and Graphocephala atropunctata (blue-green sharpshooter [BGSS]) (Hemiptera: Cicadellidae). Each host species is associated with the anciently acquired “Candidatus Sulcia muelleri” (Bacteroidetes) and the more recently acquired “Candidatus Baumannia cicadellinicola” (Gammaproteobacteria). BGSS “Ca. Baumannia” retains 89 genes that are absent from GWSS “Ca. Baumannia”; these underlie central cellular functions, including cell envelope biogenesis, cellular replication, and stress response. In contrast, “Ca. Sulcia” strains differ by only a few genes. Although GWSS “Ca. Baumannia” cells are spherical or pleomorphic (a convergent trait of obligate symbionts), electron microscopy reveals that BGSS “Ca. Baumannia” maintains a rod shape, possibly due to its retention of genes involved in cell envelope biogenesis and integrity. Phylogenomic results suggest that “Ca. Baumannia” is derived from the clade consisting of Sodalis and relatives, a group that has evolved symbiotic associations with numerous insect hosts. Finally, the rates of synonymous and nonsynonymous substitutions are higher in “Ca. Baumannia” than in “Ca. Sulcia,” which may be due to a lower mutation rate in the latter. Taken together, our results suggest that the two “Ca. Baumannia” genomes represent different stages of genome reduction in which many essential functions are being lost and likely compensated by hosts. “Ca. Sulcia” exhibits much greater genome stability and slower sequence evolution, although the mechanisms underlying these differences are poorly understood

  13. A FISH approach for mapping the human genome using Bacterial Artificial Chromosomes (BACs)

    SciTech Connect

    Hubert, R.S.; Chen, X.N.; Mitchell, S.

    1994-09-01

    As the Human Genome Project progresses, large insert cloning vectors such as BACs, P1, and P1 Artificial Chromosomes (PACs) will be required to complement the YAC mapping efforts. The value of the BAC vector for physical mapping lies in the stability of the inserts, the lack of chimerism, the length of inserts (up to 300 kb), the ability to obtain large amounts of pure clone DNA and the ease of BAC manipulation. These features helped us design two approaches for generating physical mapping reagents for human genetic studies. The first approach is a whole genome strategy in which randomly selected BACs are mapped, using FISH, to specific chromosomal bands. To date, 700 BACs have been mapped to single chromosome bands at a resolution of 2-5 Mb in addition to BACs mapped to 14 different centromeres. These BACs represent more than 90 Mb of the genome and include >70% of all human chromosome bands at the 350-band level. These data revealed that >97% of the BACs were non-chimeric and have a genomic distribution covering most gaps in the existing YAC map with excellent coverage of gene-rich regions. In the second approach, we used YACs to identify BACs on chromosome 21. A 1.5 Mb contig between D21S339 and D21S220 nears completion within the Down syndrome congenital heart disease (DS-CHD) region. Seventeen BACs ranging in size from 80 kb to 240 kb were ordered using 14 STSs with FISH confirmation. We have also used 40 YACs spanning 21q to identify, on average, >1 BAC/Mb to provide molecular cytogenetic reagents and anchor points for further mapping. The contig generated on chromosome 21 will be helpful in isolating the genes for DS-CHD. The physical mapping reagents generated using the whole genome approach will provide cytogenetic markers and mapped genomic fragments that will facilitate positional cloning efforts and the identification of genes within most chromosomal bands.

  14. Genome-scale Mapping of DNaseI Hypersensitivity

    PubMed Central

    John, Sam; Sabo, Peter J.; Canfield, Theresa K.; Lee, Kristen; Vong, Shinny; Weaver, Molly; Wang, Hao; Vierstra, Jeff; Reynolds, Alex P.; Thurman, Robert E.; Stamatoyannopoulos, John A.

    2014-01-01

    DNaseI-seq is a global and high-resolution method that uses the non-specific endonuclease DNaseI to map chromatin accessibility. These accessible regions, designated as DNaseI hypersensitive sites (DHSs), define the regulatory features (e.g., promoters, enhancers, insulators, locus control regions) of complex genomes. In this unit, we will describe systematic methods for nuclei isolation, digestion of nuclei with limiting concentrations of DNaseI and the biochemical fractionation of DNaseI hypersensitive sites in preparation for high-throughput sequencing. DNaseI-seq is an unbiased and robust method that is not predicated on an a priori understanding of regulatory patterns or chromatin features. PMID:23821440

  15. A Year of Infection in the Intensive Care Unit: Prospective Whole Genome Sequencing of Bacterial Clinical Isolates Reveals Cryptic Transmissions and Novel Microbiota.

    PubMed

    Roach, David J; Burton, Joshua N; Lee, Choli; Stackhouse, Bethany; Butler-Wu, Susan M; Cookson, Brad T; Shendure, Jay; Salipante, Stephen J

    2015-07-01

    Bacterial whole genome sequencing holds promise as a disruptive technology in clinical microbiology, but it has not yet been applied systematically or comprehensively within a clinical context. Here, over the course of one year, we performed prospective collection and whole genome sequencing of nearly all bacterial isolates obtained from a tertiary care hospital's intensive care units (ICUs). This unbiased collection of 1,229 bacterial genomes from 391 patients enables detailed exploration of several features of clinical pathogens. A sizable fraction of isolates identified as clinically relevant corresponded to previously undescribed species: 12% of isolates assigned a species-level classification by conventional methods actually qualified as distinct, novel genomospecies on the basis of genomic similarity. Pan-genome analysis of the most frequently encountered pathogens in the collection revealed substantial variation in pan-genome size (1,420 to 20,432 genes) and the rate of gene discovery (1 to 152 genes per isolate sequenced). Surprisingly, although potential nosocomial transmission of actively surveilled pathogens was rare, 8.7% of isolates belonged to genomically related clonal lineages that were present among multiple patients, usually with overlapping hospital admissions, and were associated with clinically significant infection in 62% of patients from which they were recovered. Multi-patient clonal lineages were particularly evident in the neonatal care unit, where seven separate Staphylococcus epidermidis clonal lineages were identified, including one lineage associated with bacteremia in 5/9 neonates. Our study highlights key differences in the information made available by conventional microbiological practices versus whole genome sequencing, and motivates the further integration of microbial genome sequencing into routine clinical care. PMID:26230489
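
    The pan-genome size and per-isolate gene discovery figures in this record can be pictured with a small accumulation sketch: as each isolate is added, count how many previously unseen gene families it contributes. The gene family identifiers and function name below are hypothetical, and the study's actual clustering and rate estimation methods are not described in this abstract, so this is only an illustration of the general idea.

```python
def pan_genome_curve(isolate_gene_sets):
    """Return (cumulative pan-genome sizes, new gene families per isolate).

    Isolates are processed in the order given; each entry is an iterable of
    gene family identifiers found in that isolate.
    """
    seen = set()
    sizes = []
    new_counts = []
    for genes in isolate_gene_sets:
        new_genes = set(genes) - seen  # families not observed in earlier isolates
        seen |= new_genes
        new_counts.append(len(new_genes))
        sizes.append(len(seen))
    return sizes, new_counts


# Hypothetical gene family content for three isolates of one species.
isolates = [
    {"geneA", "geneB", "geneC"},
    {"geneB", "geneC", "geneD"},
    {"geneA", "geneD"},
]
sizes, new_counts = pan_genome_curve(isolates)
print("pan-genome size after each isolate:", sizes)
print("new gene families per isolate:", new_counts)
```

    The result depends on the order in which isolates are added, so analyses of this kind commonly average the curve over many random orderings before estimating a discovery rate.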

  16. A Year of Infection in the Intensive Care Unit: Prospective Whole Genome Sequencing of Bacterial Clinical Isolates Reveals Cryptic Transmissions and Novel Microbiota

    PubMed Central

    Roach, David J.; Burton, Joshua N.; Lee, Choli; Stackhouse, Bethany; Butler-Wu, Susan M.; Cookson, Brad T.

    2015-01-01

    Bacterial whole genome sequencing holds promise as a disruptive technology in clinical microbiology, but it has not yet been applied systematically or comprehensively within a clinical context. Here, over the course of one year, we performed prospective collection and whole genome sequencing of nearly all bacterial isolates obtained from a tertiary care hospital’s intensive care units (ICUs). This unbiased collection of 1,229 bacterial genomes from 391 patients enables detailed exploration of several features of clinical pathogens. A sizable fraction of isolates identified as clinically relevant corresponded to previously undescribed species: 12% of isolates assigned a species-level classification by conventional methods actually qualified as distinct, novel genomospecies on the basis of genomic similarity. Pan-genome analysis of the most frequently encountered pathogens in the collection revealed substantial variation in pan-genome size (1,420 to 20,432 genes) and the rate of gene discovery (1 to 152 genes per isolate sequenced). Surprisingly, although potential nosocomial transmission of actively surveilled pathogens was rare, 8.7% of isolates belonged to genomically related clonal lineages that were present among multiple patients, usually with overlapping hospital admissions, and were associated with clinically significant infection in 62% of patients from which they were recovered. Multi-patient clonal lineages were particularly evident in the neonatal care unit, where seven separate Staphylococcus epidermidis clonal lineages were identified, including one lineage associated with bacteremia in 5/9 neonates. Our study highlights key differences in the information made available by conventional microbiological practices versus whole genome sequencing, and motivates the further integration of microbial genome sequencing into routine clinical care. PMID:26230489

  17. Rapid prototyping of microbial cell factories via genome-scale engineering.

    PubMed

    Si, Tong; Xiao, Han; Zhao, Huimin

    2015-11-15

    Advances in reading, writing and editing genetic materials have greatly expanded our ability to reprogram biological systems at the resolution of a single nucleotide and on the scale of a whole genome. Such capacity has greatly accelerated the cycles of design, build and test to engineer microbes for efficient synthesis of fuels, chemicals and drugs. In this review, we summarize the emerging technologies that have been applied, or are potentially useful for genome-scale engineering in microbial systems. We will focus on the development of high-throughput methodologies, which may accelerate the prototyping of microbial cell factories. PMID:25450192

  18. Draft genome sequence of Erwinia tracheiphila, an economically important bacterial pathogen of cucurbits

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Erwinia tracheiphila is one of the most economically important pathogens of cucumbers, melons, squashes, pumpkins, and gourds in the Northeastern and Midwestern United States, yet the molecular pathology remains uninvestigated. Here we report the first draft genome sequence of an E. tracheiphila str...

  19. Comparative genomics of bacterial and plant folate synthesis and salvage: predictions and validations

    PubMed Central

    de Crécy-Lagard, Valérie; El Yacoubi, Basma; de la Garza, Rocío Díaz; Noiriel, Alexandre; Hanson, Andrew D

    2007-01-01

    Background Folate synthesis and salvage pathways are relatively well known from classical biochemistry and genetics but they have not been subjected to comparative genomic analysis. The availability of genome sequences from hundreds of diverse bacteria, and from Arabidopsis thaliana, enabled such an analysis using the SEED database and its tools. This study reports the results of the analysis and integrates them with new and existing experimental data. Results Based on sequence similarity and the clustering, fusion, and phylogenetic distribution of genes, several functional predictions emerged from this analysis. For bacteria, these included the existence of novel GTP cyclohydrolase I and folylpolyglutamate synthase gene families, and of a trifunctional p-aminobenzoate synthesis gene. For plants and bacteria, the predictions comprised the identities of a 'missing' folate synthesis gene (folQ) and of a folate transporter, and the absence from plants of a folate salvage enzyme. Genetic and biochemical tests bore out these predictions. Conclusion For bacteria, these results demonstrate that much can be learnt from comparative genomics, even for well-explored primary metabolic pathways. For plants, the findings particularly illustrate the potential for rapid functional assignment of unknown genes that have prokaryotic homologs, by analyzing which genes are associated with the latter. More generally, our data indicate how combined genomic analysis of both plants and prokaryotes can be more powerful than isolated examination of either group alone. PMID:17645794

  20. First Complete Genome Sequence of Tenacibaculum dicentrarchi, an Emerging Bacterial Pathogen of Salmonids.

    PubMed

    Grothusen, Horst; Castillo, Alejandro; Henríquez, Patricio; Navas, Esteban; Bohle, Harry; Araya, Carolina; Bustamante, Fernando; Bustos, Patricio; Mancilla, Marcos

    2016-01-01

    Tenacibaculum-like bacilli have recently been isolated from diseased sea-reared Atlantic salmon in outbreaks that took place in the XI region (Región de Aysén) of Chile. Molecular typing identified the bacterium as Tenacibaculum dicentrarchi. Here, we report the complete genome sequence of the AY7486TD isolate recovered during those outbreaks. PMID:26893432

  1. First Complete Genome Sequence of Tenacibaculum dicentrarchi, an Emerging Bacterial Pathogen of Salmonids

    PubMed Central

    Grothusen, Horst; Castillo, Alejandro; Henríquez, Patricio; Navas, Esteban; Bohle, Harry; Araya, Carolina; Bustamante, Fernando; Bustos, Patricio

    2016-01-01

    Tenacibaculum-like bacilli have recently been isolated from diseased sea-reared Atlantic salmon in outbreaks that took place in the XI region (Región de Aysén) of Chile. Molecular typing identified the bacterium as Tenacibaculum dicentrarchi. Here, we report the complete genome sequence of the AY7486TD isolate recovered during those outbreaks. PMID:26893432

  2. Germinal transmission of site-specific excised genomic DNA by the bacterial ParA resolvase

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genome engineering is an essential tool in research and product development. Behind some of the recent advances in plant gene transfer is the development of site-specific recombination systems that enable the precise manipulation of DNA, e.g. the deletion, integration or translocation of DNA. DNA ...

  3. A Caenorhabditis elegans Genome-Scale Metabolic Network Model.

    PubMed

    Yilmaz, L Safak; Walhout, Albertha J M

    2016-05-25

    Caenorhabditis elegans is a powerful model to study metabolism and how it relates to nutrition, gene expression, and life history traits. However, while numerous experimental techniques that enable perturbation of its diet and gene function are available, a high-quality metabolic network model has been lacking. Here, we reconstruct an initial version of the C. elegans metabolic network. This network model contains 1,273 genes, 623 enzymes, and 1,985 metabolic reactions and is referred to as iCEL1273. Using flux balance analysis, we show that iCEL1273 is capable of representing the conversion of bacterial biomass into C. elegans biomass during growth and enables the predictions of gene essentiality and other phenotypes. In addition, we demonstrate that gene expression data can be integrated with the model by comparing metabolic rewiring in dauer animals versus growing larvae. iCEL1273 is available at a dedicated website (wormflux.umassmed.edu) and will enable the unraveling of the mechanisms by which different macro- and micronutrients contribute to the animal's physiology. PMID:27211857

  4. Genome Segregation and Packaging Machinery in Acanthamoeba polyphaga Mimivirus Is Reminiscent of Bacterial Apparatus

    PubMed Central

    Chelikani, Venkata; Ranjan, Tushar; Zade, Amrutraj; Shukla, Avi

    2014-01-01

    ABSTRACT Genome packaging is a critical step in the virion assembly process. The putative ATP-driven genome packaging motor of Acanthamoeba polyphaga mimivirus (APMV) and other nucleocytoplasmic large DNA viruses (NCLDVs) is a distant ortholog of prokaryotic chromosome segregation motors, such as FtsK and HerA, rather than other viral packaging motors, such as large terminase. Intriguingly, APMV also encodes other components, i.e., three putative serine recombinases and a putative type II topoisomerase, all of which are essential for chromosome segregation in prokaryotes. Based on our analyses of these components and taking the limited available literature into account, here we propose for the first time a model for genome segregation and packaging in APMV that can possibly be extended to NCLDV subfamilies, except perhaps Poxviridae and Ascoviridae. This model might represent a unique variation of the prokaryotic system acquired and contrived by the large DNA viruses of eukaryotes. It is also consistent with previous observations that unicellular eukaryotes, such as amoebae, are melting pots for the advent of chimeric organisms with novel mechanisms. IMPORTANCE Extremely large viruses with DNA genomes infect a wide range of eukaryotes, from human beings to amoebae and from crocodiles to algae. These large DNA viruses, unlike their much smaller cousins, have the capability of making most of the protein components required for their multiplication. Once they infect the cell, these viruses set up viral replication centers, known as viral factories, to carry out their multiplication with very little help from the host. Our sequence analyses show that there is remarkable similarity between prokaryotes (bacteria and archaea) and large DNA viruses, such as mimivirus, vaccinia virus, and pandoravirus, in the way that they process their newly synthesized genetic material to make sure that only one copy of the complete genome is generated and is meticulously placed inside

  5. Draft Genome Sequence of Paracoccus sp. MKU1, a New Bacterial Strain Isolated from an Industrial Effluent with Potential for Bioremediation

    PubMed Central

    Nisha, Kamaldeen Nasrin; Sridhar, Jayavel; Varalakshmi, Perumal; Ashokkumar, Balasubramaniem

    2016-01-01

    Paracoccus sp. MKU1, a novel dimethylformamide-degrading bacterial strain, was originally isolated from an industrial effluent in the Tirupur region, Tamil Nadu, India. Here, we report the draft genome sequence of Paracoccus sp. MKU1, which could provide genetic insights into its evolution and into the application of this versatile bacterium for effective degradation of xenobiotics and thus in bioremediation. PMID:27326263

  6. Draft Genome Sequence of Two Strains of Xanthomonas arboricola Isolated from Prunus persica Which Are Dissimilar to Strains That Cause Bacterial Spot Disease on Prunus spp.

    PubMed

    Garita-Cambronero, Jerson; Palacio-Bielsa, Ana; López, María M; Cubero, Jaime

    2016-01-01

    The draft genome sequences of two strains of Xanthomonas arboricola, isolated from asymptomatic peach trees in Spain, are reported here. These strains are avirulent and do not belong to the same phylogroup as X. arboricola pv. pruni, a causal agent of bacterial spot disease of stone fruits and almonds. PMID:27609931

  7. Draft Genome Sequence of Paracoccus sp. MKU1, a New Bacterial Strain Isolated from an Industrial Effluent with Potential for Bioremediation.

    PubMed

    Nisha, Kamaldeen Nasrin; Sridhar, Jayavel; Varalakshmi, Perumal; Ashokkumar, Balasubramaniem

    2016-01-01

    Paracoccus sp. MKU1, a novel dimethylformamide-degrading bacterial strain, was originally isolated from an industrial effluent in the Tirupur region, Tamil Nadu, India. Here, we report the draft genome sequence of Paracoccus sp. MKU1, which could provide genetic insights into its evolution and into the application of this versatile bacterium for effective degradation of xenobiotics and thus in bioremediation. PMID:27326263

  8. Genome-wide identification of Hsp70 genes in channel catfish and their regulated expression after bacterial infection.

    PubMed

    Song, Lin; Li, Chao; Xie, Yangjie; Liu, Shikai; Zhang, Jiaren; Yao, Jun; Jiang, Chen; Li, Yun; Liu, Zhanjiang

    2016-02-01

    Heat shock proteins 70/110 (Hsp70/110) are a family of conserved ubiquitously expressed heat shock proteins which are produced by cells in response to exposure to stressful conditions. Besides the chaperone and housekeeping functions, they are also known to be involved in immune response during infection. In this study, we identified 16 Hsp70/110 genes in channel catfish (Ictalurus punctatus) through in silico analysis using RNA-Seq and genome databases. Among them, 12 members of the Hsp70 (Hspa) family and 4 members of the Hsp110 (Hsph) family were identified. Phylogenetic and syntenic analyses provided strong evidence supporting the orthologies of these HSPs. In addition, we determined the expression patterns of Hsp70/110 genes after Flavobacterium columnare and Edwardsiella ictaluri infections by meta-analyses, for the first time in channel catfish. Ten out of sixteen genes were significantly up/down-regulated after bacterial challenges. Specifically, nine genes were found significantly expressed in gill after F. columnare infection. Two genes were found significantly expressed in intestine after E. ictaluri infection. Pathogen-specific and tissue-specific patterns were found in the two infections. The significantly regulated expression of catfish Hsp70 genes after bacterial infections suggested their involvement in immune response in catfish. PMID:26693666

  9. Limitations to estimating bacterial cross-species transmission using genetic and genomic markers: inferences from simulation modeling

    PubMed Central

    Benavides, Julio A; Cross, Paul C; Luikart, Gordon; Creel, Scott

    2014-01-01

    Cross-species transmission (CST) of bacterial pathogens has major implications for human health, livestock, and wildlife management because it determines whether control actions in one species may have subsequent effects on other potential host species. The study of bacterial transmission has benefitted from methods measuring two types of genetic variation: variable number of tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs). However, it is unclear whether these data can distinguish between different epidemiological scenarios. We used a simulation model with two host species and known transmission rates (within and between species) to evaluate the utility of these markers for inferring CST. We found that CST estimates are biased for a wide range of parameters when based on VNTRs and a most parsimonious reconstructed phylogeny. However, estimations of CST rates lower than 5% can be achieved with relatively low bias using as few as 250 SNPs. CST estimates are sensitive to several parameters, including the number of mutations accumulated since introduction, stochasticity, the genetic difference of strains introduced, and the sampling effort. Our results suggest that, even with whole-genome sequences, unbiased estimates of CST will be difficult when sampling is limited, mutation rates are low, or for pathogens that were recently introduced. PMID:25469159

  10. Automated extraction of typing information for bacterial pathogens from whole genome sequence data: Neisseria meningitidis as an exemplar.

    PubMed

    Jolley, K A; Maiden, M C

    2013-01-01

    Whole genome sequence (WGS) data are increasingly used to characterise bacterial pathogens. These data provide detailed information on the genotypes and likely phenotypes of aetiological agents, enabling the relationships of samples from potential disease outbreaks to be established precisely. However, the generation of increasing quantities of sequence data does not, in itself, resolve the problems that many microbiological typing methods have addressed over the last 100 years or so; indeed, providing large volumes of unstructured data can confuse rather than resolve these issues. Here we review the nascent field of storage of WGS data for clinical application and show how curated sequence-based typing schemes on websites have generated an infrastructure that can exploit WGS for bacterial typing efficiently. We review the tools that have been implemented within the PubMLST website to extract clinically useful, strain-characterisation information that can be provided to physicians and public health professionals in a timely, concise and understandable way. These data can be used to inform medical decisions such as how to treat a patient, whether to instigate public health action, and what action might be appropriate. The information is compatible both with previous sequence-based typing data and also with data obtained in the absence of WGS, providing a flexible infrastructure for WGS-based clinical microbiology. PMID:23369391

  11. From DNA to FBA: How to Build Your Own Genome-Scale Metabolic Model.

    PubMed

    Cuevas, Daniel A; Edirisinghe, Janaka; Henry, Chris S; Overbeek, Ross; O'Connell, Taylor G; Edwards, Robert A

    2016-01-01

    Microbiological studies are increasingly relying on in silico methods to perform exploration and rapid analysis of genomic data, and functional genomics studies are supplemented by the new perspectives that genome-scale metabolic models offer. A mathematical model consisting of a microbe's entire metabolic map can be rapidly determined from whole-genome sequencing and annotating the genomic material encoded in its DNA. Flux-balance analysis (FBA), a linear programming technique that uses metabolic models to predict the phenotypic responses imposed by environmental elements and factors, is the leading method to simulate and manipulate cellular growth in silico. However, the process of creating an accurate model to use in FBA consists of a series of steps involving a multitude of connections between bioinformatics databases, enzyme resources, and metabolic pathways. We present the methodology and procedure to obtain a metabolic model using PyFBA, an extensible Python-based open-source software package aimed to provide a platform where functional annotations are used to build metabolic models (http://linsalrob.github.io/PyFBA). Backed by the Model SEED biochemistry database, PyFBA contains methods to reconstruct a microbe's metabolic map, run FBA upon different media conditions, and gap-fill its metabolism. The extensibility of PyFBA facilitates novel techniques in creating accurate genome-scale metabolic models. PMID:27379044
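    For readers new to FBA, the linear program mentioned in the abstract above is conventionally written as shown below. This is the standard textbook formulation, not text from the record itself; the symbols (stoichiometric matrix S, flux vector v, objective weights c, bounds l and u) are the usual notation and are assumptions here.

```latex
\begin{aligned}
\max_{v \in \mathbb{R}^{n}} \quad & c^{\top} v
    && \text{(e.g. flux through the biomass reaction)} \\
\text{subject to} \quad & S\,v = 0
    && \text{(steady state: each metabolite is produced as fast as it is consumed)} \\
& l_j \le v_j \le u_j, \quad j = 1, \dots, n
    && \text{(reversibility and uptake limits set by the medium)}
\end{aligned}
```

    Here S has one row per metabolite and one column per reaction. Changing the medium amounts to changing the bounds on the uptake reactions, and gap-filling amounts to adding columns to S until the optimum of the objective becomes positive.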

  12. Population Genomics Reveals Chromosome-Scale Heterogeneous Evolution in a Protoploid Yeast

    PubMed Central

    Friedrich, Anne; Jung, Paul; Reisser, Cyrielle; Fischer, Gilles; Schacherer, Joseph

    2015-01-01

    Yeast species represent an ideal model system for population genomic studies but large-scale polymorphism surveys have only been reported for species of the Saccharomyces genus so far. Hence, little is known about intraspecific diversity and evolution in yeast. To obtain a new insight into the evolutionary forces shaping natural populations, we sequenced the genomes of an expansive worldwide collection of isolates from a species distantly related to Saccharomyces cerevisiae: Lachancea kluyveri (formerly S. kluyveri). We identified 6.5 million single nucleotide polymorphisms and showed that a large introgression event of 1 Mb of GC-rich sequence in the chromosomal arm probably occurred in the last common ancestor of all L. kluyveri strains. Our population genomic data clearly revealed that this 1-Mb region underwent a molecular evolution pattern very different from the rest of the genome. It is characterized by a higher recombination rate, with a dramatically elevated A:T → G:C substitution rate, which is the signature of an increased GC-biased gene conversion. In addition, the predicted base composition at equilibrium demonstrates that the chromosome-scale compositional heterogeneity will persist after the genome has reached mutational equilibrium. Altogether, the data presented herein clearly show that distinct recombination and substitution regimes can coexist and lead to different evolutionary patterns within a single genome. PMID:25349286

  13. Population genomics reveals chromosome-scale heterogeneous evolution in a protoploid yeast.

    PubMed

    Friedrich, Anne; Jung, Paul; Reisser, Cyrielle; Fischer, Gilles; Schacherer, Joseph

    2015-01-01

    Yeast species represent an ideal model system for population genomic studies but large-scale polymorphism surveys have only been reported for species of the Saccharomyces genus so far. Hence, little is known about intraspecific diversity and evolution in yeast. To obtain a new insight into the evolutionary forces shaping natural populations, we sequenced the genomes of an expansive worldwide collection of isolates from a species distantly related to Saccharomyces cerevisiae: Lachancea kluyveri (formerly S. kluyveri). We identified 6.5 million single nucleotide polymorphisms and showed that a large introgression event of 1 Mb of GC-rich sequence in the chromosomal arm probably occurred in the last common ancestor of all L. kluyveri strains. Our population genomic data clearly revealed that this 1-Mb region underwent a molecular evolution pattern very different from the rest of the genome. It is characterized by a higher recombination rate, with a dramatically elevated A:T → G:C substitution rate, which is the signature of an increased GC-biased gene conversion. In addition, the predicted base composition at equilibrium demonstrates that the chromosome-scale compositional heterogeneity will persist after the genome has reached mutational equilibrium. Altogether, the data presented herein clearly show that distinct recombination and substitution regimes can coexist and lead to different evolutionary patterns within a single genome. PMID:25349286

  14. From DNA to FBA: How to Build Your Own Genome-Scale Metabolic Model

    PubMed Central

    Cuevas, Daniel A.; Edirisinghe, Janaka; Henry, Chris S.; Overbeek, Ross; O’Connell, Taylor G.; Edwards, Robert A.

    2016-01-01

    Microbiological studies are increasingly relying on in silico methods to perform exploration and rapid analysis of genomic data, and functional genomics studies are supplemented by the new perspectives that genome-scale metabolic models offer. A mathematical model consisting of a microbe’s entire metabolic map can be rapidly determined from whole-genome sequencing and annotating the genomic material encoded in its DNA. Flux-balance analysis (FBA), a linear programming technique that uses metabolic models to predict the phenotypic responses imposed by environmental elements and factors, is the leading method to simulate and manipulate cellular growth in silico. However, the process of creating an accurate model to use in FBA consists of a series of steps involving a multitude of connections between bioinformatics databases, enzyme resources, and metabolic pathways. We present the methodology and procedure to obtain a metabolic model using PyFBA, an extensible Python-based open-source software package aimed to provide a platform where functional annotations are used to build metabolic models (http://linsalrob.github.io/PyFBA). Backed by the Model SEED biochemistry database, PyFBA contains methods to reconstruct a microbe’s metabolic map, run FBA upon different media conditions, and gap-fill its metabolism. The extensibility of PyFBA facilitates novel techniques in creating accurate genome-scale metabolic models. PMID:27379044

  15. Development and experimental verification of a genome-scale metabolic model for Corynebacterium glutamicum

    PubMed Central

    Shinfuku, Yohei; Sorpitiporn, Natee; Sono, Masahiro; Furusawa, Chikara; Hirasawa, Takashi; Shimizu, Hiroshi

    2009-01-01

    Background In silico genome-scale metabolic models enable the analysis of the characteristics of metabolic systems of organisms. In this study, we reconstructed a genome-scale metabolic model of Corynebacterium glutamicum on the basis of genome sequence annotation and physiological data. The metabolic characteristics were analyzed using flux balance analysis (FBA), and the results of FBA were validated using data from culture experiments performed at different oxygen uptake rates. Results The reconstructed genome-scale metabolic model of C. glutamicum contains 502 reactions and 423 metabolites. We collected the reactions and biomass components from the database and literatures, and made the model available for the flux balance analysis by filling gaps in the reaction networks and removing inadequate loop reactions. Using the framework of FBA and our genome-scale metabolic model, we first simulated the changes in the metabolic flux profiles that occur on changing the oxygen uptake rate. The predicted production yields of carbon dioxide and organic acids agreed well with the experimental data. The metabolic profiles of amino acid production phases were also investigated. A comprehensive gene deletion study was performed in which the effects of gene deletions on metabolic fluxes were simulated; this helped in the identification of several genes whose deletion resulted in an improvement in organic acid production. Conclusion The genome-scale metabolic model provides useful information for the evaluation of the metabolic capabilities and prediction of the metabolic characteristics of C. glutamicum. This can form a basis for the in silico design of C. glutamicum metabolic networks for improved bioproduction of desirable metabolites. PMID:19646286

  16. Generating Genome-Scale Candidate Gene Lists for Pharmacogenomics

    PubMed Central

    Hansen, NT; Brunak, S; Altman, RB

    2009-01-01

    A critical task in pharmacogenomics is identifying genes that may be important modulators of drug response. High-throughput experimental methods are often plagued by false positives and do not take advantage of existing knowledge. Candidate gene lists can usefully summarize existing knowledge, but they are expensive to generate manually and may therefore have incomplete coverage. We have developed a method that ranks 12,460 genes in the human genome on the basis of their potential relevance to a specific query drug and its putative indications. Our method uses known gene–drug interactions, networks of gene–gene interactions, and available measures of drug–drug similarity. It ranks genes by building a local network of known interactions and assessing the similarity of the query drug (by both structure and indication) with drugs that interact with gene products in the local network. In a comprehensive benchmark, our method achieves an overall area under the curve of 0.82. To showcase our method, we found novel gene candidates for warfarin, gefitinib, carboplatin, and gemcitabine, and we provide the molecular hypotheses for these predictions. PMID:19369935
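    The benchmark figure quoted above is an area under the ROC curve (AUC). As a rough illustration of what that number measures for a ranked gene list, the sketch below computes AUC by pairwise comparison of scores. The function name and the toy scores and labels are hypothetical and are not data from the study.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy ranking: higher score = judged more relevant to the query drug.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]  # 1 = known interactor, 0 = not known
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

    An AUC of 0.5 corresponds to a random ordering and 1.0 to a perfect one, so the reported 0.82 means a known relevant gene outranks a randomly chosen other gene about 82% of the time.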

  17. Feasibility of Large-Scale Genomic Testing to Facilitate Enrollment Onto Genomically Matched Clinical Trials

    PubMed Central

    Meric-Bernstam, Funda; Brusco, Lauren; Shaw, Kenna; Horombe, Chacha; Kopetz, Scott; Davies, Michael A.; Routbort, Mark; Piha-Paul, Sarina A.; Janku, Filip; Ueno, Naoto; Hong, David; De Groot, John; Ravi, Vinod; Li, Yisheng; Luthra, Raja; Patel, Keyur; Broaddus, Russell; Mendelsohn, John; Mills, Gordon B.

    2015-01-01

    Purpose We report the experience with 2,000 consecutive patients with advanced cancer who underwent testing on a genomic testing protocol, including the frequency of actionable alterations across tumor types, subsequent enrollment onto clinical trials, and the challenges for trial enrollment. Patients and Methods Standardized hotspot mutation analysis was performed in 2,000 patients, using either an 11-gene (251 patients) or a 46- or 50-gene (1,749 patients) multiplex platform. Thirty-five genes were considered potentially actionable based on their potential to be targeted with approved or investigational therapies. Results Seven hundred eighty-nine patients (39%) had at least one mutation in potentially actionable genes. Eighty-three patients (11%) with potentially actionable mutations went on genotype-matched trials targeting these alterations. Of 230 patients with PIK3CA/AKT1/PTEN/BRAF mutations that returned for therapy, 116 (50%) received a genotype-matched drug. Forty patients (17%) were treated on a genotype-selected trial requiring a mutation for eligibility, 16 (7%) were treated on a genotype-relevant trial targeting a genomic alteration without biomarker selection, and 40 (17%) received a genotype-relevant drug off trial. Challenges to trial accrual included patient preference of noninvestigational treatment or local treatment, poor performance status or other reasons for trial ineligibility, lack of trials/slots, and insurance denial. Conclusion Broad implementation of multiplex hotspot testing is feasible; however, only a small portion of patients with actionable alterations were actually enrolled onto genotype-matched trials. Increased awareness of therapeutic implications and access to novel therapeutics are needed to optimally leverage results from broad-based genomic testing. PMID:26014291
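    The percentages in the abstract above can be rechecked from the counts it reports. The short script below is only an arithmetic sketch. The helper name is hypothetical, and the denominators (2,000 for all patients, 789 for patients with actionable mutations, 230 for patients who returned for therapy) are inferred from the wording of the abstract.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


total_patients = 2000
actionable = 789          # patients with at least one potentially actionable mutation
returned_for_therapy = 230

# Platform split: 11-gene panel plus 46- or 50-gene panel.
platform_sum = 251 + 1749

reported = {
    "actionable / all patients": (pct(actionable, total_patients), 39),
    "matched trial / actionable": (pct(83, actionable), 11),
    "matched drug / returned": (pct(116, returned_for_therapy), 50),
    "genotype-selected trial / returned": (pct(40, returned_for_therapy), 17),
    "genotype-relevant trial / returned": (pct(16, returned_for_therapy), 7),
    "off-trial drug / returned": (pct(40, returned_for_therapy), 17),
}

for name, (computed, stated) in reported.items():
    print(f"{name}: computed {computed}%, stated {stated}%")
```

    Under these assumed denominators, every computed value matches the stated percentage.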

  18. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    PubMed Central

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  19. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models.

    PubMed

    King, Zachary A; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A; Ebrahim, Ali; Palsson, Bernhard O; Lewis, Nathan E

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456

  20. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    SciTech Connect

    King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2015-10-17

    Genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  1. Large-scale profiling of microRNAs for The Cancer Genome Atlas

    PubMed Central

    Chu, Andy; Robertson, Gordon; Brooks, Denise; Mungall, Andrew J.; Birol, Inanc; Coope, Robin; Ma, Yussanne; Jones, Steven; Marra, Marco A.

    2016-01-01

    The comprehensive multiplatform genomics data generated by The Cancer Genome Atlas (TCGA) Research Network is an enabling resource for cancer research. It includes an unprecedented amount of microRNA sequence data: ∼11 000 libraries across 33 cancer types. Combined with initiatives like the National Cancer Institute Genomics Cloud Pilots, such data resources will make intensive analysis of large-scale cancer genomics data widely accessible. To support such initiatives, and to enable comparison of TCGA microRNA data to data from other projects, we describe the process that we developed and used to generate the microRNA sequence data, from library construction through to submission of data to repositories. In the context of this process, we describe the computational pipeline that we used to characterize microRNA expression across large patient cohorts. PMID:26271990

  2. Large-scale profiling of microRNAs for The Cancer Genome Atlas.

    PubMed

    Chu, Andy; Robertson, Gordon; Brooks, Denise; Mungall, Andrew J; Birol, Inanc; Coope, Robin; Ma, Yussanne; Jones, Steven; Marra, Marco A

    2016-01-01

    The comprehensive multiplatform genomics data generated by The Cancer Genome Atlas (TCGA) Research Network is an enabling resource for cancer research. It includes an unprecedented amount of microRNA sequence data: ~11 000 libraries across 33 cancer types. Combined with initiatives like the National Cancer Institute Genomics Cloud Pilots, such data resources will make intensive analysis of large-scale cancer genomics data widely accessible. To support such initiatives, and to enable comparison of TCGA microRNA data to data from other projects, we describe the process that we developed and used to generate the microRNA sequence data, from library construction through to submission of data to repositories. In the context of this process, we describe the computational pipeline that we used to characterize microRNA expression across large patient cohorts. PMID:26271990

  3. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences

    PubMed Central

    Medema, Marnix H.; Blin, Kai; Cimermancic, Peter; de Jager, Victor; Zakrzewski, Piotr; Fischbach, Michael A.; Weber, Tilmann; Takano, Eriko

    2011-01-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide variety of microbes. However, rapidly and reliably pinpointing all the potential gene clusters for secondary metabolites in dozens of newly sequenced genomes has been extremely challenging, due to their biochemical heterogeneity, the presence of unknown enzymes and the dispersed nature of the necessary specialized bioinformatics tools and resources. Here, we present antiSMASH (antibiotics & Secondary Metabolite Analysis Shell), the first comprehensive pipeline capable of identifying biosynthetic loci covering the whole range of known secondary metabolite compound classes (polyketides, non-ribosomal peptides, terpenes, aminoglycosides, aminocoumarins, indolocarbazoles, lantibiotics, bacteriocins, nucleosides, beta-lactams, butyrolactones, siderophores, melanins and others). It aligns the identified regions at the gene cluster level to their nearest relatives from a database containing all other known gene clusters, and integrates or cross-links all previously available secondary-metabolite specific gene analysis methods in one interactive view. antiSMASH is available at http://antismash.secondarymetabolites.org. PMID:21672958

  4. The complete genome sequence of Chromobacterium violaceum reveals remarkable and exploitable bacterial adaptability

    PubMed Central

    2003-01-01

    Chromobacterium violaceum is one of millions of species of free-living microorganisms that populate the soil and water in the extant areas of tropical biodiversity around the world. Its complete genome sequence reveals (i) extensive alternative pathways for energy generation, (ii) ≈500 ORFs for transport-related proteins, (iii) complex and extensive systems for stress adaptation and motility, and (iv) widespread utilization of quorum sensing for control of inducible systems, all of which underpin the versatility and adaptability of the organism. The genome also contains extensive but incomplete arrays of ORFs coding for proteins associated with mammalian pathogenicity, possibly involved in the occasional but often fatal cases of human C. violaceum infection. There is, in addition, a series of previously unknown but important enzymes and secondary metabolites including paraquat-inducible proteins, drug and heavy-metal-resistance proteins, multiple chitinases, and proteins for the detoxification of xenobiotics that may have biotechnological applications. PMID:14500782

  5. The Psychiatric Genomics Consortium Posttraumatic Stress Disorder Workgroup: Posttraumatic Stress Disorder Enters the Age of Large-Scale Genomic Collaboration

    PubMed Central

    Logue, Mark W; Amstadter, Ananda B; Baker, Dewleen G; Duncan, Laramie; Koenen, Karestan C; Liberzon, Israel; Miller, Mark W; Morey, Rajendra A; Nievergelt, Caroline M; Ressler, Kerry J; Smith, Alicia K; Smoller, Jordan W; Stein, Murray B; Sumner, Jennifer A; Uddin, Monica

    2015-01-01

    The development of posttraumatic stress disorder (PTSD) is influenced by genetic factors. Although there have been some replicated candidates, the identification of risk variants for PTSD has lagged behind genetic research of other psychiatric disorders such as schizophrenia, autism, and bipolar disorder. Psychiatric genetics has moved beyond examination of specific candidate genes in favor of the genome-wide association study (GWAS) strategy of very large numbers of samples, which allows for the discovery of previously unsuspected genes and molecular pathways. The successes of genetic studies of schizophrenia and bipolar disorder have been aided by the formation of a large-scale GWAS consortium: the Psychiatric Genomics Consortium (PGC). In contrast, only a handful of GWAS of PTSD have appeared in the literature to date. Here we describe the formation of a group dedicated to large-scale study of PTSD genetics: the PGC-PTSD. The PGC-PTSD faces challenges related to the contingency on trauma exposure and the large degree of ancestral genetic diversity within and across participating studies. Using the PGC analysis pipeline supplemented by analyses tailored to address these challenges, we anticipate that our first large-scale GWAS of PTSD will comprise over 10 000 cases and 30 000 trauma-exposed controls. Following in the footsteps of our PGC forerunners, this collaboration—of a scope that is unprecedented in the field of traumatic stress—will lead the search for replicable genetic associations and new insights into the biological underpinnings of PTSD. PMID:25904361

  6. Combined Analysis of Variation in Core, Accessory and Regulatory Genome Regions Provides a Super-Resolution View into the Evolution of Bacterial Populations.

    PubMed

    McNally, Alan; Oren, Yaara; Kelly, Darren; Pascoe, Ben; Dunn, Steven; Sreecharan, Tristan; Vehkala, Minna; Välimäki, Niko; Prentice, Michael B; Ashour, Amgad; Avram, Oren; Pupko, Tal; Dobrindt, Ulrich; Literak, Ivan; Guenther, Sebastian; Schaufler, Katharina; Wieler, Lothar H; Zhiyong, Zong; Sheppard, Samuel K; McInerney, James O; Corander, Jukka

    2016-09-01

    The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first ever population level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements. PMID:27618184

  7. Generation and Evaluation of a Genome-Scale Metabolic Network Model of Synechococcus elongatus PCC7942

    PubMed Central

    Triana, Julián; Montagud, Arnau; Siurana, Maria; Fuente, David; Urchueguía, Arantxa; Gamermann, Daniel; Torres, Javier; Tena, Jose; de Córdoba, Pedro Fernández; Urchueguía, Javier F.

    2014-01-01

    The reconstruction of genome-scale metabolic models and their applications represent a great advantage of systems biology. Through their use as metabolic flux simulation models, production of industrially-interesting metabolites can be predicted. Due to the growing number of studies of metabolic models driven by the increasing genomic sequencing projects, it is important to conceptualize steps of reconstruction and analysis. We have focused our work on the cyanobacterium Synechococcus elongatus PCC7942, for which several analyses and insights are unveiled. A comprehensive approach has been used, which can be of interest to lead the process of manual curation and genome-scale metabolic analysis. The final model, iSyf715, includes 851 reactions and 838 metabolites. A biomass equation, which encompasses elementary building blocks to allow cell growth, is also included. The applicability of the model is finally demonstrated by simulating autotrophic growth conditions of Synechococcus elongatus PCC7942. PMID:25141288

  8. Bacterial molecular networks: bridging the gap between functional genomics and dynamical modelling.

    PubMed

    van Helden, Jacques; Toussaint, Ariane; Thieffry, Denis

    2012-01-01

    This introductory review synthesizes the contents of the volume Bacterial Molecular Networks of the series Methods in Molecular Biology. This volume gathers 9 reviews and 16 method chapters describing computational protocols for the analysis of metabolic pathways, protein interaction networks, and regulatory networks. Each protocol is documented by concrete case studies dedicated to model bacteria or interacting populations. Altogether, the chapters provide a representative overview of state-of-the-art methods for data integration and retrieval, network visualization, graph analysis, and dynamical modelling. PMID:22144145

  9. Bacterial diversity and active biomass in full-scale granular activated carbon filters operated at low water temperatures.

    PubMed

    Kaarela, Outi E; Härkki, Heli A; Palmroth, Marja R T; Tuhkanen, Tuula A

    2015-01-01

    Granular activated carbon (GAC) filtration enhances the removal of natural organic matter and micropollutants in drinking water treatment. Microbial communities in GAC filters contribute to the removal of the biodegradable part of organic matter, and thus help to control microbial regrowth in the distribution system. Our objectives were to investigate bacterial community dynamics, identify the major bacterial groups, and determine the concentration of active bacterial biomass in full-scale GAC filters treating cold (3.7-9.5°C), physicochemically pretreated, and ozonated lake water. Three sampling rounds were conducted to study six GAC filters of different operation times and flow modes in winter, spring, and summer. Total organic carbon results indicated that both the first-step and second-step filters contributed to the removal of organic matter. Length heterogeneity analysis of amplified 16S rRNA genes illustrated that bacterial communities were diverse and considerably stable over time. α-Proteobacteria, β-Proteobacteria, and Nitrospira dominated in all of the GAC filters, although the relative proportion of dominant phylogenetic groups in individual filters differed. The active bacterial biomass accumulation, measured as adenosine triphosphate, was limited due to low temperature, low flux of nutrients, and frequent backwashing. The concentration of active bacterial biomass was not affected by the moderate seasonal temperature variation. In summary, the results provided an insight into the biological component of GAC filtration in cold water temperatures and the operational parameters affecting it. PMID:25242545

  10. Expansion of Cultured Bacterial Diversity by Large-Scale Dilution-to-Extinction Culturing from a Single Seawater Sample.

    PubMed

    Yang, Seung-Jo; Kang, Ilnam; Cho, Jang-Cheon

    2016-01-01

    High-throughput cultivation (HTC) based on a dilution-to-extinction method has been applied broadly to the cultivation of marine bacterial groups, which has often led to the repeated isolation of abundant lineages such as SAR11 and oligotrophic marine gammaproteobacteria (OMG). In this study, to expand the phylogenetic diversity of HTC isolates, we performed a large-scale HTC with a single surface seawater sample collected from the East Sea, the Western Pacific Ocean. Phylogenetic analyses of the 16S rRNA genes from 847 putative pure cultures demonstrated that some isolates were affiliated with not-yet-cultured clades, including the OPB35 and Puniceicoccaceae marine group of Verrucomicrobia and PS1 of Alphaproteobacteria. In addition, numerous strains were obtained from abundant clades, such as SAR11, marine Roseobacter clade, OMG (e.g., SAR92 and OM60), OM43, and SAR116, thereby increasing the size of available culture resources for representative marine bacterial groups. Comparison between the composition of HTC isolates and the bacterial community structure of the seawater sample used for HTC showed that diverse marine bacterial groups exhibited various growth capabilities under our HTC conditions. The growth response of many bacterial groups, however, was clearly different from that observed with conventional plating methods, as exemplified by numerous isolates of the SAR11 clade and Verrucomicrobia. This study showed that a large number of novel bacterial strains could be obtained by an extensive HTC from even a small number of samples. PMID:26573832

  11. CyanoGEBA: A Better Understanding of Cyanobacterial Diversity through Large-scale Genomics (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    ScienceCinema

    Shih, Patrick [Kerfeld Lab, UC Berkeley and JGI]

    2013-01-22

    Patrick Shih, representing both the University of California, Berkeley and JGI, gives a talk titled "CyanoGEBA: A Better Understanding of Cyanobacterial Diversity through Large-scale Genomics" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  12. CyanoGEBA: A Better Understanding of Cyanobacterial Diversity through Large-scale Genomics (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    SciTech Connect

    Shih, Patrick

    2012-03-22

    Patrick Shih, representing both the University of California, Berkeley and JGI, gives a talk titled "CyanoGEBA: A Better Understanding of Cyanobacterial Diversity through Large-scale Genomics" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  13. Large-Scale Development of Gene-Associated Single-Nucleotide Polymorphism Markers for Molluscan Population Genomic, Comparative Genomic, and Genome-Wide Association Studies

    PubMed Central

    Jiao, Wenqian; Fu, Xiaoteng; Li, Jinqin; Li, Ling; Feng, Liying; Lv, Jia; Zhang, Lu; Wang, Xiaojian; Li, Yangping; Hou, Rui; Zhang, Lingling; Hu, Xiaoli; Wang, Shi; Bao, Zhenmin

    2014-01-01

    Mollusca is the second most diverse group of animals in the world. Despite their perceived importance, omics-level studies have seldom been applied to this group of animals largely due to a paucity of genomic resources. Here, we report the first large-scale gene-associated marker development and evaluation for a bivalve mollusc, Chlamys farreri. More than 21,000 putative single-nucleotide polymorphisms (SNPs) were identified from the C. farreri transcriptome. Primers and probes were designed and synthesized for 4500 SNPs, and 1492 polymorphic markers were successfully developed using a high-resolution melting genotyping platform. These markers are particularly suitable for population genomic analysis due to high polymorphism within and across populations, a low frequency of null alleles, and conformation to neutral expectations. Unexpectedly, high cross-species transferability was observed, suggesting that the transferable SNPs may largely represent ancestral genetic variations that have been preserved differentially among subfamilies of Pectinidae. Gene annotations were available for 73% of the markers, and 65% could be anchored to the recently released Pacific oyster genome. Large-scale association analysis revealed key candidate genes responsible for scallop growth regulation, and provided markers for further genetic improvement of C. farreri in breeding programmes. PMID:24277739

  14. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DOE PAGESBeta

    King, Zachary A.; Lu, Justin; Drager, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

    2015-10-17

    Genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  15. Niche differentiation of bacterial communities at a millimeter scale in Shark Bay microbial mats

    NASA Astrophysics Data System (ADS)

    Wong, Hon Lun; Smith, Daniela-Lee; Visscher, Pieter T.; Burns, Brendan P.

    2015-10-01

    Modern microbial mats can provide key insights into early Earth ecosystems, and Shark Bay, Australia, holds one of the best examples of these systems. Identifying the spatial distribution of microorganisms with mat depth facilitates a greater understanding of specific niches and potentially novel microbial interactions. High throughput sequencing coupled with elemental analyses and biogeochemical measurements of two distinct mat types (smooth and pustular) at a millimeter scale were undertaken in the present study. A total of 8,263,982 16S rRNA gene sequences were obtained, which were affiliated to 58 bacterial and candidate phyla. The surfaces of both mats were dominated by Cyanobacteria, accompanied by known or putative members of Alphaproteobacteria and Bacteroidetes. The deeper anoxic layers of smooth mats were dominated by Chloroflexi, while Alphaproteobacteria dominated the lower layers of pustular mats. In situ microelectrode measurements revealed smooth mats have a steeper profile of O2 and H2S concentrations, as well as higher oxygen production, consumption, and sulfate reduction rates. Specific elements (Mo, Mg, Mn, Fe, V, P) could be correlated with specific mat types and putative phylogenetic groups. Models are proposed for these systems suggesting putative surface anoxic niches, differential nitrogen fixing niches, and those coupled with methane metabolism.

  16. Niche differentiation of bacterial communities at a millimeter scale in Shark Bay microbial mats.

    PubMed

    Wong, Hon Lun; Smith, Daniela-Lee; Visscher, Pieter T; Burns, Brendan P

    2015-01-01

    Modern microbial mats can provide key insights into early Earth ecosystems, and Shark Bay, Australia, holds one of the best examples of these systems. Identifying the spatial distribution of microorganisms with mat depth facilitates a greater understanding of specific niches and potentially novel microbial interactions. High throughput sequencing coupled with elemental analyses and biogeochemical measurements of two distinct mat types (smooth and pustular) at a millimeter scale were undertaken in the present study. A total of 8,263,982 16S rRNA gene sequences were obtained, which were affiliated to 58 bacterial and candidate phyla. The surfaces of both mats were dominated by Cyanobacteria, accompanied by known or putative members of Alphaproteobacteria and Bacteroidetes. The deeper anoxic layers of smooth mats were dominated by Chloroflexi, while Alphaproteobacteria dominated the lower layers of pustular mats. In situ microelectrode measurements revealed smooth mats have a steeper profile of O2 and H2S concentrations, as well as higher oxygen production, consumption, and sulfate reduction rates. Specific elements (Mo, Mg, Mn, Fe, V, P) could be correlated with specific mat types and putative phylogenetic groups. Models are proposed for these systems suggesting putative surface anoxic niches, differential nitrogen fixing niches, and those coupled with methane metabolism. PMID:26499760

  17. Niche differentiation of bacterial communities at a millimeter scale in Shark Bay microbial mats

    PubMed Central

    Wong, Hon Lun; Smith, Daniela-Lee; Visscher, Pieter T.; Burns, Brendan P.

    2015-01-01

    Modern microbial mats can provide key insights into early Earth ecosystems, and Shark Bay, Australia, holds one of the best examples of these systems. Identifying the spatial distribution of microorganisms with mat depth facilitates a greater understanding of specific niches and potentially novel microbial interactions. High throughput sequencing coupled with elemental analyses and biogeochemical measurements of two distinct mat types (smooth and pustular) at a millimeter scale were undertaken in the present study. A total of 8,263,982 16S rRNA gene sequences were obtained, which were affiliated to 58 bacterial and candidate phyla. The surfaces of both mats were dominated by Cyanobacteria, accompanied by known or putative members of Alphaproteobacteria and Bacteroidetes. The deeper anoxic layers of smooth mats were dominated by Chloroflexi, while Alphaproteobacteria dominated the lower layers of pustular mats. In situ microelectrode measurements revealed smooth mats have a steeper profile of O2 and H2S concentrations, as well as higher oxygen production, consumption, and sulfate reduction rates. Specific elements (Mo, Mg, Mn, Fe, V, P) could be correlated with specific mat types and putative phylogenetic groups. Models are proposed for these systems suggesting putative surface anoxic niches, differential nitrogen fixing niches, and those coupled with methane metabolism. PMID:26499760

  18. Bacterial community structure of a lab-scale anammox membrane bioreactor.

    PubMed

    Gonzalez-Martinez, Alejandro; Osorio, F; Rodriguez-Sanchez, Alejandro; Martinez-Toledo, Maria Victoria; Gonzalez-Lopez, Jesus; Lotti, Tommaso; van Loosdrecht, M C M

    2015-01-01

    Autotrophic nitrogen removal technologies have proliferated through the last decade. Among these, a promising one is the membrane bioreactor (MBR) Anammox, which can achieve very high solids retention time and therefore sets a proper environment for the cultivation of anammox bacteria. In this sense, the MBR Anammox is an efficient technology for the treatment of effluents with low organic carbon and high ammonium concentrations once they have been treated under partial nitrification systems. A lab-scale MBR Anammox bioreactor has been built at the Technological University of Delft, The Netherlands and has been proven for efficient nitrogen removal and efficient cultivation of anammox bacteria. In this study, next-generation sequencing techniques have been used for the investigation of the bacterial communities of this MBR Anammox for the first time. A strong domination of Candidatus Brocadia bacterium and also the presence of a myriad of other microorganisms that have adapted to this environment were detected, suggesting that the MBR Anammox bioreactor might have a more complex microbial ecosystem than has been thought. Among these, nitrate-reducing heterotrophs and primary producers were identified. Definition of the ecological roles of the OTUs identified through metagenomic analysis was discussed. PMID:25270790

  19. Disinfection of bacterial biofilms in pilot-scale cooling tower systems

    PubMed Central

    Liu, Yang; Zhang, Wei; Sileika, Tadas; Warta, Richard; Cianciotto, Nicholas P.; Packman, Aaron I.

    2015-01-01

    The impact of continuous chlorination and periodic glutaraldehyde treatment on planktonic and biofilm microbial communities was evaluated in pilot-scale cooling towers operated continuously for 3 months. The system was operated at a flow rate of 10,080 l day−1. Experiments were performed with a well-defined microbial consortium containing three heterotrophic bacteria: Pseudomonas aeruginosa, Klebsiella pneumoniae and Flavobacterium sp. The persistence of each species was monitored in the recirculating cooling water loop and in biofilms on steel and PVC coupons in the cooling tower basin. The observed bacterial colonization in cooling towers did not follow trends in growth rates observed under batch conditions and, instead, reflected differences in the ability of each organism to remain attached and form biofilms under the high-through flow conditions in cooling towers. Flavobacterium was the dominant organism in the community, while P. aeruginosa and K. pneumoniae did not attach well to either PVC or steel coupons in cooling towers and were not able to persist in biofilms. As a result, the much greater ability of Flavobacterium to adhere to surfaces protected it from disinfection, whereas P. aeruginosa and K. pneumoniae were subject to rapid disinfection in the planktonic state. PMID:21547755

  20. Disinfection of bacterial biofilms in pilot-scale cooling tower systems.

    PubMed

    Liu, Yang; Zhang, Wei; Sileika, Tadas; Warta, Richard; Cianciotto, Nicholas P; Packman, Aaron I

    2011-04-01

    The impact of continuous chlorination and periodic glutaraldehyde treatment on planktonic and biofilm microbial communities was evaluated in pilot-scale cooling towers operated continuously for 3 months. The system was operated at a flow rate of 10,080 l day(-1). Experiments were performed with a well-defined microbial consortium containing three heterotrophic bacteria: Pseudomonas aeruginosa, Klebsiella pneumoniae and Flavobacterium sp. The persistence of each species was monitored in the recirculating cooling water loop and in biofilms on steel and PVC coupons in the cooling tower basin. The observed bacterial colonization in cooling towers did not follow trends in growth rates observed under batch conditions and, instead, reflected differences in the ability of each organism to remain attached and form biofilms under the high-through flow conditions in cooling towers. Flavobacterium was the dominant organism in the community, while P. aeruginosa and K. pneumoniae did not attach well to either PVC or steel coupons in cooling towers and were not able to persist in biofilms. As a result, the much greater ability of Flavobacterium to adhere to surfaces protected it from disinfection, whereas P. aeruginosa and K. pneumoniae were subject to rapid disinfection in the planktonic state. PMID:21547755

  1. Structuring the bacterial genome: Y1-transposases associated with REP-BIME sequences

    PubMed Central

    Ton-Hoang, Bao; Siguier, Patricia; Quentin, Yves; Onillon, Séverine; Marty, Brigitte; Fichant, Gwennaele; Chandler, Mick

    2012-01-01

    REPs are highly repeated intergenic palindromic sequences often clustered into structures called BIMEs including two individual REPs separated by a short linker of variable length. They play a variety of key roles in the cell. REPs also resemble the sub-terminal hairpins of the atypical IS200/605 family of insertion sequences which encode Y1 transposases (TnpAIS200/IS605). These belong to the HUH endonuclease family, carry a single catalytic tyrosine (Y) and promote single strand transposition. Recently, a new clade of Y1 transposases (TnpAREP) was found associated with REP/BIME in structures called REPtrons. It has been suggested that TnpAREP is responsible for REP/BIME proliferation over genomes. We analysed and compared REP distribution and REPtron structure in numerous available E. coli and Shigella strains. Phylogenetic analysis clearly indicated that tnpAREP was acquired early in the species radiation and was lost later in some strains. To understand REP/BIME behaviour within the host genome, we also studied E. coli K12 TnpAREP activity in vitro and demonstrated that it catalyses cleavage and recombination of BIMEs. While TnpAREP shared the same general organization and similar catalytic characteristics with TnpAIS200/IS605 transposases, it exhibited distinct properties potentially important in the creation of BIME variability and in their amplification. TnpAREP may therefore be one of the first examples of transposase domestication in prokaryotes. PMID:22199259

  2. PCR-based positive hybridization to detect genomic diversity associated with bacterial secondary metabolism

    PubMed Central

    Pomati, Francesco; Neilan, Brett A.

    2004-01-01

    A PCR-based positive hybridization (PPH) method was developed to explore toxic-specific genes in common between toxigenic strains of Anabaena circinalis, a cyanobacterium able to produce saxitoxin (STX). The PPH technique is based on the same principles as suppression subtractive hybridization (SSH), although with the former no driver DNA is required and two tester genomic DNAs are hybridized at high stringency. The aim was to obtain genes associated with cyanobacterial STX production. The genetic diversity within phylogenetically similar strains of A. circinalis was investigated by comparing the results of the standard SSH protocol to the PPH approach by DNA-microarray analysis. SSH allowed the recovery of DNA libraries that were mainly specific for each of the two STX-producing strains used. Several candidate sequences were found by PPH to be in common between both the STX-producing testers. The PPH technique performed using unsubtracted genomic libraries proved to be a powerful tool to identify DNA sequences possibly transferred laterally between two cyanobacterial strains that may be candidate(s) in STX biosynthesis. The approach presented in this study represents a novel and valid tool to study the genetic basis for secondary metabolite production in microorganisms. PMID:14718552

  3. Vertebrate Protein CTCF and its Multiple Roles in a Large-Scale Regulation of Genome Activity

    PubMed Central

    Nikolaev, L.G; Akopov, S.B; Didych, D.A; Sverdlov, E.D

    2009-01-01

    The CTCF transcription factor is an 11 zinc fingers multifunctional protein that uses different zinc finger combinations to recognize and bind different sites within DNA. CTCF is thought to participate in various gene regulatory networks including transcription activation and repression, formation of independently functioning chromatin domains and regulation of imprinting. Sequencing of human and other genomes opened up a possibility to ascertain the genomic distribution of CTCF binding sites and to identify CTCF-dependent cis-regulatory elements, including insulators. In the review, we summarized recent data on genomic distribution of CTCF binding sites in the human and other genomes within a framework of the loop domain hypothesis of large-scale regulation of the genome activity. We also tried to formulate possible lines of studies on a variety of CTCF functions which probably depend on its ability to specifically bind DNA, interact with other proteins and form di- and multimers. These three fundamental properties allow CTCF to serve as a transcription factor, an insulator and a constitutive dispersed genome-wide demarcation tool able to recruit various factors that emerge in response to diverse external and internal signals, and thus to exert its signal-specific function(s). PMID:20119526

  4. Vertebrate Protein CTCF and its Multiple Roles in a Large-Scale Regulation of Genome Activity.

    PubMed

    Nikolaev, L G; Akopov, S B; Didych, D A; Sverdlov, E D

    2009-08-01

    The CTCF transcription factor is an 11 zinc fingers multifunctional protein that uses different zinc finger combinations to recognize and bind different sites within DNA. CTCF is thought to participate in various gene regulatory networks including transcription activation and repression, formation of independently functioning chromatin domains and regulation of imprinting. Sequencing of human and other genomes opened up a possibility to ascertain the genomic distribution of CTCF binding sites and to identify CTCF-dependent cis-regulatory elements, including insulators. In the review, we summarized recent data on genomic distribution of CTCF binding sites in the human and other genomes within a framework of the loop domain hypothesis of large-scale regulation of the genome activity. We also tried to formulate possible lines of studies on a variety of CTCF functions which probably depend on its ability to specifically bind DNA, interact with other proteins and form di- and multimers. These three fundamental properties allow CTCF to serve as a transcription factor, an insulator and a constitutive dispersed genome-wide demarcation tool able to recruit various factors that emerge in response to diverse external and internal signals, and thus to exert its signal-specific function(s). PMID:20119526

  5. Genome-scale approaches to the epigenetics of common human disease

    PubMed Central

    2011-01-01

    Traditionally, the pathology of human disease has been focused on microscopic examination of affected tissues, chemical and biochemical analysis of biopsy samples, other available samples of convenience, such as blood, and noninvasive or invasive imaging of varying complexity, in order to classify disease and illuminate its mechanistic basis. The molecular age has complemented this armamentarium with gene expression arrays and selective analysis of individual genes. However, we are entering a new era of epigenomic profiling, i.e., genome-scale analysis of cell-heritable nonsequence genetic change, such as DNA methylation. The epigenome offers access to stable measurements of cellular state and to biobanked material for large-scale epidemiological studies. Some of these genome-scale technologies are beginning to be applied to create the new field of epigenetic epidemiology. PMID:19844740

  6. New clues about the evolutionary history of metabolic losses in bacterial endosymbionts, provided by the genome of Buchnera aphidicola from the aphid Cinara tujafilina.

    PubMed

    Lamelas, Araceli; Gosalbes, María José; Moya, Andrés; Latorre, Amparo

    2011-07-01

    The symbiotic association between aphids (Homoptera) and Buchnera aphidicola (Gammaproteobacteria) started about 100 to 200 million years ago. As a consequence of this relationship, the bacterial genome has undergone a prominent size reduction. The genome downsizing process starts when the bacterium enters the host and will probably end with its extinction and replacement by another healthier bacterium or with the establishment of metabolic complementation between two or more bacteria. To date, complete genomes of Buchnera aphidicola from four different aphid species (Acyrthosiphon pisum, Schizaphis graminum, Baizongia pistacea, and Cinara cedri) have been sequenced. C. cedri belongs to the subfamily Lachninae and harbors two coprimary bacteria that fulfill the metabolic needs of the whole consortium: B. aphidicola with the smallest genome reported so far and "Candidatus Serratia symbiotica." In addition, Cinara tujafilina, another member of the subfamily Lachninae, closely related to C. cedri, also harbors "Ca. Serratia symbiotica" but with a different phylogenetic status than the one from C. cedri. In this study, we present the complete genome sequence of B. aphidicola from C. tujafilina and the phylogenetic analysis and comparative genomics with the other Buchnera genomes. Furthermore, the gene repertoire of the last common ancestor has been inferred, and the evolutionary history of the metabolic losses that occurred in the different lineages has been analyzed. Although stochastic gene loss plays a role in the genome reduction process, it is also clear that metabolism, as a functional constraint, is a powerful evolutionary force in insect endosymbionts. PMID:21571878

  7. Differences in sequencing technologies improve the retrieval of anammox bacterial genome from metagenomes

    PubMed Central

    2013-01-01

    Background: Sequencing technologies have different biases, in single-genome sequencing and metagenomic sequencing; these can significantly affect ORFs recovery and the population distribution of a metagenome. In this paper we investigate how well different technologies represent information related to a considered organism of interest in a metagenome, and whether it is beneficial to combine information obtained using different technologies. We analyze comparatively three metagenomic datasets acquired from a sample containing the anammox bacterium Candidatus ’Brocadia fulgida’ (B. fulgida). These datasets were obtained using Roche 454 FLX and Sanger sequencing with two different libraries (shotgun and fosmid). Results: In each dataset, the abundance of the reads annotated to B. fulgida was much lower than the abundance expected from available cell count information. This was due to the overrepresentation of GC-richer organisms, as shown by GC-content distribution of the reads. Nevertheless, by considering the union of B. fulgida reads over the three datasets, the number of B. fulgida ORFs recovered for at least 80% of their length was twice the amount recovered by the best technology. Indeed, while taxonomic distributions of reads in the three datasets were similar, the respective sets of B. fulgida ORFs recovered for a large part of their length were highly different, and depth of coverage patterns of 454 and Sanger were dissimilar. Conclusions: Precautions should be sought in order to prevent the overrepresentation of GC-rich microbes in the datasets. This overrepresentation and the consistency of the taxonomic distributions of reads obtained with different sequencing technologies suggests that, in general, abundance biases might be mainly due to other steps of the sequencing protocols. Results show that biases against organisms of interest could be compensated combining different sequencing technologies, due to the differences of their genome-level sequencing

  8. Bringing large-scale multiple genome analysis one step closer: ScalaBLAST and beyond

    SciTech Connect

    Oehmen, Christopher S.; Sofia, Heidi J.; Baxter, Douglas; Szeto, Ernest; Hugenholtz, Philip; Kyrpides, Nikos; Markowitz, Victor; Straatsma, Tjerk P.

    2007-06-01

    Genome sequence comparisons of exponentially growing data sets form the foundation for the comparative analysis tools provided by community biological data resources such as the Integrated Microbial Genome (IMG) system at the Joint Genome Institute (JGI). We present an example of how ScalaBLAST, a high-throughput sequence analysis program, harnesses increasingly critical high-performance computing to perform sequence analysis, which is a critical component of maintaining a state-of-the-art sequence data repository. The Integrated Microbial Genomes (IMG) system is a data management and analysis platform for microbial genomes hosted at the JGI. IMG contains both draft and complete JGI genomes integrated with other publicly available microbial genomes of all three domains of life. IMG provides tools and viewers for interactive analysis of genomes, genes and functions, individually or in a comparative context. Most of these tools are based on pre-computed pairwise sequence similarities involving millions of genes. These computations are becoming prohibitively time consuming with the rapid increase in the number of newly sequenced genomes incorporated into IMG and the need to refresh regularly the content of IMG in order to reflect changes in the annotations of existing genomes. Thus, building IMG 2.0 (released on December 1st 2006) entailed reloading from NCBI's RefSeq all the genomes in the previous version of IMG (IMG 1.6, as of September 1st, 2006) together with 1,541 new public microbial, viral and eukaryal genomes, bringing the total of IMG genomes to 2,301. A critical part of building IMG 2.0 involved using PNNL ScalaBLAST software for computing pairwise similarities for over 2.2 million genes in under 26 hours on 1,000 processors, thus illustrating the impact that new generation bioinformatics tools are poised to make in biology. The BLAST algorithm is a familiar bioinformatics application for computing sequence similarity, and has become a workhorse in large-scale

  9. Antimicrobial Susceptibility Test with Plasmonic Imaging and Tracking of Single Bacterial Motions on Nanometer Scale.

    PubMed

    Syal, Karan; Iriya, Rafael; Yang, Yunze; Yu, Hui; Wang, Shaopeng; Haydel, Shelley E; Chen, Hong-Yuan; Tao, Nongjian

    2016-01-26

    Antimicrobial susceptibility tests (ASTs) are important for confirming susceptibility to empirical antibiotics and detecting resistance in bacterial isolates. Currently, most ASTs performed in clinical microbiology laboratories are based on bacterial culturing, which take days to complete for slowly growing microorganisms. A faster AST will reduce morbidity and mortality rates and help healthcare providers administer narrow spectrum antibiotics at the earliest possible treatment stage. We report the development of a nonculture-based AST using a plasmonic imaging and tracking (PIT) technology. We track the motion of individual bacterial cells tethered to a surface with nanometer (nm) precision and correlate the phenotypic motion with bacterial metabolism and antibiotic action. We show that antibiotic action significantly slows down bacterial motion, which can be quantified for development of a rapid phenotypic-based AST. PMID:26637243

  10. Time-scales of hydrological forcing on the geochemistry and bacterial community structure of temperate peat soils

    NASA Astrophysics Data System (ADS)

    Nunes, Flavia L. D.; Aquilina, Luc; De Ridder, Jo; Francez, André-Jean; Quaiser, Achim; Caudal, Jean-Pierre; Vandenkoornhuyse, Philippe; Dufresne, Alexis

    2015-10-01

    Peatlands are an important global carbon reservoir. The continued accumulation of carbon in peatlands depends on the persistence of anoxic conditions, in part induced by water saturation, which prevents oxidation of organic matter, and slows down decomposition. Here we investigate how and over what time scales the hydrological regime impacts the geochemistry and the bacterial community structure of temperate peat soils. Peat cores from two sites having contrasting groundwater budgets were subjected to four controlled drought-rewetting cycles. Pore water geochemistry and metagenomic profiling of bacterial communities showed that frequent water table drawdown induced lower concentrations of dissolved carbon, higher concentrations of sulfate and iron and reduced bacterial richness and diversity in the peat soil and water. Short-term drought cycles (3-9 day frequency) resulted in different communities from continuously saturated environments. Furthermore, the site that has more frequently experienced water table drawdown during the last two decades presented the most striking shifts in bacterial community structure, altering biogeochemical functioning of peat soils. Our results suggest that the increase in frequency and duration of drought conditions under changing climatic conditions or water resource use can induce profound changes in bacterial communities, with potentially severe consequences for carbon storage in temperate peatlands.

  11. Time-scales of hydrological forcing on the geochemistry and bacterial community structure of temperate peat soils

    PubMed Central

    Nunes, Flavia L. D.; Aquilina, Luc; de Ridder, Jo; Francez, André-Jean; Quaiser, Achim; Caudal, Jean-Pierre; Vandenkoornhuyse, Philippe; Dufresne, Alexis

    2015-01-01

    Peatlands are an important global carbon reservoir. The continued accumulation of carbon in peatlands depends on the persistence of anoxic conditions, in part induced by water saturation, which prevents oxidation of organic matter, and slows down decomposition. Here we investigate how and over what time scales the hydrological regime impacts the geochemistry and the bacterial community structure of temperate peat soils. Peat cores from two sites having contrasting groundwater budgets were subjected to four controlled drought-rewetting cycles. Pore water geochemistry and metagenomic profiling of bacterial communities showed that frequent water table drawdown induced lower concentrations of dissolved carbon, higher concentrations of sulfate and iron and reduced bacterial richness and diversity in the peat soil and water. Short-term drought cycles (3–9 day frequency) resulted in different communities from continuously saturated environments. Furthermore, the site that has more frequently experienced water table drawdown during the last two decades presented the most striking shifts in bacterial community structure, altering biogeochemical functioning of peat soils. Our results suggest that the increase in frequency and duration of drought conditions under changing climatic conditions or water resource use can induce profound changes in bacterial communities, with potentially severe consequences for carbon storage in temperate peatlands. PMID:26440376

  12. Time-scales of hydrological forcing on the geochemistry and bacterial community structure of temperate peat soils.

    PubMed

    Nunes, Flavia L D; Aquilina, Luc; de Ridder, Jo; Francez, André-Jean; Quaiser, Achim; Caudal, Jean-Pierre; Vandenkoornhuyse, Philippe; Dufresne, Alexis

    2015-01-01

    Peatlands are an important global carbon reservoir. The continued accumulation of carbon in peatlands depends on the persistence of anoxic conditions, in part induced by water saturation, which prevents oxidation of organic matter, and slows down decomposition. Here we investigate how and over what time scales the hydrological regime impacts the geochemistry and the bacterial community structure of temperate peat soils. Peat cores from two sites having contrasting groundwater budgets were subjected to four controlled drought-rewetting cycles. Pore water geochemistry and metagenomic profiling of bacterial communities showed that frequent water table drawdown induced lower concentrations of dissolved carbon, higher concentrations of sulfate and iron and reduced bacterial richness and diversity in the peat soil and water. Short-term drought cycles (3-9 day frequency) resulted in different communities from continuously saturated environments. Furthermore, the site that has more frequently experienced water table drawdown during the last two decades presented the most striking shifts in bacterial community structure, altering biogeochemical functioning of peat soils. Our results suggest that the increase in frequency and duration of drought conditions under changing climatic conditions or water resource use can induce profound changes in bacterial communities, with potentially severe consequences for carbon storage in temperate peatlands. PMID:26440376

  13. "Black Holes" and Bacterial Pathogenicity: A Large Genomic Deletion that Enhances the Virulence of Shigella spp. and Enteroinvasive Escherichia coli

    NASA Astrophysics Data System (ADS)

    Maurelli, Anthony T.; Fernandez, Reinaldo E.; Bloch, Craig A.; Rode, Christopher K.; Fasano, Alessio

    1998-03-01

    Plasmids, bacteriophages, and pathogenicity islands are genomic additions that contribute to the evolution of bacterial pathogens. For example, Shigella spp., the causative agents of bacillary dysentery, differ from the closely related commensal Escherichia coli in the presence of a plasmid in Shigella that encodes virulence functions. However, pathogenic bacteria also may lack properties that are characteristic of nonpathogens. Lysine decarboxylase (LDC) activity is present in ≈ 90% of E. coli strains but is uniformly absent in Shigella strains. When the gene for LDC, cadA, was introduced into Shigella flexneri 2a, virulence became attenuated, and enterotoxin activity was inhibited greatly. The enterotoxin inhibitor was identified as cadaverine, a product of the reaction catalyzed by LDC. Comparison of the S. flexneri 2a and laboratory E. coli K-12 genomes in the region of cadA revealed a large deletion in Shigella. Representative strains of Shigella spp. and enteroinvasive E. coli displayed similar deletions of cadA. Our results suggest that, as Shigella spp. evolved from E. coli to become pathogens, they not only acquired virulence genes on a plasmid but also shed genes via deletions. The formation of these "black holes," deletions of genes that are detrimental to a pathogenic lifestyle, provides an evolutionary pathway that enables a pathogen to enhance virulence. Furthermore, the demonstration that cadaverine can inhibit enterotoxin activity may lead to more general models about toxin activity or entry into cells and suggests an avenue for antitoxin therapy. Thus, understanding the role of black holes in pathogen evolution may yield clues to new treatments of infectious diseases.

  14. Fractality and entropic scaling in the chromosomal distribution of conserved noncoding elements in the human genome.

    PubMed

    Polychronopoulos, Dimitris; Athanasopoulou, Labrini; Almirantis, Yannis

    2016-06-15

    Conserved non-coding elements (CNEs) are defined using various degrees of sequence identity and thresholds of minimal length. Their conservation frequently exceeds the one observed for protein-coding sequences. We explored the chromosomal distribution of different classes of CNEs in the human genome. We employed two methodologies: the scaling of block entropy and box-counting, with the aim to assess fractal characteristics of different CNE datasets. Both approaches converged to the conclusion that well-developed fractality is characteristic of elements that are either extremely conserved between species or are of ancient origin, i.e. conserved between distant organisms across evolution. Given that CNEs are often clustered around genes, we verified by appropriate gene masking that fractal-like patterns emerge even when elements found in proximity or inside genes are excluded. An evolutionary scenario is proposed, involving genomic events that might account for fractal distribution of CNEs in the human genome as indicated through numerical simulations. PMID:26899868
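The box-counting idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each element is reduced to a single integer coordinate on a chromosome, and the function names (`box_counting_dimension`, `cantor_positions`) and the toy inputs are hypothetical. The estimate is the least-squares slope of log(occupied boxes) against log(1 / box size).

```python
import math


def box_counting_dimension(positions, box_sizes):
    """Estimate a box-counting dimension for points on a line.

    positions: integer coordinates (hypothetical stand-ins for element positions).
    box_sizes: box lengths, in the same units, to tile the line with.
    """
    # For each box size, count how many boxes contain at least one point.
    counts = [len({p // s for p in positions}) for s in box_sizes]
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(c) for c in counts]
    # Ordinary least-squares slope of ys against xs.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def cantor_positions(level):
    """Integer Cantor-like set: numbers below 3**level with base-3 digits 0 or 2."""
    points = [0]
    for i in range(level):
        step = 2 * 3 ** i
        points = points + [p + step for p in points]
    return points


# Toy inputs only: a fully occupied line and a self-similar point set.
uniform_dim = box_counting_dimension(list(range(1024)), [1, 2, 4, 8, 16])
cantor_dim = box_counting_dimension(cantor_positions(6), [1, 3, 9, 27, 81])
```

On these toy inputs the uniform set gives a slope of 1 and the Cantor-like set gives log 2 / log 3, the known dimension of the Cantor set. Real genomic coordinates would need a considered choice of box sizes and of the range over which the log-log relation is actually linear.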

  15. Genome-scale protein expression and structural biology of Plasmodium falciparum and related Apicomplexan organisms.

    PubMed

    Vedadi, Masoud; Lew, Jocelyne; Artz, Jennifer; Amani, Mehrnaz; Zhao, Yong; Dong, Aiping; Wasney, Gregory A; Gao, Mian; Hills, Tanya; Brokx, Stephen; Qiu, Wei; Sharma, Sujata; Diassiti, Angelina; Alam, Zahoor; Melone, Michelle; Mulichak, Anne; Wernimont, Amy; Bray, James; Loppnau, Peter; Plotnikova, Olga; Newberry, Kate; Sundararajan, Emayavaram; Houston, Simon; Walker, John; Tempel, Wolfram; Bochkarev, Alexey; Kozieradzki, Ivona; Edwards, Aled; Arrowsmith, Cheryl; Roos, David; Kain, Kevin; Hui, Raymond

    2007-01-01

    Parasites from the protozoan phylum Apicomplexa are responsible for diseases, such as malaria, toxoplasmosis and cryptosporidiosis, all of which have significantly higher rates of mortality and morbidity in economically underdeveloped regions of the world. Advances in vaccine development and drug discovery are urgently needed to control these diseases and can be facilitated by production of purified recombinant proteins from Apicomplexan genomes and determination of their 3D structures. To date, both heterologous expression and crystallization of Apicomplexan proteins have seen only limited success. In an effort to explore the effectiveness of producing and crystallizing proteins on a genome-scale using a standardized methodology, over 400 distinct Plasmodium falciparum target genes were chosen representing different cellular classes, along with select orthologues from four other Plasmodium species as well as Cryptosporidium parvum and Toxoplasma gondii. From a total of 1008 genes from the seven genomes, 304 (30.2%) produced purified soluble proteins and 97 (9.6%) crystallized, culminating in 36 crystal structures. These results demonstrate that, contrary to previous findings, a standardized platform using Escherichia coli can be effective for genome-scale production and crystallography of Apicomplexan proteins. Predictably, orthologous proteins from different Apicomplexan genomes behaved differently in expression, purification and crystallization, although the overall success rates of Plasmodium orthologues do not differ significantly. Their differences were effectively exploited to elevate the overall productivity to levels comparable to the most successful ongoing structural genomics projects: 229 of the 468 target genes produced purified soluble protein from one or more organisms, with 80 and 32 of the purified targets, respectively, leading to crystals and ultimately structures from one or more orthologues. PMID:17125854

  16. iAK692: A genome-scale metabolic model of Spirulina platensis C1

    PubMed Central

    2012-01-01

    Background: Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. Results: In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using COBRA toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growths were described by phenotypic phase plane (PhPP) analysis. Conclusions: This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a predictive metabolic platform

  17. Bacterial community structures are unique and resilient in full-scale bioenergy systems

    PubMed Central

    Werner, Jeffrey J.; Knights, Dan; Garcia, Marcelo L.; Scalfone, Nicholas B.; Smith, Samual; Yarasheski, Kevin; Cummings, Theresa A.; Beers, Allen R.; Knight, Rob; Angenent, Largus T.

    2011-01-01

    Anaerobic digestion is the most successful bioenergy technology worldwide with, at its core, undefined microbial communities that have poorly understood dynamics. Here, we investigated the relationships of bacterial community structure (>400,000 16S rRNA gene sequences for 112 samples) with function (i.e., bioreactor performance) and environment (i.e., operating conditions) in a yearlong monthly time series of nine full-scale bioreactor facilities treating brewery wastewater (>20,000 measurements). Each of the nine facilities had a unique community structure with an unprecedented level of stability. Using machine learning, we identified a small subset of operational taxonomic units (OTUs; 145 out of 4,962), which predicted the location of the facility of origin for almost every sample (96.4% accuracy). Of these 145 OTUs, syntrophic bacteria were systematically overrepresented, demonstrating that syntrophs rebounded following disturbances. This indicates that resilience, rather than dynamic competition, played an important role in maintaining the necessary syntrophic populations. In addition, we explained the observed phylogenetic differences between all samples on the basis of a subset of environmental gradients (using constrained ordination) and found stronger relationships between community structure and its function rather than its environment. These relationships were strongest for two performance variables—methanogenic activity and substrate removal efficiency—both of which were also affected by microbial ecology because these variables were correlated with community evenness (at any given time) and variability in phylogenetic structure (over time), respectively. Thus, we quantified relationships between community structure and function, which opens the door to engineer communities with superior functions. PMID:21368115

  18. Nanometer-scale characterization of exceptionally preserved bacterial fossils in Paleocene phosphorites from Ouled Abdoun (Morocco).

    PubMed

    Cosmidis, J; Benzerara, K; Gheerbrant, E; Estève, I; Bouya, B; Amaghzaz, M

    2013-03-01

    Micrometer-sized spherical and rod-shaped forms have been reported in many phosphorites and often interpreted as microbes fossilized by apatite, based on their morphologic resemblance with modern bacteria inferred by scanning electron microscopy (SEM) observations. This interpretation supports models involving bacteria in the formation of phosphorites. Here, we studied a phosphatic coprolite of Paleocene age originating from the Ouled Abdoun phosphate basin (Morocco) down to the nanometer-scale using focused ion beam milling, transmission electron microscopy (TEM), and scanning transmission x-ray microscopy (STXM) coupled with x-ray absorption near-edge structure spectroscopy (XANES). The coprolite, exclusively composed of francolite (a carbonate-fluorapatite), is formed by the accumulation of spherical objects, delimited by a thin envelope, and whose apparent diameters are between 0.5 and 3 μm. The envelope of the spheres is composed of a continuous crown dense to electrons, which measures 20-40 nm in thickness. It is surrounded by two thinner layers that are more porous and transparent to electrons and enriched in organic carbon. The observed spherical objects are very similar to bacteria encrusted in hydroxyapatite as observed in laboratory experiments. We suggest that they are Gram-negative bacteria fossilized by francolite, the precipitation of which started within the periplasm of the cells. We discuss the role of bacteria in the fossilization mechanism and propose that they could have played an active role in the formation of francolite. This study shows that ancient phosphorites can contain fossil biological subcellular structures as fine as a bacterial periplasm. Moreover, we demonstrate that while morphological information provided by SEM analyses is valuable, the use of additional nanoscale analyses is a powerful approach to help infer the biogenicity of biomorphs found in phosphorites. A more systematic use of this approach could considerably

  19. FVGWAS: Fast voxelwise genome wide association analysis of large-scale imaging genetic data.

    PubMed

    Huang, Meiyan; Nichols, Thomas; Huang, Chao; Yu, Yang; Lu, Zhaohua; Knickmeyer, Rebecca C; Feng, Qianjin; Zhu, Hongtu

    2015-09-01

    More and more large-scale imaging genetic studies are being widely conducted to collect a rich set of imaging, genetic, and clinical data to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. Several major big-data challenges arise from testing genome-wide ($N_C > 12$ million known variants) associations with signals at millions of locations ($N_V \sim 10^6$) in the brain from thousands of subjects ($n \sim 10^3$). The aim of this paper is to develop a Fast Voxelwise Genome Wide Association analysiS (FVGWAS) framework to efficiently carry out whole-genome analyses of whole-brain data. FVGWAS consists of three components including a heteroscedastic linear model, a global sure independence screening (GSIS) procedure, and a detection procedure based on wild bootstrap methods. Specifically, for standard linear association, the computational complexity is $O(n N_V N_C)$ for the voxelwise genome wide association analysis (VGWAS) method compared with $O((N_C + N_V) n^2)$ for FVGWAS. Simulation studies show that FVGWAS is an efficient method of searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. Finally, we have successfully applied FVGWAS to a large-scale imaging genetic data analysis of ADNI data with 708 subjects, 193,275 voxels in RAVENS maps, and 501,584 SNPs, and the total processing time was 203,645 s for a single CPU. Our FVGWAS may be a valuable statistical toolbox for large-scale imaging genetic analysis as the field is rapidly advancing with ultra-high-resolution imaging and whole-genome sequencing. PMID:26025292
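To make the two complexity terms quoted in this abstract concrete, the short Python sketch below plugs in the ADNI figures it reports (708 subjects, 193,275 voxels, 501,584 SNPs). The function names are hypothetical, and the numbers are leading-order operation counts only: big-O notation hides constant factors, so the ratio is an illustration of scaling and not a prediction of measured runtime.

```python
def vgwas_cost(n_subjects, n_voxels, n_snps):
    """Leading-order term quoted for standard voxelwise GWAS: n * N_V * N_C."""
    return n_subjects * n_voxels * n_snps


def fvgwas_cost(n_subjects, n_voxels, n_snps):
    """Leading-order term quoted for FVGWAS: (N_C + N_V) * n^2."""
    return (n_snps + n_voxels) * n_subjects ** 2


# Figures reported in the abstract for the ADNI analysis.
n_subjects, n_voxels, n_snps = 708, 193_275, 501_584

ratio = (vgwas_cost(n_subjects, n_voxels, n_snps)
         / fvgwas_cost(n_subjects, n_voxels, n_snps))
print(f"VGWAS / FVGWAS leading-order ratio: {ratio:.1f}")
```

Algebraically the ratio simplifies to N_V * N_C / ((N_C + N_V) * n), so the advantage grows with the number of voxels and SNPs and shrinks as the sample size increases.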

  20. Use of bacterial whole-genome sequencing to investigate local persistence and spread in bovine tuberculosis

    PubMed Central

    Trewby, Hannah; Wright, David; Breadon, Eleanor L.; Lycett, Samantha J.; Mallon, Tom R.; McCormick, Carl; Johnson, Paul; Orton, Richard J.; Allen, Adrian R.; Galbraith, Julie; Herzyk, Pawel; Skuce, Robin A.; Biek, Roman; Kao, Rowland R.

    2016-01-01

    Mycobacterium bovis is the causal agent of bovine tuberculosis, one of the most important diseases currently facing the UK cattle industry. Here, we use high-density whole genome sequencing (WGS) in a defined sub-population of M. bovis in 145 cattle across 66 herd breakdowns to gain insights into local spread and persistence. We show that despite low divergence among isolates, WGS can in principle expose contributions of under-sampled host populations to M. bovis transmission. However, we demonstrate that in our data such a signal is due to molecular type switching, which had been previously undocumented for M. bovis. Isolates from farms with a known history of direct cattle movement between them did not show a statistical signal of higher genetic similarity. Despite an overall signal of genetic isolation by distance, genetic distances also showed no apparent relationship with spatial distance among affected farms over distances <5 km. Using simulations, we find that even over the brief evolutionary timescale covered by our data, Bayesian phylogeographic approaches are feasible. Applying such approaches showed that M. bovis dispersal in this system is heterogeneous but slow overall, averaging 2 km/year. These results confirm that widespread application of WGS to M. bovis will bring novel and important insights into the dynamics of M. bovis spread and persistence, but that the current questions most pertinent to control will be best addressed using approaches that more directly integrate WGS with additional epidemiological data. PMID:26972511

  1. Genomic Adaptations to the Loss of a Conserved Bacterial DNA Methyltransferase

    PubMed Central

    2015-01-01

    CcrM is an orphan DNA methyltransferase nearly universally conserved in a vast group of Alphaproteobacteria. In Caulobacter crescentus, it controls the expression of key genes involved in the regulation of the cell cycle and cell division. Here, we demonstrate, using an experimental evolution approach, that C. crescentus can significantly compensate, through easily accessible genetic changes like point mutations, for the severe loss in fitness due to the absence of CcrM, quickly improving its growth rate and cell morphology in rich medium. By analyzing the compensatory mutations genome-wide in 12 clones sampled from independent ΔccrM populations evolved for ~300 generations, we demonstrated that each of the twelve clones carried at least one mutation that potentially stimulated ftsZ expression, suggesting that the low intracellular levels of FtsZ are the major burden of ΔccrM mutants. In addition, we demonstrate that the phosphoenolpyruvate-carbohydrate phosphotransfer system (PTS) actually modulates ftsZ and mipZ transcription, uncovering a previously unsuspected link between metabolic regulation and cell division in Alphaproteobacteria. We present evidence that point mutations found in genes encoding proteins of the PTS provide the strongest fitness advantage to ΔccrM cells cultivated in rich medium despite being disadvantageous in minimal medium. This environmental sign epistasis might prevent such mutations from getting fixed under changing natural conditions, adding a plausible explanation for the broad conservation of CcrM. PMID:26220966

  2. Multidrug-resistant Escherichia coli soft tissue infection investigated with bacterial whole genome sequencing.

    PubMed

    Buchanan, Ruaridh; Stoesser, Nicole; Crook, Derrick; Bowler, Ian C J W

    2014-01-01

    A 45-year-old man with dilated cardiomyopathy presented with acute leg pain and erythema suggestive of necrotising fasciitis. Initial surgical exploration revealed no necrosis and treatment for a soft tissue infection was started. Blood and tissue cultures unexpectedly grew a Gram-negative bacillus, subsequently identified by an automated broth microdilution phenotyping system as an extended-spectrum β-lactamase producing Escherichia coli. The patient was treated with a 3-week course of antibiotics (ertapenem followed by ciprofloxacin) and debridement for small areas of necrosis, followed by skin grafting. The presence of E. coli triggered investigation of both host and pathogen. The patient was found to have previously undiagnosed liver disease, a risk factor for E. coli soft tissue infection. Whole genome sequencing of isolates from all specimens confirmed they were clonal, of sequence type ST131 and associated with a likely plasmid-associated AmpC (CMY-2), several other resistance genes and a number of virulence factors. PMID:25331151

  3. Use of bacterial whole-genome sequencing to investigate local persistence and spread in bovine tuberculosis.

    PubMed

    Trewby, Hannah; Wright, David; Breadon, Eleanor L; Lycett, Samantha J; Mallon, Tom R; McCormick, Carl; Johnson, Paul; Orton, Richard J; Allen, Adrian R; Galbraith, Julie; Herzyk, Pawel; Skuce, Robin A; Biek, Roman; Kao, Rowland R

    2016-03-01

    Mycobacterium bovis is the causal agent of bovine tuberculosis, one of the most important diseases currently facing the UK cattle industry. Here, we use high-density whole genome sequencing (WGS) in a defined sub-population of M. bovis in 145 cattle across 66 herd breakdowns to gain insights into local spread and persistence. We show that despite low divergence among isolates, WGS can in principle expose contributions of under-sampled host populations to M. bovis transmission. However, we demonstrate that in our data such a signal is due to molecular type switching, which had been previously undocumented for M. bovis. Isolates from farms with a known history of direct cattle movement between them did not show a statistical signal of higher genetic similarity. Despite an overall signal of genetic isolation by distance, genetic distances also showed no apparent relationship with spatial distance among affected farms over distances <5 km. Using simulations, we find that even over the brief evolutionary timescale covered by our data, Bayesian phylogeographic approaches are feasible. Applying such approaches showed that M. bovis dispersal in this system is heterogeneous but slow overall, averaging 2 km/year. These results confirm that widespread application of WGS to M. bovis will bring novel and important insights into the dynamics of M. bovis spread and persistence, but that the current questions most pertinent to control will be best addressed using approaches that more directly integrate WGS with additional epidemiological data. PMID:26972511

  4. Evaluation and optimisation of bacterial genomic DNA extraction for no-culture techniques applied to vinegars.

    PubMed

    Mamlouk, Dhouha; Hidalgo, Claudio; Torija, María-Jesús; Gullo, Maria

    2011-10-01

    Direct genomic DNA extraction from vinegars was set up, and its suitability for PCR assays was evaluated by PCR/DGGE and sequencing of the 16S rRNA gene. The method was tested on 12 intermediary products of special vinegars, fruit vinegars and condiments produced from different raw materials and procedures. DNA extraction was performed on pellets by chemical, enzymatic, and resin-mediated methods and their modifications. Suitable yield and DNA purity were obtained by modification of a method based on the use of PVP/CTAB to remove polyphenolic components and exopolysaccharides. By sequencing of bands from DGGE gel, Gluconacetobacter europaeus, Acetobacter malorum/cerevisiae and Acetobacter orleanensis were detected as main species in samples having more than 4% of acetic acid content. From samples having no acetic acid content, sequences retrieved from excised bands revealed high similarity with prokaryotes with no role in vinegar fermentation: Burkholderia spp., Cupriavidus spp., Lactococcus lactis and Leuconostoc mesenteroides. The method was suitable for no-culture study of vinegars containing polyphenols and exopolysaccharides, allowing a more complete assessment of vinegar bacteria. PMID:21839388

  5. Genomic-scale capture and sequencing of endogenous DNA from feces

    PubMed Central

    Perry, George H.; Marioni, John C.; Melsted, Páll; Gilad, Yoav

    2010-01-01

    Genomic-level analyses of DNA from non-invasive sources would facilitate powerful conservation and evolutionary studies in natural populations of endangered and otherwise elusive species. However, the typical low quantity and poor quality of DNA that is extracted from non-invasive samples have generally precluded such work. Here we apply a modified DNA capture protocol that, when used in combination with massively-parallel sequencing technology, facilitates efficient and highly-accurate resequencing of megabases of specified nuclear genomic regions from fecal DNA samples. We validated our approach by comparing genetic variants identified from corresponding fecal and blood DNA samples of six western chimpanzees (Pan troglodytes verus) across more than 1.5 megabases of chromosome 21, chromosome X, and the complete mitochondrial genome. Our results suggest that it is now feasible to conduct genomic studies in natural populations for which constraints on invasive sampling have otherwise long been a barrier. The data we collected also provided an opportunity to examine western chimpanzee genetic diversity at unprecedented scale. Despite high mitochondrial genome diversity (π = 0.585%), western chimpanzees have a low ratio (0.42) of X chromosomal (π = 0.034%) to autosomal (chromosome 21 π = 0.081%) sequence diversity, a pattern that may reflect an unusual demographic history of this subspecies. PMID:21054605
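
    The X-to-autosome diversity ratio quoted in this abstract can be reproduced directly from the two π values it reports. The sketch below is only that arithmetic; the neutral expectation of 0.75 is a standard textbook assumption (equal numbers of breeding males and females) and is not stated in the abstract.

```python
# Nucleotide diversity (pi, in percent) as reported in the abstract.
PI_X = 0.034          # X chromosome
PI_AUTOSOME = 0.081   # chromosome 21, used as the autosomal reference

# Assumption (not from the abstract): under a simple neutral model with equal
# numbers of breeding males and females, the X chromosome has 3/4 the
# effective population size of an autosome, so the expected ratio is 0.75.
NEUTRAL_EXPECTATION = 0.75

ratio = PI_X / PI_AUTOSOME
print(f"X/autosome diversity ratio: {ratio:.2f} "
      f"(neutral expectation {NEUTRAL_EXPECTATION})")
```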

  6. Investigating host-pathogen behavior and their interaction using genome-scale metabolic network models.

    PubMed

    Sadhukhan, Priyanka P; Raghunathan, Anu

    2014-01-01

    Genome Scale Metabolic Modeling methods represent one way to compute whole cell function starting from the genome sequence of an organism and contribute towards understanding and predicting the genotype-phenotype relationship. About 80 models spanning all the kingdoms of life from archaea to eukaryotes have been built to date and used to interrogate cell phenotype under varying conditions. These models have been used not only to understand the flux distribution in evolutionarily conserved pathways like glycolysis and the Krebs cycle but also in applications ranging from value added product formation in Escherichia coli to predicting inborn errors of Homo sapiens metabolism. This chapter describes a protocol that delineates the process of genome scale metabolic modeling for analysing host-pathogen behavior and interaction using flux balance analysis (FBA). The steps discussed in the process include (1) reconstruction of a metabolic network from the genome sequence, (2) its representation in a precise mathematical framework, (3) its translation to a model, and (4) the analysis using linear algebra and optimization. The methods for biological interpretations of computed cell phenotypes in the context of individual host and pathogen models and their integration are also discussed. PMID:25048144
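
    Steps (2) and (3) of the protocol outlined in this abstract can be illustrated on a toy example. The sketch below is not the chapter's protocol: the three-reaction network, its bounds, and all names are hypothetical, and a real FBA run would hand the same stoichiometric matrix and bounds to a linear-programming solver. Here the network is an unbranched chain, so the optimum can be written down without a solver, and the code only checks the steady-state condition S·v = 0 and the flux bounds.

```python
# Toy network (hypothetical): uptake -> A -> B -> biomass
# Rows are metabolites (A, B); columns are reactions (uptake, convert, biomass).
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by convert
    [0, 1, -1],   # B: produced by convert, consumed by biomass
]
LOWER = [0.0, 0.0, 0.0]        # all reactions irreversible
UPPER = [10.0, 1000.0, 1000.0]  # uptake is the limiting bound

def mass_balance(S, v):
    """Return S·v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

def is_feasible(S, v, lower, upper, tol=1e-9):
    """True if v is at steady state (S·v = 0) and respects the flux bounds."""
    at_steady_state = all(abs(x) <= tol for x in mass_balance(S, v))
    within_bounds = all(lo - tol <= x <= hi + tol
                        for lo, x, hi in zip(lower, v, upper))
    return at_steady_state and within_bounds

# In an unbranched chain, steady state forces every flux to be equal, so the
# maximum biomass flux is simply the tightest upper bound along the chain.
v_max = min(UPPER)
optimal = [v_max] * 3

print("optimal fluxes:", optimal)
print("feasible:", is_feasible(S, optimal, LOWER, UPPER))
```

    For branched, genome-scale networks this shortcut no longer applies, which is why step (4) relies on optimization.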

  7. Genome-scale reconstruction of the sigma factor network in Escherichia coli: topology and functional states

    PubMed Central

    2014-01-01

    Background: At the beginning of the transcription process, the RNA polymerase (RNAP) core enzyme requires a σ-factor to recognize the genomic location at which the process initiates. Although the crucial role of σ-factors has long been appreciated and characterized for many individual promoters, we do not yet have a genome-scale assessment of their function. Results: Using multiple genome-scale measurements, we elucidated the network of σ-factor and promoter interactions in Escherichia coli. The reconstructed network includes 4,724 σ-factor-specific promoters corresponding to transcription units (TUs), representing an increase of more than 300% over what has been previously reported. The reconstructed network was used to investigate competition between alternative σ-factors (the σ70 and σ38 regulons), confirming the competition model of σ substitution and negative regulation by alternative σ-factors. Comparison with σ-factor binding in Klebsiella pneumoniae showed that transcriptional regulation of conserved genes in closely related species is unexpectedly divergent. Conclusions: The reconstructed network reveals the regulatory complexity of the promoter architecture in prokaryotic genomes, and opens a path to the direct determination of the systems biology of their transcriptional regulatory networks. PMID:24461193

  8. Informed Consent in Genome-Scale Research: What Do Prospective Participants Think?

    PubMed Central

    Trinidad, Susan Brown; Fullerton, Stephanie M.; Bares, Julie M.; Jarvik, Gail P.; Larson, Eric B.; Burke, Wylie

    2012-01-01

    Background: To promote effective genome-scale research, genomic and clinical data for large population samples must be collected, stored, and shared. Methods: We conducted focus groups with 45 members of a Seattle-based integrated healthcare delivery system to learn about their views and expectations for informed consent in genome-scale studies. Results: Participants viewed information about study purpose, aims, and how and by whom study data could be used to be at least as important as information about risks and possible harms. They generally supported a tiered consent approach for specific issues, including research purpose, data sharing, and access to individual research results. Participants expressed a continuum of opinions with respect to the acceptability of broad consent, ranging from completely acceptable to completely unacceptable. Older participants were more likely to view the consent process in relational – rather than contractual – terms, compared with younger participants. The majority of participants endorsed seeking study subjects’ permission regarding material changes in study purpose and data sharing. Conclusions: Although this study sample was limited in terms of racial and socioeconomic diversity, our results suggest a strong positive interest in genomic research on the part of at least some prospective participants and indicate a need for increased public engagement, as well as strategies for ongoing communication with study participants. PMID:23493836

  9. Analysis of Comparative Sequence and Genomic Data to Verify Phylogenetic Relationship and Explore a New Subfamily of Bacterial Lipases.

    PubMed

    Masomian, Malihe; Rahman, Raja Noor Zaliha Raja Abd; Salleh, Abu Bakar; Basri, Mahiran

    2016-01-01

    Thermostable and organic solvent-tolerant enzymes have significant potential in a wide range of synthetic reactions in industry due to their inherent stability at high temperatures and their ability to endure harsh organic solvents. In this study, a novel gene encoding a true lipase was isolated by construction of a genomic DNA library of thermophilic Aneurinibacillus thermoaerophilus strain HZ in an Escherichia coli plasmid vector. Sequence analysis revealed that HZ lipase had 62% identity to a putative lipase from Bacillus pseudomycoides. The closest characterized lipases to the HZ lipase are thermostable Bacillus and Geobacillus lipases belonging to subfamily I.5, with ≤ 57% identity. The amino acid sequence analysis of HZ lipase identified a conserved pentapeptide containing the active serine, GHSMG, and a Ca2+-binding motif, GCYGSD, in the enzyme. Protein structure modeling showed that HZ lipase consisted of an α/β hydrolase fold and a lid domain. Protein sequence alignment, conserved regions analysis, clustal distance matrix and amino acid composition illustrated differences between HZ lipase and other thermostable lipases. Phylogenetic analysis revealed that this lipase represented a new subfamily of family I of bacterial true lipases, classified as family I.9. The HZ lipase was expressed under promoter Plac using IPTG and was characterized. The recombinant enzyme showed optimal activity at 65 °C and retained ≥ 97% activity after incubation at 50 °C for 1 h. The HZ lipase was stable in various polar and non-polar organic solvents. PMID:26934700

  10. Analysis of Comparative Sequence and Genomic Data to Verify Phylogenetic Relationship and Explore a New Subfamily of Bacterial Lipases

    PubMed Central

    Salleh, Abu Bakar; Basri, Mahiran

    2016-01-01

    Thermostable and organic solvent-tolerant enzymes have significant potential in a wide range of synthetic reactions in industry due to their inherent stability at high temperatures and their ability to endure harsh organic solvents. In this study, a novel gene encoding a true lipase was isolated by construction of a genomic DNA library of thermophilic Aneurinibacillus thermoaerophilus strain HZ in an Escherichia coli plasmid vector. Sequence analysis revealed that HZ lipase had 62% identity to a putative lipase from Bacillus pseudomycoides. The closest characterized lipases to the HZ lipase are thermostable Bacillus and Geobacillus lipases belonging to subfamily I.5, with ≤ 57% identity. The amino acid sequence analysis of HZ lipase identified a conserved pentapeptide containing the active serine, GHSMG, and a Ca2+-binding motif, GCYGSD, in the enzyme. Protein structure modeling showed that HZ lipase consisted of an α/β hydrolase fold and a lid domain. Protein sequence alignment, conserved regions analysis, clustal distance matrix and amino acid composition illustrated differences between HZ lipase and other thermostable lipases. Phylogenetic analysis revealed that this lipase represented a new subfamily of family I of bacterial true lipases, classified as family I.9. The HZ lipase was expressed under promoter Plac using IPTG and was characterized. The recombinant enzyme showed optimal activity at 65 °C and retained ≥ 97% activity after incubation at 50 °C for 1 h. The HZ lipase was stable in various polar and non-polar organic solvents. PMID:26934700

  11. A protocol for generating a high-quality genome-scale metabolic reconstruction

    PubMed Central

    Thiele, Ines; Palsson, Bernhard Ø.

    2011-01-01

    Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have developed over the past 10 years. These reconstructions represent structured knowledge-bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates myriad computational biological studies including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics, and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge-bases. Here, we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction as well as common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process. PMID:20057383

  12. A Systems Approach to Predict Oncometabolites via Context-Specific Genome-Scale Metabolic Networks

    PubMed Central

    Nam, Hojung; Campodonico, Miguel; Bordbar, Aarash; Hyduke, Daniel R.; Kim, Sangwoo; Zielinski, Daniel C.; Palsson, Bernhard O.

    2014-01-01

    Altered metabolism in cancer cells has been viewed as a passive response required for a malignant transformation. However, this view has changed through the recently described metabolic oncogenic factors: mutated isocitrate dehydrogenases (IDH), succinate dehydrogenase (SDH), and fumarate hydratase (FH) that produce oncometabolites that competitively inhibit epigenetic regulation. In this study, we demonstrate in silico predictions of oncometabolites that have the potential to dysregulate epigenetic controls in nine types of cancer by incorporating massive scale genetic mutation information (collected from more than 1,700 cancer genomes), expression profiling data, and deploying Recon 2 to reconstruct context-specific genome-scale metabolic models. Our analysis predicted 15 compounds and 24 substructures of potential oncometabolites that could result from the loss-of-function and gain-of-function mutations of metabolic enzymes, respectively. These results suggest a substantial potential for discovering unidentified oncometabolites in various forms of cancers. PMID:25232952

  13. Evaluation of Genome-Enabled Selection for Bacterial Cold Water Disease Resistance Using Progeny Performance Data in Rainbow Trout: Insights on Genotyping Methods and Genomic Prediction Models

    PubMed Central

    Vallejo, Roger L.; Leeds, Timothy D.; Fragomeni, Breno O.; Gao, Guangtu; Hernandez, Alvaro G.; Misztal, Ignacy; Welch, Timothy J.; Wiens, Gregory D.; Palti, Yniv

    2016-01-01

    Bacterial cold water disease (BCWD) causes significant economic losses in salmonid aquaculture, and traditional family-based breeding programs aimed at improving BCWD resistance have been limited to exploiting only between-family variation. We used genomic selection (GS) models to predict genomic breeding values (GEBVs) for BCWD resistance in 10 families from the first generation of the NCCCWA BCWD resistance breeding line, compared the predictive ability (PA) of GEBVs to pedigree-based estimated breeding values (EBVs), and compared the impact of two SNP genotyping methods on the accuracy of GEBV predictions. The BCWD phenotypes survival days (DAYS) and survival status (STATUS) had been recorded in training fish (n = 583) subjected to experimental BCWD challenge. Training fish, and their full sibs without phenotypic data that were used as parents of the subsequent generation, were genotyped using two methods: restriction-site associated DNA (RAD) sequencing and the Rainbow Trout Axiom® 57 K SNP array (Chip). Animal-specific GEBVs were estimated using four GS models: BayesB, BayesC, single-step GBLUP (ssGBLUP), and weighted ssGBLUP (wssGBLUP). Family-specific EBVs were estimated using pedigree and phenotype data in the training fish only. The PA of EBVs and GEBVs was assessed by correlating mean progeny phenotype (MPP) with mid-parent EBV (family-specific) or GEBV (animal-specific). The best GEBV predictions were similar to EBV with PA values of 0.49 and 0.46 vs. 0.50 and 0.41 for DAYS and STATUS, respectively. Among the GEBV prediction methods, ssGBLUP consistently had the highest PA. The RAD genotyping platform had GEBVs with similar PA to those of GEBVs from the Chip platform. The PA of ssGBLUP and wssGBLUP methods was higher with the Chip, but for BayesB and BayesC methods it was higher with the RAD platform. The overall GEBV accuracy in this study was low to moderate, likely due to the small training sample used. This study explored the potential of GS for

  14. Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.

    PubMed

    Borodovsky, M; Rudd, K E; Koonin, E V

    1994-11-11

    The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
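
    The percentages quoted in this abstract follow from its four ORF counts. As a quick consistency check, the sketch below recomputes them; the variable and function names are illustrative only.

```python
# ORF counts as reported in the abstract.
GENEMARK_ORFS = 354      # putative expressed ORFs predicted by GeneMark
BLAST_ORFS = 208         # ORFs with significant BLASTX/TBLASTN similarity
SUPPORTED_BY_BOTH = 182  # ORFs supported by both GeneMark and BLAST
ANCIENT_CONSERVED = 73   # putative new genes in ancient conserved families

def percent(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)

both_of_genemark = percent(SUPPORTED_BY_BOTH, GENEMARK_ORFS)
both_of_blast = percent(SUPPORTED_BY_BOTH, BLAST_ORFS)
conserved_of_genemark = percent(ANCIENT_CONSERVED, GENEMARK_ORFS)

print(f"{both_of_genemark}% of GeneMark hits are also BLAST-supported")
print(f"{both_of_blast}% of BLAST hits are also GeneMark-supported")
print(f"{conserved_of_genemark}% of GeneMark predictions are ancient conserved")
```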

  15. Mutational analysis of the human mitochondrial genome branches into the realm of bacterial genetics

    SciTech Connect

    Howell, N.

    1996-10-01

    This is shaping up as a vintage year for studies of the genetics and evolution of the human mitochondrial genome (mtDNA). In a theoretical and experimental tour de force, Shenkar et al. (1996), on pages 772-780 of this issue, derive the mutation rate of the 4,977-bp (or "common") deletion in the human mtDNA through refinement and extension of fluctuation analysis, a technique that was first used >50 years ago. Shenkar et al., in essence, have solved or bypassed many of the difficulties that are inherent in the application of fluctuation analysis to human mitochondrial gene mutations. Their study is important for two principal reasons. In the first place, high levels of this deletion cause a variety of pathological disorders, including Kearns-Sayre syndrome and chronic progressive external ophthalmoplegia. Their current report, therefore, is a major step in the elucidation of the molecular genetic pathogenesis of this group of mitochondrial disorders. For example, it now may be feasible to analyze the effects of selection on transmission and segregation of this deletion and, perhaps, other mtDNA mutations as well. Second, and at a broader level, the approach of Shenkar et al. should find widespread applicability to the study of other mtDNA mutations. It has been recognized for several years that mammalian mtDNA mutates much more rapidly than nuclear DNA, a phenomenon with potentially profound evolutionary implications. It is exciting and useful, both experimentally and theoretically, that this "old" approach can be used for "new" applications. 56 refs.

  16. Similar Processes but Different Environmental Filters for Soil Bacterial and Fungal Community Composition Turnover on a Broad Spatial Scale

    PubMed Central

    Chemidlin Prévost-Bouré, Nicolas; Dequiedt, Samuel; Thioulouse, Jean; Lelièvre, Mélanie; Saby, Nicolas P. A.; Jolivet, Claudy; Arrouays, Dominique; Plassart, Pierre; Lemanceau, Philippe; Ranjard, Lionel

    2014-01-01

    Spatial scaling of microorganisms has been demonstrated over the last decade. However, the processes and environmental filters shaping soil microbial community structure on a broad spatial scale still need to be refined and ranked. Here, we compared bacterial and fungal community composition turnovers through a biogeographical approach on the same soil sampling design at a broad spatial scale (area range: 13,300 to 31,000 km²): i) to examine their spatial structuring; ii) to investigate the relative importance of environmental selection and spatial autocorrelation in determining their community composition turnover; and iii) to identify and rank the relevant environmental filters and scales involved in their spatial variations. Molecular fingerprinting of soil bacterial and fungal communities was performed on 413 soils from four French regions of contrasting environmental heterogeneity (Landes [...]). Bacterial and fungal community composition turnovers were mainly driven by environmental selection explaining from 10% to 20% of community composition variations, but spatial variables also explained 3% to 9% of total variance. These variables highlighted significant spatial autocorrelation of both communities unexplained by the environmental variables measured and could partly be explained by dispersal limitations. Although the identified filters and their hierarchy were dependent on the region and organism, selection was systematically based on a common group of environmental variables: pH, trophic resources, texture and land use. Spatial autocorrelation was also important at coarse (80 to

  17. Large scale distribution of bacterial communities in the upper Paraná River floodplain

    PubMed Central

    Chiaramonte, Josiane Barros; Roberto, Maria do Carmo; Pagioro, Thomaz Aurélio

    2014-01-01

    Bacterial communities play a central role in nutrient cycling in aquatic habitats. Therefore, it is important to analyze how these communities are distributed across different locations. Thirty-six sites in the upper Paraná River floodplain were surveyed to determine the influence of environmental variables on bacterial community composition. The sites are classified as rivers, channels, and floodplain lakes connected or unconnected to the main river channel. The bacterial community structure was analyzed by the fluorescent in situ hybridization (FISH) technique, based on the frequency of the main domains Bacteria and Archaea, subdivisions of the phylum Proteobacteria (Alpha-proteobacteria, Beta-proteobacteria, Gamma-proteobacteria), and the Cytophaga-Flavobacterium cluster. The bacterial community differed in density and in the frequency of the studied groups, and these differences responded to distinct characteristics of the three main rivers of the floodplain as well as to the classification of the environments found in this floodplain. We conclude that dissimilarities in the bacterial community structure are related to environmental heterogeneity, and that the limnological variables that best predicted bacterial communities in the upper Paraná River floodplain were total and ammoniacal nitrogen, orthophosphate, and chlorophyll-a. PMID:25763022

  18. Diagnostics for Stochastic Genome-Scale Modeling via Model Slicing and Debugging

    PubMed Central

    Tsai, Kevin J.; Chang, Chuan-Hsiung

    2014-01-01

    Modeling of biological behavior has evolved from simple gene expression plots represented by mathematical equations to genome-scale systems biology networks. However, due to obstacles in complexity and scalability of creating genome-scale models, several biological modelers have turned to programming or scripting languages and away from modeling fundamentals. In doing so, they have traded the ability to have exchangeable, standardized model representation formats, while those that remain true to standardized model representation are faced with challenges in model complexity and analysis. We have developed a model diagnostic methodology inspired by program slicing and debugging and demonstrate the effectiveness of the methodology on a genome-scale metabolic network model published in the BioModels database. The computer-aided identification revealed specific points of interest such as reversibility of reactions, initialization of species amounts, and parameter estimation that improved a candidate cell's adenosine triphosphate production. We then compared the advantages of our methodology over other modeling techniques such as model checking and model reduction. A software application that implements the methodology is available at http://gel.ym.edu.tw/gcs/. PMID:25368989

  19. Achieving Metabolic Flux Analysis for S. cerevisiae at a Genome-Scale: Challenges, Requirements, and Considerations

    PubMed Central

    Gopalakrishnan, Saratram; Maranas, Costas D.

    2015-01-01

    Recent advances in 13C-Metabolic flux analysis (13C-MFA) have increased its capability to accurately resolve fluxes using a genome-scale model with narrow confidence intervals without pre-judging the activity or inactivity of alternate metabolic pathways. However, the necessary precautions, computational challenges, and minimum data requirements for successful analysis remain poorly established. This review aims to establish the necessary guidelines for performing 13C-MFA at the genome-scale for a compartmentalized eukaryotic system such as yeast in terms of model and data requirements, while addressing key issues such as statistical analysis and network complexity. We describe the various approaches used to simplify the genome-scale model in the absence of sufficient experimental flux measurements, the availability and generation of reaction atom mapping information, and the experimental flux and metabolite labeling distribution measurements to ensure statistical validity of the obtained flux distribution. Organism-specific challenges such as the impact of compartmentalization of metabolism, variability of biomass composition, and the cell-cycle dependence of metabolism are discussed. Identification of errors arising from incorrect gene annotation and suggested alternate routes using MFA are also highlighted. PMID:26393660

  20. Comparative Genome-Scale Reconstruction of Gapless Metabolic Networks for Present and Ancestral Species

    PubMed Central

    Pitkänen, Esa; Jouhten, Paula; Hou, Jian; Syed, Muhammad Fahad; Blomberg, Peter; Kludas, Jana; Oja, Merja; Holm, Liisa; Penttilä, Merja; Rousu, Juho; Arvas, Mikko

    2014-01-01

    We introduce a novel computational approach, CoReCo, for comparative metabolic reconstruction and provide genome-scale metabolic network models for 49 important fungal species. Leveraging the exponential growth in sequenced genome availability, our method reconstructs genome-scale gapless metabolic networks simultaneously for a large number of species by integrating sequence data in a probabilistic framework. High reconstruction accuracy is demonstrated by comparisons to the well-curated Saccharomyces cerevisiae consensus model and large-scale knock-out experiments. Our comparative approach is particularly useful in scenarios where the quality of available sequence data is lacking, and when reconstructing evolutionarily distant species. Moreover, the reconstructed networks are fully carbon mapped, allowing their use in 13C flux analysis. We demonstrate the functionality and usability of the reconstructed fungal models with a computational steady-state biomass production experiment, as these fungi include some of the most important production organisms in industrial biotechnology. In contrast to many existing reconstruction techniques, only minimal manual effort is required before the reconstructed models are usable in flux balance experiments. CoReCo is available at http://esaskar.github.io/CoReCo/. PMID:24516375

  1. Modeling Method for Increased Precision and Scope of Directly Measurable Fluxes at a Genome-Scale.

    PubMed

    McCloskey, Douglas; Young, Jamey D; Xu, Sibei; Palsson, Bernhard O; Feist, Adam M

    2016-04-01

    Metabolic flux analysis (MFA) is considered to be the gold standard for determining the intracellular flux distribution of biological systems. The majority of work using MFA has been limited to core models of metabolism due to challenges in implementing genome-scale MFA and the undesirable trade-off between increased scope and decreased precision in flux estimations. This work presents a tunable workflow for expanding the scope of MFA to the genome-scale without trade-offs in flux precision. The genome-scale MFA model presented here, iDM2014, accounts for 537 net reactions, which includes the core pathways of traditional MFA models and also covers the additional pathways of purine, pyrimidine, isoprenoid, methionine, riboflavin, coenzyme A, and folate, as well as other biosynthetic pathways. When evaluating the iDM2014 using a set of measured intracellular intermediate and cofactor mass isotopomer distributions (MIDs), it was found that a total of 232 net fluxes of central and peripheral metabolism could be resolved in the E. coli network. The increase in scope was shown to cover the full biosynthetic route to an expanded set of bioproduction pathways, which should facilitate applications such as the design of more complex bioprocessing strains and aid in identifying new antimicrobials. Importantly, it was found that there was no loss in precision of core fluxes when compared to a traditional core model, and additionally there was an overall increase in precision when considering all observable reactions. PMID:26981784

  2. Genome-scale phylogenetic function annotation of large and diverse protein families

    PubMed Central

    Engelhardt, Barbara E.; Jordan, Michael I.; Srouji, John R.; Brenner, Steven E.

    2011-01-01

    The Statistical Inference of Function Through Evolutionary Relationships (SIFTER) framework uses a statistical graphical model that applies phylogenetic principles to automate precise protein function prediction. Here we present a revised approach (SIFTER version 2.0) that enables annotations on a genomic scale. SIFTER 2.0 produces equivalently precise predictions compared to the earlier version on a carefully studied family and on a collection of 100 protein families. We have added an approximation method to SIFTER 2.0 and show a 500-fold improvement in speed with minimal impact on prediction results in the functionally diverse sulfotransferase protein family. On the Nudix protein family, previously inaccessible to the SIFTER framework because of the 66 possible molecular functions, SIFTER achieved 47.4% accuracy on experimental data (where BLAST achieved 34.0%). Finally, we used SIFTER to annotate all of the Schizosaccharomyces pombe proteins with experimental functional characterizations, based on annotations from proteins in 46 fungal genomes. SIFTER precisely predicted molecular function for 45.5% of the characterized proteins in this genome, as compared with four current function prediction methods that precisely predicted function for 62.6%, 30.6%, 6.0%, and 5.7% of these proteins. We use both precision-recall curves and ROC analyses to compare these genome-scale predictions across the different methods and to assess performance on different types of applications. SIFTER 2.0 is capable of predicting protein molecular function for large and functionally diverse protein families using an approximate statistical model, enabling phylogenetics-based protein function prediction for genome-wide analyses. The code for SIFTER and protein family data are available at http://sifter.berkeley.edu. PMID:21784873

  3. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence.

    PubMed

    de la Fuente, José; Díez-Delgado, Iratxe; Contreras, Marinela; Vicente, Joaquín; Cabezas-Cruz, Alejandro; Tobes, Raquel; Manrique, Marina; López, Vladimir; Romero, Beatriz; Bezos, Javier; Dominguez, Lucas; Sevilla, Iker A; Garrido, Joseba M; Juste, Ramón; Madico, Guillermo; Jones-López, Edward; Gortazar, Christian

    2015-11-01

    Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. 
    Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant further study to

  4. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence

    PubMed Central

    de la Fuente, José; Díez-Delgado, Iratxe; Contreras, Marinela; Vicente, Joaquín; Cabezas-Cruz, Alejandro; Tobes, Raquel; Manrique, Marina; López, Vladimir; Romero, Beatriz; Bezos, Javier; Dominguez, Lucas; Sevilla, Iker A.; Garrido, Joseba M.; Juste, Ramón; Madico, Guillermo; Jones-López, Edward; Gortazar, Christian

    2015-01-01

    Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. 
    Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant further study to

  5. Weakly Deleterious Mutations and Low Rates of Recombination Limit the Impact of Natural Selection on Bacterial Genomes