Science.gov

Sample records for bacteria comparative genomic

  1. Comparative genomics of green sulfur bacteria.

    PubMed

    Davenport, Colin; Ussery, David W; Tümmler, Burkhard

    2010-06-01

    Eleven completely sequenced Chlorobi genomes were compared with respect to oligonucleotide usage, gene content, and synteny. The green sulfur bacteria (GSB) are equipped with a core genome that sustains their anoxygenic phototrophic lifestyle by photosynthesis, sulfur oxidation, and CO2 fixation. Whole-genome gene family and single gene sequence comparisons yielded similar phylogenetic trees of the sequenced chromosomes, indicating a concerted vertical evolution of large gene sets. Chromosomal synteny of genes is not preserved in the phylum Chlorobi. The accessory genome is characterized by anomalous oligonucleotide usage and endows the strains with individual features for transport, secretion, cell wall, extracellular constituents, and a few elements of the biosynthetic apparatus. Giant genes are a peculiar feature of the genera Chlorobium and Prosthecochloris. The predicted proteins have a huge molecular weight of 10^6 and are probably instrumental for the bacteria to generate their own intimate (micro)environment.
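    As an illustration of what an oligonucleotide usage comparison starts from, the sketch below counts overlapping k-mers and converts them to relative frequencies. The abstract does not state the word length or the statistic the authors used, so the function name `kmer_frequencies`, the default `k=4`, and the toy sequence are all assumptions made for this example.

```python
from collections import Counter


def kmer_frequencies(seq, k=4):
    """Return the relative frequency of every overlapping k-mer in seq.

    Hypothetical helper: k=4 (tetranucleotides) is an assumed default,
    not a value taken from the abstract.
    """
    seq = seq.upper()
    # Slide a window of width k across the sequence, one base at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Toy sequence (invented): 5 windows, "ACGT" occurs twice.
freqs = kmer_frequencies("ACGTACGT", k=4)
```

    Regions whose k-mer profile deviates strongly from the genome-wide profile would be candidates for the "anomalous" usage the abstract attributes to the accessory genome; how that deviation is scored is left open here.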

  2. Comparative genomics of the lactic acid bacteria

    SciTech Connect

    Makarova, K.; Slesarev, A.; Wolf, Y.; Sorokin, A.; Mirkin, B.; Koonin, E.; Pavlov, A.; Pavlova, N.; Karamychev, V.; Polouchine, N.; Shakhova, V.; Grigoriev, I.; Lou, Y.; Rokhsar, D.; Lucas, S.; Huang, K.; Goodstein, D. M.; Hawkins, T.; Plengvidhya, V.; Welker, D.; Hughes, J.; Goh, Y.; Benson, A.; Baldwin, K.; Lee, J. -H.; Diaz-Muniz, I.; Dosti, B.; Smeianov, V.; Wechter, W.; Barabote, R.; Lorca, G.; Altermann, E.; Barrangou, R.; Ganesan, B.; Xie, Y.; Rawsthorne, H.; Tamir, D.; Parker, C.; Breidt, F.; Broadbent, J.; Hutkins, R.; O'Sullivan, D.; Steele, J.; Unlu, G.; Saier, M.; Klaenhammer, T.; Richardson, P.; Kozyavkin, S.; Weimer, B.; Mills, D.

    2006-06-01

    Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacteria encode a broad repertoire of transporters for efficient carbon and nitrogen acquisition from the nutritionally rich environments they inhabit and reflect a limited range of biosynthetic capabilities that indicate both prototrophic and auxotrophic strains. Phylogenetic analyses, comparison of gene content across the group, and reconstruction of ancestral gene sets indicate a combination of extensive gene loss and key gene acquisitions via horizontal gene transfer during the coevolution of lactic acid bacteria with their habitats.

  3. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    NASA Astrophysics Data System (ADS)

    Song, Jiuzhou; Ware, Tony; Liu, Shu-Lin; Surette, M.

    2004-12-01

    Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of traditional methods is strongly influenced by the software used. To overcome this problem, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We found for the first time that the level of energy spectra difference is related to the similarity of bacterial strains; it could be a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S. typhi CT18, S. typhi Ty2, S. typhimurium LT2, H. pylori 26695, and H. pylori J99.
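    The local comparison step relies on keto-excess and purine-excess plots. A minimal sketch of how such cumulative walks are commonly computed is given below, assuming the usual base classes (purines A/G versus pyrimidines C/T; keto G/T versus amino A/C). The abstract does not define the plots, and the wavelet step is not shown; the function names and the toy sequence are illustrative only.

```python
def cumulative_excess(seq, plus, minus):
    """Cumulative count of bases in `plus` minus bases in `minus`."""
    total = 0
    walk = []
    for base in seq.upper():
        if base in plus:
            total += 1
        elif base in minus:
            total -= 1
        # Ambiguous bases such as N leave the running total unchanged.
        walk.append(total)
    return walk


def purine_excess(seq):
    # Assumed definition: purines (A, G) count +1, pyrimidines (C, T) count -1.
    return cumulative_excess(seq, plus="AG", minus="CT")


def keto_excess(seq):
    # Assumed definition: keto bases (G, T) count +1, amino bases (A, C) count -1.
    return cumulative_excess(seq, plus="GT", minus="AC")


toy = "AAGGCCTT"  # invented sequence
pu = purine_excess(toy)
ke = keto_excess(toy)
```

    Plotted against position, a change in the slope of such a walk marks a change in strand composition, which is why rearranged or foreign segments can show up as kinks in the curve.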

  4. Comparative genomics of phages and prophages in lactic acid bacteria.

    PubMed

    Desiere, Frank; Lucchini, Sacha; Canchaya, Carlos; Ventura, Marco; Brüssow, Harald

    2002-08-01

    Comparative phage genomics has become possible due to the availability of more than 100 complete phage genome sequences and the development of powerful bioinformatics tools. This technology, profiting from classical molecular-biology knowledge, has opened avenues of research for topics that were difficult to address in the past. Now, it is possible to retrace part of the evolutionary history of phage modules by comparative genomics. The diagnosis of relatedness is not based on sequence similarity alone, but includes topological considerations of genome organization. Detailed transcription maps have allowed in silico predictions of genome organization to be verified and refined. This comparative knowledge is providing the basis for a new taxonomic classification concept for bacteriophages infecting low G + C-content Gram-positive bacteria based on the genetic organization of the structural gene module. An Sfi21-like and an Sfi11-like genus of Siphoviridae are proposed. The gene maps of many phages show remarkable synteny in their structural genes, defining a lambda super-group within Siphoviridae. A hierarchy of relatedness within the lambda super-group suggests elements of vertical evolution in Siphoviridae. Tailed phages are the result of both vertical and horizontal evolution and are thus fascinating objects for the study of molecular evolution. Prophage sequences integrated into the genomes of their bacterial host present theoretical challenges for evolutionary biologists. Prophages represent up to 10% of the genome in some LAB. In pathogenic streptococci prophages confer genes of selective value for the lysogenic cell. The lysogenic conversion genes are located between the lysin gene and the right phage attachment site. Non-attributed genes were found at the same genome position of prophages from lactic streptococci. These genes belong to the few prophage genes transcribed in the lysogen. Prophages from dairy bacteria might therefore also

  5. Comparative genomics of defense systems in archaea and bacteria

    PubMed Central

    Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.

    2013-01-01

    Our knowledge of prokaryotic defense systems has vastly expanded as the result of comparative genomic analysis, followed by experimental validation. This expansion is both quantitative, including the discovery of diverse new examples of known types of defense systems, such as restriction-modification or toxin-antitoxin systems, and qualitative, including the discovery of fundamentally new defense mechanisms, such as the CRISPR-Cas immunity system. Large-scale statistical analysis reveals that the distribution of different defense systems in bacterial and archaeal taxa is non-uniform, with four groups of organisms distinguishable with respect to the overall abundance and the balance between specific types of defense systems. The genes encoding defense system components in bacteria and archaea typically cluster in defense islands. In addition to genes encoding known defense systems, these islands contain numerous uncharacterized genes, which are candidates for new types of defense systems. The tight association of the genes encoding immunity systems and dormancy- or cell death-inducing defense systems in prokaryotic genomes suggests that these two major types of defense are functionally coupled, providing for effective protection at the population level. PMID:23470997

  6. Genomics of oral bacteria.

    PubMed

    Duncan, Margaret J

    2003-01-01

    Advances in bacterial genetics came with the discovery of the genetic code, followed by the development of recombinant DNA technologies. Now the field is undergoing a new revolution because of investigators' ability to sequence and assemble complete bacterial genomes. Over 200 genome projects have been completed or are in progress, and the oral microbiology research community has benefited through projects for oral bacteria and their non-oral-pathogen relatives. This review describes features of several oral bacterial genomes, and emphasizes the themes of species relationships, comparative genomics, and lateral gene transfer. Genomics is having a broad impact on basic research in microbial pathogenesis, and will lead to new approaches in clinical research and therapeutics. The oral microbiota is a unique community especially suited for new challenges to sequence the metagenomes of microbial consortia, and the genomes of uncultivable bacteria.

  7. Comparative genomics reveals 104 candidate structured RNAs from bacteria, archaea, and their metagenomes

    PubMed Central

    2010-01-01

    Background: Structured noncoding RNAs perform many functions that are essential for protein synthesis, RNA processing, and gene regulation. Structured RNAs can be detected by comparative genomics, in which homologous sequences are identified and inspected for mutations that conserve RNA secondary structure. Results: By applying a comparative genomics-based approach to genome and metagenome sequences from bacteria and archaea, we identified 104 candidate structured RNAs and inferred putative functions for many of these. Twelve candidate metabolite-binding RNAs were identified, three of which were validated, including one reported herein that binds the coenzyme S-adenosylmethionine. Newly identified cis-regulatory RNAs are implicated in photosynthesis or nitrogen regulation in cyanobacteria, purine and one-carbon metabolism, stomach infection by Helicobacter, and many other physiological processes. A candidate riboswitch termed crcB is represented in both bacteria and archaea. Another RNA motif may control gene expression from 3'-untranslated regions of mRNAs, which is unusual for bacteria. Many noncoding RNAs that likely act in trans are also revealed, and several of the noncoding RNA candidates are found mostly or exclusively in metagenome DNA sequences. Conclusions: This work greatly expands the variety of highly structured noncoding RNAs known to exist in bacteria and archaea and provides a starting point for biochemical and genetic studies needed to validate their biologic functions. Given the sustained rate of RNA discovery over several similar projects, we expect that far more structured RNAs remain to be discovered from bacterial and archaeal organisms. PMID:20230605

  8. Comparative Genomics of Interreplichore Translocations in Bacteria: A Measure of Chromosome Topology?

    PubMed Central

    Khedkar, Supriya; Seshasayee, Aswin Sai Narain

    2016-01-01

    Genomes evolve not only in base sequence but also in terms of their architecture, defined by gene organization and chromosome topology. Whereas genome sequence data inform us about the changes in base sequences for a large variety of organisms, the study of chromosome topology is restricted to a few model organisms studied using microscopy and chromosome conformation capture techniques. Here, we exploit whole genome sequence data to study the link between gene organization and chromosome topology in bacteria. Using comparative genomics across ∼250 pairs of closely related bacteria we show that: (a) many organisms show a high degree of interreplichore translocations throughout the chromosome and not limited to the inversion-prone terminus (ter) or the origin of replication (oriC); (b) translocation maps may reflect chromosome topologies; and (c) symmetric interreplichore translocations do not disrupt the distance of a gene from oriC or affect gene expression states or strand biases in gene densities. In summary, we suggest that translocation maps might be a first line in defining a gross chromosome topology given a pair of closely related genome sequences. PMID:27172194
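    Point (c) can be illustrated with a small geometric sketch. It assumes a circular chromosome with a single origin and treats a "symmetric" interreplichore translocation as a move to the mirror-image position across the oriC-ter axis; that reading, the function names, and all coordinates below are assumptions for illustration, not details taken from the abstract.

```python
def distance_from_ori(pos, ori, length):
    """Shortest distance along a circular chromosome from pos to oriC."""
    d = abs(pos - ori) % length
    return min(d, length - d)


def mirror_across_ori(pos, ori, length):
    """Position on the other replichore at the same distance from oriC."""
    return (2 * ori - pos) % length


# Hypothetical chromosome of 1000 units with oriC at coordinate 100.
LENGTH, ORI = 1000, 100
gene = 250

symmetric_target = mirror_across_ori(gene, ORI, LENGTH)  # other replichore
asymmetric_target = 700                                  # arbitrary position

before = distance_from_ori(gene, ORI, LENGTH)
after_symmetric = distance_from_ori(symmetric_target, ORI, LENGTH)
after_asymmetric = distance_from_ori(asymmetric_target, ORI, LENGTH)
```

    Under this model the symmetric move leaves the oriC distance unchanged while the arbitrary move does not, which is consistent with the abstract's statement that symmetric translocations do not disrupt a gene's distance from oriC.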

  9. Comparative Genomics of Interreplichore Translocations in Bacteria: A Measure of Chromosome Topology?

    PubMed

    Khedkar, Supriya; Seshasayee, Aswin Sai Narain

    2016-06-01

    Genomes evolve not only in base sequence but also in terms of their architecture, defined by gene organization and chromosome topology. Whereas genome sequence data inform us about the changes in base sequences for a large variety of organisms, the study of chromosome topology is restricted to a few model organisms studied using microscopy and chromosome conformation capture techniques. Here, we exploit whole genome sequence data to study the link between gene organization and chromosome topology in bacteria. Using comparative genomics across ∼250 pairs of closely related bacteria we show that: (a) many organisms show a high degree of interreplichore translocations throughout the chromosome and not limited to the inversion-prone terminus (ter) or the origin of replication (oriC); (b) translocation maps may reflect chromosome topologies; and (c) symmetric interreplichore translocations do not disrupt the distance of a gene from oriC or affect gene expression states or strand biases in gene densities. In summary, we suggest that translocation maps might be a first line in defining a gross chromosome topology given a pair of closely related genome sequences.

  10. Probing the diversity of chloromethane-degrading bacteria by comparative genomics and isotopic fractionation.

    PubMed

    Nadalig, Thierry; Greule, Markus; Bringel, Françoise; Keppler, Frank; Vuilleumier, Stéphane

    2014-01-01

    Chloromethane (CH3Cl) is produced on earth by a variety of abiotic and biological processes. It is the most important halogenated trace gas in the atmosphere, where it contributes to ozone destruction. Current estimates of the global CH3Cl budget are uncertain and suggest that microorganisms might play a more important role in degrading atmospheric CH3Cl than previously thought. Its degradation by bacteria has been demonstrated in marine, terrestrial, and phyllospheric environments. Improving our knowledge of these degradation processes and their magnitude is thus highly relevant for a better understanding of the global budget of CH3Cl. The cmu pathway, for chloromethane utilisation, is the only microbial pathway for CH3Cl degradation elucidated so far, and was characterized in detail in aerobic methylotrophic Alphaproteobacteria. Here, we reveal the potential of using a two-pronged approach involving a combination of comparative genomics and isotopic fractionation during CH3Cl degradation to newly address the question of the diversity of chloromethane-degrading bacteria in the environment. Analysis of available bacterial genome sequences reveals that several bacteria not yet known to degrade CH3Cl contain part or all of the complement of cmu genes required for CH3Cl degradation. These organisms, unlike bacteria shown to grow with CH3Cl using the cmu pathway, are obligate anaerobes. On the other hand, analysis of the complete genome of the chloromethane-degrading bacterium Leisingera methylohalidivorans MB2 showed that this bacterium does not contain cmu genes. Isotope fractionation experiments with L. methylohalidivorans MB2 suggest that the unknown pathway used by this bacterium for growth with CH3Cl can be differentiated from the cmu pathway. This result opens the prospect that contributions from bacteria with the cmu and Leisingera-type pathways to the atmospheric CH3Cl budget may be teased apart in the future.

  11. [Comparative genomics and evolutionary analysis of CRISPR loci in acetic acid bacteria].

    PubMed

    Kai, Xia; Xinle, Liang; Yudong, Li

    2015-12-01

    The clustered regularly interspaced short palindromic repeat (CRISPR) system is a widespread adaptive immunity system that protects most archaea and many bacteria against foreign DNA, such as phages, viruses and plasmids. In general, a CRISPR system consists of direct repeats, a leader, spacers and CRISPR-associated sequences. Acetic acid bacteria (AAB) play an important role in industrial fermentation of vinegar and bioelectrochemistry. To investigate the polymorphism and evolution pattern of CRISPR loci in acetic acid bacteria, bioinformatic analyses were performed on 48 species from three main genera (Acetobacter, Gluconacetobacter and Gluconobacter) with whole genome sequences available from the NCBI database. The results showed that the CRISPR system was present in 32 of the 48 species studied. Most of the CRISPR-Cas systems in AAB belonged to the type I CRISPR-Cas system (subtypes E and C), but the type II CRISPR-Cas system, which contains the cas9 gene, was found only in the genera Acetobacter and Gluconacetobacter. The repeat sequences of some CRISPR loci were highly conserved among species from different genera, and the leader sequences of some CRISPR loci possessed a conserved motif, which was associated with regulated promoters. Moreover, phylogenetic analysis of cas1 demonstrated that cas1 genes were suitable for classification of species. The conservation of cas1 genes was associated with that of repeat sequences among different strains, suggesting they were subjected to similar functional constraints. Moreover, the number of spacers was positively correlated with the number of prophages and insertion sequences, indicating that the acetic acid bacteria were continually invaded by new foreign DNA. The comparative analysis of CRISPR loci in acetic acid bacteria provides the basis for investigating the molecular mechanisms underlying differences in acetic acid tolerance and genome stability in acetic acid bacteria.
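    The leader, repeat, and spacer layout described above can be made concrete with a toy parser. This is only a sketch under strong assumptions: the repeat is known in advance and matches exactly, whereas real CRISPR finders tolerate mismatches and discover repeats themselves. The function `extract_spacers` and every sequence below are invented for illustration.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacers lying between consecutive exact copies of repeat.

    Splitting on the repeat yields: [leader, spacer1, ..., spacerN, trailer].
    Dropping the first and last pieces leaves the spacers.
    """
    parts = array_seq.split(repeat)
    return parts[1:-1]


# Invented sequences, not taken from any genome.
leader = "ATATAT"
repeat = "GTTTCAGA"
spacers_in = ["ACCGGTAA", "TTGGCCAT"]

# Build a toy array: leader, then repeat-spacer units, closed by a final repeat.
array_seq = leader + "".join(repeat + s for s in spacers_in) + repeat

spacers_out = extract_spacers(array_seq, repeat)
```

    Counting the recovered spacers per genome gives the kind of number the abstract correlates with prophage and insertion sequence counts.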

  12. Comparative genomics of Roseobacter clade bacteria isolated from the accessory nidamental gland of Euprymna scolopes.

    PubMed

    Collins, Andrew J; Fullmer, Matthew S; Gogarten, Johann P; Nyholm, Spencer V

    2015-01-01

    The accessory nidamental gland (ANG) of the female Hawaiian bobtail squid, Euprymna scolopes, houses a consortium of bacteria including members of the Flavobacteriales, Rhizobiales, and Verrucomicrobia but is dominated by members of the Roseobacter clade (Rhodobacterales) within the Alphaproteobacteria. These bacteria are deposited into the jelly coat of the squid's eggs; however, the function of the ANG and its bacterial symbionts has yet to be elucidated. In order to gain insight into this consortium and its potential role in host reproduction, we cultured 12 Rhodobacterales isolates from ANGs of sexually mature female squid and sequenced their genomes with Illumina sequencing technology. For taxonomic analyses, the ribosomal proteins of 79 genomes representing both roseobacters and non-roseobacters along with a separate MLSA analysis of 33 housekeeping genes from Roseobacter organisms placed all 12 isolates from the ANG within two groups of a single Roseobacter clade. Average nucleotide identity analysis suggests the ANG isolates represent three genera (Leisingera, Ruegeria, and Tateyamaria) comprising seven putative species groups. All but one of the isolates contain a predicted Type VI secretion system, which has been shown to be important in secreting signaling and/or effector molecules in host-microbe associations and in bacteria-bacteria interactions. All sequenced genomes also show potential for secondary metabolite production, and are predicted to be involved with the production of acyl homoserine lactones (AHLs) and/or siderophores. An AHL bioassay confirmed AHL production in three tested isolates and from whole ANG homogenates. The dominant symbiont, Leisingera sp. ANG1, showed greater viability in iron-limiting conditions compared to other roseobacters, possibly due to higher levels of siderophore production. Future comparisons will try to elucidate novel metabolic pathways of the ANG symbionts to understand their putative role in host development.

  13. Identification of DNA Methyltransferase Genes in Human Pathogenic Bacteria by Comparative Genomics.

    PubMed

    Brambila-Tapia, Aniel Jessica Leticia; Poot-Hernández, Augusto Cesar; Perez-Rueda, Ernesto; Rodríguez-Vázquez, Katya

    2016-06-01

    DNA methylation plays an important role in gene expression and virulence in some pathogenic bacteria. In this report, we describe DNA methyltransferases (MTases) present in human pathogenic bacteria and compare them with those of related species that are nonpathogenic or less pathogenic, based on comparative genomics. We searched the KEGG database for KEGG orthology groups associated with adenine and cytosine DNA MTase activities (EC: 2.1.1.37, EC: 2.1.1.113 and EC: 2.1.1.72) in 37 human pathogenic species and 18 non/less pathogenic relatives and compared the number of these MTase sequences according to genome size, DNA MTase type, and with their non/less pathogenic relatives. We observed that Helicobacter pylori and Neisseria spp. presented the highest number of MTases while ten different species did not present a predicted DNA MTase. We also detected a significant increase of adenine MTases over cytosine MTases (2.19 vs. 1.06, respectively, p < 0.001). Adenine MTases were the only MTases associated with restriction modification systems and DNA MTases associated with type I restriction modification systems were more numerous than those associated with type III restriction modification systems (0.84 vs. 0.17, p < 0.001); additionally, there was no correlation between genome size and the total number of DNA MTases, indicating that the number of DNA MTases is related to the particular evolution and lifestyle of specific species, regulating the expression of virulence genes in some pathogenic bacteria.

  14. Comparative genomics of Roseobacter clade bacteria isolated from the accessory nidamental gland of Euprymna scolopes

    PubMed Central

    Collins, Andrew J.; Fullmer, Matthew S.; Gogarten, Johann P.; Nyholm, Spencer V.

    2015-01-01

    The accessory nidamental gland (ANG) of the female Hawaiian bobtail squid, Euprymna scolopes, houses a consortium of bacteria including members of the Flavobacteriales, Rhizobiales, and Verrucomicrobia but is dominated by members of the Roseobacter clade (Rhodobacterales) within the Alphaproteobacteria. These bacteria are deposited into the jelly coat of the squid’s eggs; however, the function of the ANG and its bacterial symbionts has yet to be elucidated. In order to gain insight into this consortium and its potential role in host reproduction, we cultured 12 Rhodobacterales isolates from ANGs of sexually mature female squid and sequenced their genomes with Illumina sequencing technology. For taxonomic analyses, the ribosomal proteins of 79 genomes representing both roseobacters and non-roseobacters along with a separate MLSA analysis of 33 housekeeping genes from Roseobacter organisms placed all 12 isolates from the ANG within two groups of a single Roseobacter clade. Average nucleotide identity analysis suggests the ANG isolates represent three genera (Leisingera, Ruegeria, and Tateyamaria) comprising seven putative species groups. All but one of the isolates contain a predicted Type VI secretion system, which has been shown to be important in secreting signaling and/or effector molecules in host–microbe associations and in bacteria–bacteria interactions. All sequenced genomes also show potential for secondary metabolite production, and are predicted to be involved with the production of acyl homoserine lactones (AHLs) and/or siderophores. An AHL bioassay confirmed AHL production in three tested isolates and from whole ANG homogenates. The dominant symbiont, Leisingera sp. ANG1, showed greater viability in iron-limiting conditions compared to other roseobacters, possibly due to higher levels of siderophore production. Future comparisons will try to elucidate novel metabolic pathways of the ANG symbionts to understand their putative role in host

  15. Comparative evaluation of the genomes of three common Drosophila-associated bacteria

    PubMed Central

    Petkau, Kristina; Fast, David; Duggal, Aashna

    2016-01-01

    Drosophila melanogaster is an excellent model to explore the molecular exchanges that occur between an animal intestine and associated microbes. Previous studies in Drosophila uncovered a sophisticated web of host responses to intestinal bacteria. The outcomes of these responses define critical events in the host, such as the establishment of immune responses, access to nutrients, and the rate of larval development. Despite our steady march towards illuminating the host machinery that responds to bacterial presence in the gut, there are significant gaps in our understanding of the microbial products that influence bacterial association with a fly host. We sequenced and characterized the genomes of three common Drosophila-associated microbes: Lactobacillus plantarum, Lactobacillus brevis and Acetobacter pasteurianus. For each species, we compared the genomes of Drosophila-associated strains to the genomes of strains isolated from alternative sources. We found that environmental Lactobacillus strains readily associated with adult Drosophila and were similar to fly isolates in terms of genome organization. In contrast, we identified a strain of A. pasteurianus that apparently fails to associate with adult Drosophila due to an inability to grow on fly nutrient food. Comparisons between association competent and incompetent A. pasteurianus strains identified a short list of candidate genes that may contribute to survival on fly medium. Many of the gene products unique to fly-associated strains have established roles in the stabilization of host-microbe interactions. These data add to a growing body of literature that examines the microbial perspective of host-microbe relationships. PMID:27493201

  16. Comparative Genomics of Syntrophic Branched-Chain Fatty Acid Degrading Bacteria

    PubMed Central

    Narihiro, Takashi; Nobu, Masaru K.; Tamaki, Hideyuki; Kamagata, Yoichi; Sekiguchi, Yuji; Liu, Wen-Tso

    2016-01-01

    The syntrophic degradation of branched-chain fatty acids (BCFAs) such as 2-methylbutyrate and isobutyrate is an essential step in the production of methane from proteins/amino acids in anaerobic ecosystems. While a few syntrophic BCFA-degrading bacteria have been isolated, their metabolic pathways in BCFA and short-chain fatty acid (SCFA) degradation as well as energy conservation systems remain unclear. In an attempt to identify these pathways, we herein performed comparative genomics of three syntrophic bacteria: 2-methylbutyrate-degrading “Syntrophomonas wolfei subsp. methylbutyratica” strain JCM 14075T (=4J5T), isobutyrate-degrading Syntrophothermus lipocalidus strain TGB-C1T, and non-BCFA-metabolizing S. wolfei subsp. wolfei strain GöttingenT. We demonstrated that 4J5 and TGB-C1 both encode multiple genes/gene clusters involved in β-oxidation, as observed in the Göttingen genome, which has multiple copies of genes associated with butyrate degradation. The 4J5 genome possesses phylogenetically distinct β-oxidation genes, which may be involved in 2-methylbutyrate degradation. In addition, these Syntrophomonadaceae strains harbor various hydrogen/formate generation systems (i.e., electron-bifurcating hydrogenase, formate dehydrogenase, and membrane-bound hydrogenase) and energy-conserving electron transport systems, including electron transfer flavoprotein (ETF)-linked acyl-CoA dehydrogenase, ETF-linked iron-sulfur binding reductase, ETF dehydrogenase (FixABCX), and flavin oxidoreductase-heterodisulfide reductase (Flox-Hdr). Unexpectedly, the TGB-C1 genome encodes a nitrogenase complex, which may function as an alternative H2 generation mechanism. These results suggest that the BCFA-degrading syntrophic strains 4J5 and TGB-C1 possess specific β-oxidation-related enzymes for BCFA oxidation as well as appropriate energy conservation systems to perform thermodynamically unfavorable syntrophic metabolism. PMID:27431485

  17. Comparative genomics of pyridoxal 5′-phosphate-dependent transcription factor regulons in Bacteria

    PubMed Central

    Suvorova, Inna A.

    2016-01-01

    The MocR-subfamily transcription factors (MocR-TFs) characterized by the GntR-family DNA-binding domain and aminotransferase-like sensory domain are broadly distributed among certain lineages of Bacteria. Characterized MocR-TFs bind pyridoxal 5′-phosphate (PLP) and control transcription of genes involved in PLP, gamma aminobutyric acid (GABA) and taurine metabolism via binding specific DNA operator sites. To identify putative target genes and DNA binding motifs of MocR-TFs, we performed comparative genomics analysis of over 250 bacterial genomes. The reconstructed regulons for 825 MocR-TFs comprise structural genes from over 200 protein families involved in diverse biological processes. Using the genome context and metabolic subsystem analysis we tentatively assigned functional roles for 38 out of 86 orthologous groups of studied regulators. Most of these MocR-TF regulons are involved in PLP metabolism, as well as utilization of GABA, taurine and ectoine. The remaining studied MocR-TF regulators presumably control genes encoding enzymes involved in reduction/oxidation processes, various transporters and PLP-dependent enzymes, for example aminotransferases. Predicted DNA binding motifs of MocR-TFs are generally similar in each orthologous group and are characterized by two to four repeated sequences. Identified motifs were classified according to their structures. Motifs with direct and/or inverted repeat symmetry constitute the majority of inferred DNA motifs, suggesting preferable TF dimerization in head-to-tail or head-to-head configuration. The obtained genomic collection of in silico reconstructed MocR-TF motifs and regulons in Bacteria provides a basis for future experimental characterization of molecular mechanisms for various regulators in this family. PMID:28348826
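    The distinction between direct and inverted repeat symmetry can be checked mechanically. The sketch below handles only the simplest case, two half-sites with at most one central base between them, whereas the abstract reports motifs with two to four repeats; the function names and the example sites are hypothetical and are not MocR-TF motifs from the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def repeat_symmetry(site):
    """Classify a site by comparing its two halves.

    Returns "direct" if the halves are identical, "inverted" if one half is
    the reverse complement of the other, otherwise "none". A central base in
    an odd-length site is ignored. Sites that satisfy both tests (for example
    "ATAT") are reported as "direct" because that test runs first.
    """
    site = site.upper()
    half = len(site) // 2
    left, right = site[:half], site[len(site) - half:]
    if left == right:
        return "direct"
    if left == reverse_complement(right):
        return "inverted"
    return "none"


# Invented example sites.
kinds = {s: repeat_symmetry(s) for s in ("TGTACA", "TGTTGT", "ACGAAC")}
```

    An inverted repeat reads the same on both strands, which fits two protein subunits facing each other, while a direct repeat fits subunits arranged in tandem; this is the link the abstract draws between motif symmetry and dimer configuration.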

  18. Assessment of transfer methods for comparative genomics of regulatory networks in bacteria.

    PubMed

    Kılıç, Sefa; Erill, Ivan

    2016-08-31

    Comparative genomics can leverage the vast amount of available genomic sequences to reconstruct and analyze transcriptional regulatory networks in Bacteria, but the efficacy of this approach hinges on the ability to transfer regulatory network information from reference species to the genomes under analysis. Several methods have been proposed to transfer regulatory information between bacterial species, but the paucity and distributed nature of experimental information on bacterial transcriptional networks have prevented their systematic evaluation. We report the compilation of a large catalog of transcription factor-binding sites across Bacteria and its use to systematically benchmark proposed transfer methods across pairs of bacterial species. We evaluate motif- and accuracy-based metrics to assess the results of regulatory network transfer and we identify the precision-recall area-under-the-curve as the best metric for this purpose due to the large class-imbalanced nature of the problem. Methods assuming conservation of the transcription factor-binding motif (motif-based) are shown to substantially outperform those assuming conservation of regulon composition (network-based), even though their efficiency can decrease sharply with increasing phylogenetic distance. Variations of the basic motif-based transfer method do not yield significant improvements in transfer accuracy. Our results indicate that detection of a large enough number of regulated orthologs is critical for network-based transfer methods, but that relaxing orthology requirements does not improve results. Using the transcriptional regulators LexA and Fur as case examples, we also show how DNA-binding domain sequence similarity can yield confounding results as an indicator of transfer efficiency for motif-based methods. Counter to standard practice, our evaluation of metrics to assess the efficiency of methods for regulatory network information transfer reveals that the area under precision
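    The preference for a precision-recall summary under class imbalance can be seen in a small worked example. The sketch below uses average precision, a common step-wise estimate of the area under the precision-recall curve; the abstract does not say how the authors computed their area, and the labels and scores here are invented.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision values at each true positive.

    labels: 1 for a true site, 0 for a non-site.
    scores: predicted scores, higher meaning more likely to be a site.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one positive label")
    # Rank predictions from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / positives


# Invented, imbalanced toy data: 2 true sites among 10 candidates.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

ap = average_precision(labels, scores)

# A predictor that calls everything negative is right 8 times out of 10.
trivial_accuracy = labels.count(0) / len(labels)
```

    In this toy case the do-nothing predictor already reaches an accuracy of 0.8 without finding a single site, whereas average precision only rewards ranking the true sites near the top.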

  19. A comparative genomic analysis of energy metabolism in sulfate reducing bacteria and archaea.

    PubMed

    Pereira, Inês A Cardoso; Ramos, Ana Raquel; Grein, Fabian; Marques, Marta Coimbra; da Silva, Sofia Marques; Venceslau, Sofia Santos

    2011-01-01

    The number of sequenced genomes of sulfate reducing organisms (SRO) has increased significantly in the recent years, providing an opportunity for a broader perspective into their energy metabolism. In this work we carried out a comparative survey of energy metabolism genes found in 25 available genomes of SRO. This analysis revealed a higher diversity of possible energy conserving pathways than classically considered to be present in these organisms, and permitted the identification of new proteins not known to be present in this group. The Deltaproteobacteria (and Thermodesulfovibrio yellowstonii) are characterized by a large number of cytochromes c and cytochrome c-associated membrane redox complexes, indicating that periplasmic electron transfer pathways are important in these bacteria. The Archaea and Clostridia groups contain practically no cytochromes c or associated membrane complexes. However, despite the absence of a periplasmic space, a few extracytoplasmic membrane redox proteins were detected in the Gram-positive bacteria. Several ion-translocating complexes were detected in SRO including H(+)-pyrophosphatases, complex I homologs, Rnf, and Ech/Coo hydrogenases. Furthermore, we found evidence that cytoplasmic electron bifurcating mechanisms, recently described for other anaerobes, are also likely to play an important role in energy metabolism of SRO. A number of cytoplasmic [NiFe] and [FeFe] hydrogenases, formate dehydrogenases, and heterodisulfide reductase-related proteins are likely candidates to be involved in energy coupling through electron bifurcation, from diverse electron donors such as H(2), formate, pyruvate, NAD(P)H, β-oxidation, and others. In conclusion, this analysis indicates that energy metabolism of SRO is far more versatile than previously considered, and that both chemiosmotic and flavin-based electron bifurcating mechanisms provide alternative strategies for energy conservation.

  20. A Comparative Genomic Analysis of Energy Metabolism in Sulfate Reducing Bacteria and Archaea

    PubMed Central

    Pereira, Inês A. Cardoso; Ramos, Ana Raquel; Grein, Fabian; Marques, Marta Coimbra; da Silva, Sofia Marques; Venceslau, Sofia Santos

    2011-01-01

    The number of sequenced genomes of sulfate reducing organisms (SRO) has increased significantly in the recent years, providing an opportunity for a broader perspective into their energy metabolism. In this work we carried out a comparative survey of energy metabolism genes found in 25 available genomes of SRO. This analysis revealed a higher diversity of possible energy conserving pathways than classically considered to be present in these organisms, and permitted the identification of new proteins not known to be present in this group. The Deltaproteobacteria (and Thermodesulfovibrio yellowstonii) are characterized by a large number of cytochromes c and cytochrome c-associated membrane redox complexes, indicating that periplasmic electron transfer pathways are important in these bacteria. The Archaea and Clostridia groups contain practically no cytochromes c or associated membrane complexes. However, despite the absence of a periplasmic space, a few extracytoplasmic membrane redox proteins were detected in the Gram-positive bacteria. Several ion-translocating complexes were detected in SRO including H+-pyrophosphatases, complex I homologs, Rnf, and Ech/Coo hydrogenases. Furthermore, we found evidence that cytoplasmic electron bifurcating mechanisms, recently described for other anaerobes, are also likely to play an important role in energy metabolism of SRO. A number of cytoplasmic [NiFe] and [FeFe] hydrogenases, formate dehydrogenases, and heterodisulfide reductase-related proteins are likely candidates to be involved in energy coupling through electron bifurcation, from diverse electron donors such as H2, formate, pyruvate, NAD(P)H, β-oxidation, and others. In conclusion, this analysis indicates that energy metabolism of SRO is far more versatile than previously considered, and that both chemiosmotic and flavin-based electron bifurcating mechanisms provide alternative strategies for energy conservation. PMID:21747791

  1. Comparative Genomic Insights into Ecophysiology of Neutrophilic, Microaerophilic Iron Oxidizing Bacteria

    PubMed Central

    Kato, Shingo; Ohkuma, Moriya; Powell, Deborah H.; Krepski, Sean T.; Oshima, Kenshiro; Hattori, Masahira; Shapiro, Nicole; Woyke, Tanja; Chan, Clara S.

    2015-01-01

    Neutrophilic microaerophilic iron-oxidizing bacteria (FeOB) are thought to play a significant role in cycling of carbon, iron and associated elements in both freshwater and marine iron-rich environments. However, the roles of the neutrophilic microaerophilic FeOB are still poorly understood due largely to the difficulty of cultivation and lack of functional gene markers. Here, we analyze the genomes of two freshwater neutrophilic microaerophilic stalk-forming FeOB, Ferriphaselus amnicola OYT1 and Ferriphaselus strain R-1. Phylogenetic analyses confirm that these are distinct species within Betaproteobacteria; we describe strain R-1 and propose the name F. globulitus. We compare the genomes to those of two freshwater Betaproteobacterial and three marine Zetaproteobacterial FeOB isolates in order to look for mechanisms common to all FeOB, or just stalk-forming FeOB. The OYT1 and R-1 genomes both contain homologs to cyc2, which encodes a protein that has been shown to oxidize Fe in the acidophilic FeOB, Acidithiobacillus ferrooxidans. This c-type cytochrome is common to all seven microaerophilic FeOB isolates, strengthening the case for its common utility in the Fe oxidation pathway. In contrast, the OYT1 and R-1 genomes lack mto genes found in other freshwater FeOB. OYT1 and R-1 both have genes that suggest they can oxidize sulfur species. Both have the genes necessary to fix carbon by the Calvin–Benson–Bassham pathway, while only OYT1 has the genes necessary to fix nitrogen. The stalk-forming FeOB share xag genes that may help form the polysaccharide structure of stalks. Both OYT1 and R-1 make a novel biomineralization structure, short rod-shaped Fe oxyhydroxides much smaller than their stalks; these oxides are constantly shed, and may be a vector for C, P, and metal transport to downstream environments. Our results show that while different FeOB are adapted to particular niches, freshwater and marine FeOB likely share common mechanisms for Fe oxidation electron

  2. Comparative genomic insights into ecophysiology of neutrophilic, microaerophilic iron oxidizing bacteria

    SciTech Connect

    Kato, Shingo; Ohkuma, Moriya; Powell, Deborah H.; Krepski, Sean T.; Oshima, Kenshiro; Hattori, Masahira; Shapiro, Nicole; Woyke, Tanja; Chan, Clara S.

    2015-11-13

    Neutrophilic microaerophilic iron-oxidizing bacteria (FeOB) are thought to play a significant role in cycling of carbon, iron and associated elements in both freshwater and marine iron-rich environments. However, the roles of the neutrophilic microaerophilic FeOB are still poorly understood due largely to the difficulty of cultivation and lack of functional gene markers. Here, we analyze the genomes of two freshwater neutrophilic microaerophilic stalk-forming FeOB, Ferriphaselus amnicola OYT1 and Ferriphaselus strain R-1. Phylogenetic analyses confirm that these are distinct species within Betaproteobacteria; we describe strain R-1 and propose the name F. globulitus. We compare the genomes to those of two freshwater Betaproteobacterial and three marine Zetaproteobacterial FeOB isolates in order to look for mechanisms common to all FeOB, or just stalk-forming FeOB. The OYT1 and R-1 genomes both contain homologs to cyc2, which encodes a protein that has been shown to oxidize Fe in the acidophilic FeOB, Acidithiobacillus ferrooxidans. This c-type cytochrome is common to all seven microaerophilic FeOB isolates, strengthening the case for its common utility in the Fe oxidation pathway. In contrast, the OYT1 and R-1 genomes lack mto genes found in other freshwater FeOB. OYT1 and R-1 both have genes that suggest they can oxidize sulfur species. Both have the genes necessary to fix carbon by the Calvin–Benson–Bassham pathway, while only OYT1 has the genes necessary to fix nitrogen. The stalk-forming FeOB share xag genes that may help form the polysaccharide structure of stalks. Both OYT1 and R-1 make a novel biomineralization structure, short rod-shaped Fe oxyhydroxides much smaller than their stalks; these oxides are constantly shed, and may be a vector for C, P, and metal transport to downstream environments. Lastly, our results show that while different FeOB are adapted to particular niches, freshwater and marine FeOB likely share

  3. Comparative genomic insights into ecophysiology of neutrophilic, microaerophilic iron oxidizing bacteria

    DOE PAGES

    Kato, Shingo; Ohkuma, Moriya; Powell, Deborah H.; ...

    2015-11-13

    Neutrophilic microaerophilic iron-oxidizing bacteria (FeOB) are thought to play a significant role in cycling of carbon, iron and associated elements in both freshwater and marine iron-rich environments. However, the roles of the neutrophilic microaerophilic FeOB are still poorly understood due largely to the difficulty of cultivation and lack of functional gene markers. Here, we analyze the genomes of two freshwater neutrophilic microaerophilic stalk-forming FeOB, Ferriphaselus amnicola OYT1 and Ferriphaselus strain R-1. Phylogenetic analyses confirm that these are distinct species within Betaproteobacteria; we describe strain R-1 and propose the name F. globulitus. We compare the genomes to those of two freshwater Betaproteobacterial and three marine Zetaproteobacterial FeOB isolates in order to look for mechanisms common to all FeOB, or just stalk-forming FeOB. The OYT1 and R-1 genomes both contain homologs to cyc2, which encodes a protein that has been shown to oxidize Fe in the acidophilic FeOB, Acidithiobacillus ferrooxidans. This c-type cytochrome is common to all seven microaerophilic FeOB isolates, strengthening the case for its common utility in the Fe oxidation pathway. In contrast, the OYT1 and R-1 genomes lack mto genes found in other freshwater FeOB. OYT1 and R-1 both have genes that suggest they can oxidize sulfur species. Both have the genes necessary to fix carbon by the Calvin–Benson–Bassham pathway, while only OYT1 has the genes necessary to fix nitrogen. The stalk-forming FeOB share xag genes that may help form the polysaccharide structure of stalks. Both OYT1 and R-1 make a novel biomineralization structure, short rod-shaped Fe oxyhydroxides much smaller than their stalks; these oxides are constantly shed, and may be a vector for C, P, and metal transport to downstream environments. Lastly, our results show that while different FeOB are adapted to particular niches, freshwater and marine FeOB likely share common mechanisms for Fe

  4. Transport capabilities of eleven gram-positive bacteria: comparative genomic analyses.

    PubMed

    Lorca, Graciela L; Barabote, Ravi D; Zlotopolski, Vladimir; Tran, Can; Winnen, Brit; Hvorup, Rikki N; Stonestrom, Aaron J; Nguyen, Elizabeth; Huang, Li-Wen; Kim, David S; Saier, Milton H

    2007-06-01

    The genomes of eleven Gram-positive bacteria that are important for human health and the food industry, nine low G+C lactic acid bacteria and two high G+C Gram-positive organisms, were analyzed for their complement of genes encoding transport proteins. Thirteen to 18% of their genes encode transport proteins, larger percentages than observed for most other bacteria. All of these bacteria possess channel proteins, some of which probably function to relieve osmotic stress. Amino acid uptake systems predominate over sugar and peptide cation symporters, and of the sugar uptake porters, those specific for oligosaccharides and glycosides often outnumber those for free sugars. About 10% of the total transport proteins are constituents of putative multidrug efflux pumps with Major Facilitator Superfamily (MFS)-type pumps (55%) being more prevalent than ATP-binding cassette (ABC)-type pumps (33%), which, however, usually greatly outnumber all other types. An exception to this generalization is Streptococcus thermophilus with 54% of its drug efflux pumps belonging to the ABC superfamily and 23% belonging each to the Multidrug/Oligosaccharide/Polysaccharide (MOP) superfamily and the MFS. These bacteria also display peptide efflux pumps that may function in intercellular signalling, and macromolecular efflux pumps, many of predictable specificities. Most of the bacteria analyzed have no pmf-coupled or transmembrane flow electron carriers. The one exception is Brevibacterium linens, which in addition to these carriers, also has transporters of several families not represented in the other ten bacteria examined. Comparisons with the genomes of organisms from other bacterial kingdoms revealed that lactic acid bacteria possess distinctive proportions of recognized transporter types (e.g., more porters specific for glycosides than reducing sugars). Some homologues of transporters identified had previously been identified only in Gram-negative bacteria or in eukaryotes. Our studies

  5. Transport Capabilities of Eleven Gram-positive Bacteria: Comparative Genomic Analyses

    PubMed Central

    Lorca, Graciela L.; Barabote, Ravi D.; Zlotopolski, Vladimir; Tran, Can; Winnen, Brit; Hvorup, Rikki N.; Stonestrom, Aaron J.; Nguyen, Elizabeth; Huang, Li-Wen; Kim, David S.; Saier, Milton H.

    2007-01-01

    The genomes of eleven Gram-positive bacteria that are important for human health and the food industry, nine low G+C lactic acid bacteria and two high G+C Gram-positive organisms, were analyzed for their complement of genes encoding transport proteins. Thirteen to eighteen percent of their genes encode transport proteins, larger percentages than observed for most other bacteria. All of these bacteria possess channel proteins, some of which probably function to relieve osmotic stress. Amino acid uptake systems predominate over sugar and peptide cation symporters, and of the sugar uptake porters, those specific for oligosaccharides and glycosides often outnumber those for free sugars. About 10% of the total transport proteins are constituents of putative multidrug efflux pumps with Major Facilitator Superfamily (MFS)-type pumps (55%) being more prevalent than ATP-binding cassette (ABC)-type pumps (33%), which, however, usually greatly outnumber all other types. An exception to this generalization is Streptococcus thermophilus with 54% of its drug efflux pumps belonging to the ABC superfamily and 23% belonging each to the Multidrug/Oligosaccharide/Polysaccharide (MOP) superfamily and the MFS. These bacteria also display peptide efflux pumps that may function in intercellular signalling, and macromolecular efflux pumps, many of predictable specificities. Most of the bacteria analyzed have no pmf-coupled or transmembrane flow electron carriers. The one exception is Brevibacterium linens, which in addition to these carriers, also has transporters of several families not represented in the other ten bacteria examined. Comparisons with the genomes of organisms from other bacterial kingdoms revealed that lactic acid bacteria possess distinctive proportions of recognized transporter types (e.g., more porters specific for glycosides than reducing sugars). Some homologues of transporters identified had previously been identified only in Gram-negative bacteria or in eukaryotes

  6. Comparative genomic analysis of dha regulon and related genes for anaerobic glycerol metabolism in bacteria.

    PubMed

    Sun, Jibin; van den Heuvel, Joop; Soucaille, Philippe; Qu, Yinbo; Zeng, An-Ping

    2003-01-01

    The dihydroxyacetone (dha) regulon of bacteria encodes genes for the anaerobic metabolism of glycerol. In this work, genomic data are used to analyze and compare the dha regulon and related genes in different organisms in silico with respect to gene organization, sequence similarity, and possible functions. Database searches showed that among the organisms, the genomes of which have been sequenced so far, only two, i.e., Klebsiella pneumoniae MGH 78578 and Clostridium perfringens contain a complete dha regulon bearing all known enzymes. The components and their organization in the dha regulon of these two organisms differ considerably from each other and also from the previously partially sequenced dha regulons in Citrobacter freundii, Clostridium pasteurianum, and Clostridium butyricum. Unlike all of the other organisms, genes for the oxidative and reductive pathways of anaerobic glycerol metabolism in C. perfringens are located in two separate organization units on the chromosome. Comparisons of deduced protein sequences of genes with similar functions showed that the dha regulon components in K. pneumoniae and C. freundii have high similarities (80-95%) but lower similarities to those of the Clostridium species (30-80%). Interestingly, the protein sequence similarities among the dha genes of the Clostridium species are in many cases even lower than those between the Clostridium species and K. pneumoniae or C. freundii, suggesting two different types of dha regulon in the Clostridium species studied. The in silico reconstruction and comparison of dha regulons revealed several new genes in the microorganisms studied. In particular, a novel dha kinase that is phosphoenolpyruvate-dependent is identified and experimentally confirmed for K. pneumoniae in addition to the known ATP-dependent dha kinase. This finding gives new insights into the regulation of glycerol metabolism in K. pneumoniae and explains some hitherto not well understood experimental observations.

  7. Identification of 22 candidate structured RNAs in bacteria using the CMfinder comparative genomics pipeline

    PubMed Central

    Weinberg, Zasha; Barrick, Jeffrey E.; Yao, Zizhen; Roth, Adam; Kim, Jane N.; Gore, Jeremy; Wang, Joy Xin; Lee, Elaine R.; Block, Kirsten F.; Sudarsan, Narasimhan; Neph, Shane; Tompa, Martin; Ruzzo, Walter L.

    2007-01-01

    We applied a computational pipeline based on comparative genomics to bacteria, and identified 22 novel candidate RNA motifs. We predicted six to be riboswitches, which are mRNA elements that regulate gene expression on binding a specific metabolite. In separate studies, we confirmed that two of these are novel riboswitches. Three other riboswitch candidates are upstream of either a putative transporter gene in the order Lactobacillales, citric acid cycle genes in Burkholderiales or molybdenum cofactor biosynthesis genes in several phyla. The remaining riboswitch candidate, the widespread Genes for the Environment, for Membranes and for Motility (GEMM) motif, is associated with genes important for natural competence in Vibrio cholerae and the use of metal ions as electron acceptors in Geobacter sulfurreducens. Among the other motifs, one has a genetic distribution similar to a previously published candidate riboswitch, ykkC/yxkD, but has a different structure. We identified possible non-coding RNAs in five phyla, and several additional cis-regulatory RNAs, including one in ε-proteobacteria (upstream of purD, involved in purine biosynthesis), and one in Cyanobacteria (within an ATP synthase operon). These candidate RNAs add to the growing list of RNA motifs involved in multiple cellular processes, and suggest that many additional RNAs remain to be discovered. PMID:17621584

  8. Functional genomics of intracellular bacteria.

    PubMed

    de Barsy, Marie; Greub, Gilbert

    2013-07-01

    During the genomic era, a large number of whole-genome sequences accumulated, revealing many hypothetical proteins of unknown function. Functional genomics, the research domain that assigns a function to a given gene product, has thus developed rapidly. Functional genomics of intracellular pathogenic bacteria exhibits specific peculiarities due to the fastidious growth of most of these intracellular micro-organisms, the close interaction with the host cell, the risk of contamination of experiments with host cell proteins and, for some strict intracellular bacteria such as Chlamydia, the absence of a simple genetic system to manipulate the bacterial genome. To identify virulence factors of intracellular pathogenic bacteria, functional genomics often relies on bioinformatic comparisons with model organisms such as Escherichia coli and Bacillus subtilis. The use of heterologous expression is another common approach. Given the intracellular lifestyle and the many effectors that are used by the intracellular bacteria to corrupt host cell functions, functional genomics also often targets the identification of new effectors such as those of the T4SS of Brucella and Legionella.

  9. Comparative genomics analyses on EPS biosynthesis genes required for floc formation of Zoogloea resiniphila and other activated sludge bacteria.

    PubMed

    An, Weixing; Guo, Feng; Song, Yulong; Gao, Na; Bai, Shijie; Dai, Jingcheng; Wei, Hehong; Zhang, Liping; Yu, Dianzhen; Xia, Ming; Yu, Ying; Qi, Ming; Tian, Chunyuan; Chen, Haofeng; Wu, Zhenbin; Zhang, Tong; Qiu, Dongru

    2016-10-01

    The activated sludge (AS) process has been widely utilized for municipal sewage and industrial wastewater treatment. Zoogloea and its related floc-forming bacteria are required for formation of AS flocs, which is the key to gravitational effluent-and-sludge separation and AS recycling. However, little is known about the genetics, biochemistry and physiology of Zoogloea and its related bacteria. This report deals with the comparative genomic analyses of two Zoogloea resiniphila draft genomes and the closely related proteobacterial species commonly found in AS. In particular, the metabolic processes involved in removal of organic matter, nitrogen and phosphorus were analyzed. Furthermore, it is revealed that a large gene cluster, encoding eight glycosyltransferases and other proteins involved in biosynthesis and export of extracellular polysaccharides (EPS), was required for floc formation. One of the two asparagine synthase paralogues, associated with this EPS biosynthesis gene cluster, was required for floc formation in Zoogloea. Similar EPS biosynthesis gene cluster(s) were identified in the genomes of other AS proteobacteria including polyphosphate-accumulating Candidatus Accumulibacter phosphatis (CAP) and nitrifying Nitrosospira and Nitrosomonas bacteria, but the gene composition varies interspecifically and intraspecifically. Our results indicate that floc formation of desired AS bacteria, including CAP strains, facilitates their recruitment into AS and gradual enrichment via repeated AS settling and recycling processes. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Comparative genomic analysis of T-box regulatory systems in bacteria

    PubMed Central

    Vitreschak, Alexey G.; Mironov, Andrei A.; Lyubetsky, Vassily A.; Gelfand, Mikhail S.

    2008-01-01

    T-box antitermination is one of the main mechanisms of regulation of genes involved in amino acid metabolism in Gram-positive bacteria. T-box regulatory sites consist of conserved sequence and RNA secondary structure elements. Using a set of known T-box sites, we constructed the common pattern and used it to scan available bacterial genomes. New T-boxes were found in various Gram-positive bacteria, some Gram-negative bacteria (δ-proteobacteria), and some other bacterial groups (Deinococcales/Thermales, Chloroflexi, Dictyoglomi). The majority of T-box-regulated genes encode aminoacyl-tRNA synthetases. Two other groups of T-box-regulated genes are amino acid biosynthetic genes and transporters, as well as genes with unknown function. Analysis of candidate T-box sites resulted in new functional annotations. We assigned the amino acid specificity to a large number of candidate amino acid transporters and a possible function to amino acid biosynthesis genes. We then studied the evolution of the T-boxes. Analysis of the constructed phylogenetic trees demonstrated that in addition to the normal evolution consistent with the evolution of regulated genes, T-boxes may be duplicated, transferred to other genes, and change specificity. We observed several cases of recent T-box regulon expansion following the loss of a previously existing regulatory system, in particular, arginine regulon in Clostridium difficile and methionine regulon in Lactobacillaceae. Finally, we described a new structural class of T-boxes containing duplicated terminator–antiterminator elements and unusual reduced T-boxes regulating initiation of translation in the Actinobacteria. PMID:18359782

  11. Whole-genome comparative analysis of virulence genes unveils similarities and differences between endophytes and other symbiotic bacteria

    PubMed Central

    Lòpez-Fernàndez, Sebastiàn; Sonego, Paolo; Moretto, Marco; Pancher, Michael; Engelen, Kristof; Pertot, Ilaria; Campisano, Andrea

    2015-01-01

    Plant pathogens and endophytes co-exist and often interact with the host plant and within its microbial community. The outcome of these interactions may lead to healthy plants through beneficial interactions, or to disease through the inducible production of molecules known as virulence factors. Unravelling the role of virulence in endophytes may crucially improve our understanding of host-associated microbial communities and their correlation with host health. Virulence is the outcome of a complex network of interactions, and drawing the line between pathogens and endophytes has proven to be conflictive, as strain-level differences in niche overlapping, ecological interactions, state of the host's immune system and environmental factors are seldom taken into account. Defining genomic differences between endophytes and plant pathogens is decisive for understanding the boundaries between these two groups. Here we describe the major differences at the genomic level between seven grapevine endophytic test bacteria, and 12 reference strains. We describe the virulence factors detected in the genomes of the test group, as compared to endophytic and non-endophytic references, to better understand the distribution of these traits in endophytic genomes. To do this, we adopted a comparative whole-genome approach, encompassing BLAST-based searches through the GUI-based tools Mauve and BRIG as well as calculating the core and accessory genomes of three genera of enterobacteria. We outline divergences in metabolic pathways of these endophytes and reference strains, with the aid of the online platform RAST. We present a summary of the major differences that help in the drawing of the boundaries between harmless and harmful bacteria, in the spirit of contributing to a microbiological definition of endophyte. PMID:26074885
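    The core and accessory genome calculation mentioned in this abstract can be sketched with plain set operations. This is only an illustrative sketch under stated assumptions: the strain names and gene-family identifiers are hypothetical, and it presumes that genes have already been clustered into families by some upstream orthology step, which is not shown and is not necessarily how the authors proceeded.

```python
def core_and_accessory(genomes):
    """Split gene families into core (in every genome) and accessory (in some).

    `genomes` maps a strain name to the set of gene-family identifiers
    found in that strain (hypothetical input format).
    """
    if not genomes:
        raise ValueError("need at least one genome")
    family_sets = list(genomes.values())
    core = set.intersection(*family_sets)   # families shared by all strains
    pan = set.union(*family_sets)           # every family seen in any strain
    accessory = pan - core                  # families missing from >= 1 strain
    return core, accessory


# Hypothetical toy input: three strains with made-up family identifiers.
example = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2", "fam3", "fam5"},
}

core, accessory = core_and_accessory(example)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

    Representing each genome as a set keeps the definition explicit: the core genome is the intersection across strains, and the accessory genome is whatever remains of the union.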

  12. How Magnetotactic Bacteria Respond to Radiation Induced Stress and Damage: Comparative Genomics Evidences for Evolutionary Adaptation

    NASA Astrophysics Data System (ADS)

    Wang, Y.; Pan, Y.

    2015-12-01

    Solar radiation and galactic cosmic radiation are believed to be major restricting factors influencing the survival and evolution of life. On Earth, the geomagnetic field, along with the atmosphere, protects living beings from harmful radiation. During a geomagnetic reversal or excursion, however, the flux of charged particles at the Earth's surface would increase as the shielding effect of the magnetic field decreases. The stratospheric ozone can also be partially stripped away by the solar wind when the field is weak, leading to increased penetration of ultraviolet radiation to the Earth's surface. However, studies on the mechanisms of radiation-induced stress and damage have focused only on bacteria that do not respond to magnetic fields. This study was motivated by the need to fill that knowledge gap for magnetic-field-sensitive microorganisms. Magnetotactic bacteria (MTB) are a group of microbes that are able to synthesize intracellular nano-sized magnetic particles (named magnetosomes). These chain-arranged magnetosomes help MTB sense and swim along the magnetic field to find their optimal living environment efficiently. In this paper, in silico prediction of stress and damage repair genes in response to different types of radiation was carried out on the complete genomes of four non-magnetotactic and four magnetotactic spirilla. In silico analyses of the genomes of magnetic-field-sensitive and non-sensitive spirilla revealed: 1) all strains contain genes for regulating responses to superoxide and peroxide stress, DNA pyrimidine dimers, and strand breaks; 2) non-magnetotactic spirilla have more genes dealing with oxidative stress, while magnetotactic spirilla may benefit from magnetotaxis by swimming into the oxic-anoxic zone, away from oxidative stress and direct radiation damage; yet the lipid hydroperoxide peroxidase gene in MTB may be responsible for possible ROS generated by the membrane-enveloped magnetite magnetosome; 3) magnetotactic spirilla possess SOS rec

  13. Comparative genomics of freshwater Fe-oxidizing bacteria: implications for physiology, ecology, and systematics

    PubMed Central

    Emerson, David; Field, Erin K.; Chertkov, Olga; Davenport, Karen W.; Goodwin, Lynne; Munk, Christine; Nolan, Matt; Woyke, Tanja

    2013-01-01

    The two microaerophilic, Fe-oxidizing bacteria (FeOB) Sideroxydans ES-1 and Gallionella ES-2 have single circular chromosomes of 3.00 and 3.16 Mb that encode 3049 and 3006 genes, respectively. Multi-locus sequence analysis (MLSA) confirmed the relationship of these two organisms to one another, and indicated they may form a novel order, the Gallionellales, within the Betaproteobacteria. Both are adapted for chemolithoautotrophy, including pathways for CO2-fixation, and electron transport pathways adapted for growth at low O2-levels, an important adaptation for growing on Fe(II). Both genomes contain Mto-genes implicated in iron-oxidation, as well as other genes that could be involved in Fe-oxidation. Nearly 10% of their genomes are devoted to environmental sensing, signal transduction, and chemotaxis, consistent with their requirement for growing in narrow redox gradients of Fe(II) and O2. There are important differences as well. Sideroxydans ES-1 is more metabolically flexible, and can utilize reduced S-compounds, including thiosulfate, for lithotrophic growth. It has a suite of genes for nitrogen fixation. Gallionella ES-2 contains additional gene clusters for exopolysaccharide production, and has more capacity to resist heavy metals. Both strains contain genes for hemerythrins and globins, but ES-1 has an especially high number of these genes that may be involved in oxygen homeostasis, or storage. The two strains share homology with the marine FeOB Mariprofundus ferrooxydans PV-1 in CO2 fixation genes, and respiratory genes. In addition, ES-1 shares a suite of 20 potentially redox active genes with PV-1, as well as a large prophage. Combined, these genetic, morphological, and physiological differences indicate that these are two novel species, Sideroxydans lithotrophicus ES-1T (ATCC 700298T; JCM 14762; DSMZ 22444; NCMA B100), and Gallionella capsiferriformans ES-2T (ATCC 700299T; JCM 14763; DSMZ 22445; NCMA B101). PMID:24062729

  14. Comparative genomics and phylogenomic analyses of lysine riboswitch distributions in bacteria.

    PubMed

    Mukherjee, Sumit; Barash, Danny; Sengupta, Supratim

    2017-01-01

    Riboswitches are cis-regulatory elements that regulate the expression of genes involved in biosynthesis or transport of a ligand that binds to them. Among the nearly 40 classes of riboswitches discovered so far, three are known to regulate the concentration of biologically encoded amino acids glycine, lysine, and glutamine. While some comparative genomics studies of riboswitches focusing on their gross distribution across different bacterial taxa have been carried out recently, systematic functional annotation and analysis of lysine riboswitches and the genes they regulate are still lacking. We analyzed 2785 complete bacterial genome sequences to systematically identify 468 lysine riboswitches (not counting hits from multiple strains of the same species) and obtain a detailed phylogenomic map of gene-specific lysine riboswitch distribution across diverse prokaryotic phyla. We find that lysine riboswitches are most abundant in Firmicutes and Gammaproteobacteria where they are found upstream to both biosynthesis and/or transporter genes. They are relatively rare in all other prokaryotic phyla where if present they are primarily found upstream to operons containing many lysine biosynthesis genes. The genome-wide study of the genetic organisation of the lysine riboswitches shows considerable variation both within and across different Firmicute orders. Correlating the location of a riboswitch with its genomic context and its phylogenetic relationship with other evolutionarily related riboswitch-carrying species enables identification and annotation of many lysine biosynthesis, transporter and catabolic genes. It also reveals previously unknown patterns of lysine riboswitch distribution and gene/operon regulation and allows us to draw inferences about the possible point of origin of lysine riboswitches. Additionally, evidence of horizontal transfer of riboswitches was found between Firmicutes and Actinobacteria. Our analysis provides a useful resource that will lead to a

  15. Whole genome plasticity in pathogenic bacteria.

    PubMed

    Dobrindt, U; Hacker, J

    2001-10-01

    The exploitation of bacterial genome sequences has so far provided a wealth of new general information about the genetic diversity of bacteria, such as that of many pathogens. Comparative genomics uncovered many genome variations in closely related bacteria and revealed basic principles involved in bacterial diversification, improving our knowledge of the evolution of bacterial pathogens. A correlation between metabolic versatility and genome size has become evident. The degenerated life styles of obligate intracellular pathogens correlate with significantly reduced genome sizes, a phenomenon that has been termed "evolution by reduction". These mechanisms can permanently alter bacterial genotypes and result in adaptation to their environment by genome optimization. In this review, we summarize the recent results of genome-wide approaches to studying the genetic diversity of pathogenic bacteria that indicate that the acquisition of DNA and the loss of genetic information are two important mechanisms that contribute to strain-specific differences in genome content.

  16. Comparative genomics reveals new evolutionary and ecological patterns of selenium utilization in bacteria

    PubMed Central

    Peng, Ting; Lin, Jie; Xu, Yin-Zhen; Zhang, Yan

    2016-01-01

    Selenium (Se) is an important micronutrient for many organisms, which is required for the biosynthesis of selenocysteine, selenouridine and Se-containing cofactor. Several key genes involved in different Se utilization traits have been characterized; however, systematic studies on the evolution and ecological niches of Se utilization are very limited. Here, we analyzed more than 5200 sequenced organisms to examine the occurrence patterns of all Se traits in bacteria. A global species map of all Se utilization pathways has been generated, which demonstrates the most detailed understanding of Se utilization in bacteria so far. In addition, the selenophosphate synthetase gene, which is used to define the overall Se utilization, was also detected in some organisms that do not have any of the known Se traits, implying the presence of a novel Se form in this domain. Phylogenetic analyses of components of different Se utilization traits revealed new horizontal gene transfer events for each of them. Moreover, by characterizing the selenoproteomes of all organisms, we found a new selenoprotein-rich phylum and additional selenoprotein-rich species. Finally, the relationship between ecological environments and Se utilization was investigated and further verified by metagenomic analysis of environmental samples, which indicates new macroevolutionary trends of each Se utilization trait in bacteria. Our data provide insights into the general features of Se utilization in bacteria and should be useful for a further understanding of the evolutionary dynamics of Se utilization in nature. PMID:26800233

  17. Exploring Other Genomes: Bacteria.

    ERIC Educational Resources Information Center

    Flannery, Maura C.

    2001-01-01

    Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeruginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)

  19. Comparative genomics of transport proteins in developmental bacteria: Myxococcus xanthus and Streptomyces coelicolor.

    PubMed

    Getsin, Ilya; Nalbandian, Gina H; Yee, Daniel C; Vastermark, Ake; Paparoditis, Philipp C G; Reddy, Vamsee S; Saier, Milton H

    2013-12-05

    Two of the largest fully sequenced prokaryotic genomes are those of the actinobacterium, Streptomyces coelicolor (Sco), and the δ-proteobacterium, Myxococcus xanthus (Mxa), both differentiating, sporulating, antibiotic producing, soil microbes. Although the genomes of Sco and Mxa are the same size (~9 Mbp), Sco has 10% more genes that are on average 10% smaller than those in Mxa. Surprisingly, Sco has 93% more identifiable transport proteins than Mxa. This is because Sco has amplified several specific types of its transport protein genes, while Mxa has done so to a much lesser extent. Amplification is substrate- and family-specific. For example, Sco but not Mxa has amplified its voltage-gated ion channels but not its aquaporins and mechano-sensitive channels. Sco but not Mxa has also amplified drug efflux pumps of the DHA2 Family of the Major Facilitator Superfamily (MFS) (49 versus 6), amino acid transporters of the APC Family (17 versus 2), ABC-type sugar transport proteins (85 versus 6), and organic anion transporters of several families. Sco has not amplified most other types of transporters. Mxa has selectively amplified one family of macrolid exporters relative to Sco (16 versus 1), consistent with the observation that Mxa makes more macrolids than does Sco. Except for electron transport carriers, there is a poor correlation between the types of transporters found in these two organisms, suggesting that their solutions to differentiative and metabolic needs evolved independently. A number of unexpected and surprising observations are presented, and predictions are made regarding the physiological functions of recognizable transporters as well as the existence of yet to be discovered transport systems in these two important model organisms and their relatives. The results provide insight into the evolutionary processes by which two dissimilar prokaryotes evolved complexity, particularly through selective chromosomal gene amplification.

  20. Comparative genomics of transport proteins in developmental bacteria: Myxococcus xanthus and Streptomyces coelicolor

    PubMed Central

    2013-01-01

    Background Two of the largest fully sequenced prokaryotic genomes are those of the actinobacterium, Streptomyces coelicolor (Sco), and the δ-proteobacterium, Myxococcus xanthus (Mxa), both differentiating, sporulating, antibiotic producing, soil microbes. Although the genomes of Sco and Mxa are the same size (~9 Mbp), Sco has 10% more genes that are on average 10% smaller than those in Mxa. Results Surprisingly, Sco has 93% more identifiable transport proteins than Mxa. This is because Sco has amplified several specific types of its transport protein genes, while Mxa has done so to a much lesser extent. Amplification is substrate- and family-specific. For example, Sco but not Mxa has amplified its voltage-gated ion channels but not its aquaporins and mechano-sensitive channels. Sco but not Mxa has also amplified drug efflux pumps of the DHA2 Family of the Major Facilitator Superfamily (MFS) (49 versus 6), amino acid transporters of the APC Family (17 versus 2), ABC-type sugar transport proteins (85 versus 6), and organic anion transporters of several families. Sco has not amplified most other types of transporters. Mxa has selectively amplified one family of macrolid exporters relative to Sco (16 versus 1), consistent with the observation that Mxa makes more macrolids than does Sco. Conclusions Except for electron transport carriers, there is a poor correlation between the types of transporters found in these two organisms, suggesting that their solutions to differentiative and metabolic needs evolved independently. A number of unexpected and surprising observations are presented, and predictions are made regarding the physiological functions of recognizable transporters as well as the existence of yet to be discovered transport systems in these two important model organisms and their relatives. The results provide insight into the evolutionary processes by which two dissimilar prokaryotes evolved complexity, particularly through selective chromosomal gene

  1. Genomics of Probiotic Bacteria

    NASA Astrophysics Data System (ADS)

    O'Flaherty, Sarah; Goh, Yong Jun; Klaenhammer, Todd R.

    Probiotic bacteria from the Lactobacillus and Bifidobacterium species belong to the Firmicutes and the Actinobacteria phylum, respectively. Lactobacilli are members of the lactic acid bacteria (LAB) group, a broadly defined family of microorganisms that ferment various hexoses into primarily lactic acid. Lactobacilli are typically low G + C gram-positive species which are phylogenetically diverse, with over 100 species documented to date. Bifidobacteria are heterofermentative, high G + C content bacteria with about 30 species of bifidobacteria described to date.

  2. Comparative Genomic Evidence for a Close Relationship between the Dimorphic Prosthecate Bacteria Hyphomonas neptunium and Caulobacter crescentus

    PubMed Central

    Badger, Jonathan H.; Hoover, Timothy R.; Brun, Yves V.; Weiner, Ronald M.; Laub, Michael T.; Alexandre, Gladys; Mrázek, Jan; Ren, Qinghu; Paulsen, Ian T.; Nelson, Karen E.; Khouri, Hoda M.; Radune, Diana; Sosa, Julia; Dodson, Robert J.; Sullivan, Steven A.; Rosovitz, M. J.; Madupu, Ramana; Brinkac, Lauren M.; Durkin, A. Scott; Daugherty, Sean C.; Kothari, Sagar P.; Giglio, Michelle Gwinn; Zhou, Liwei; Haft, Daniel H.; Selengut, Jeremy D.; Davidsen, Tanja M.; Yang, Qi; Zafar, Nikhat; Ward, Naomi L.

    2006-01-01

    The dimorphic prosthecate bacteria (DPB) are α-proteobacteria that reproduce in an asymmetric manner rather than by binary fission and are of interest as simple models of development. Prior to this work, the only member of this group for which genome sequence was available was the model freshwater organism Caulobacter crescentus. Here we describe the genome sequence of Hyphomonas neptunium, a marine member of the DPB that differs from C. crescentus in that H. neptunium uses its stalk as a reproductive structure. Genome analysis indicates that this organism shares more genes with C. crescentus than it does with Silicibacter pomeroyi (a closer relative according to 16S rRNA phylogeny), that it relies upon a heterotrophic strategy utilizing a wide range of substrates, that its cell cycle is likely to be regulated in a similar manner to that of C. crescentus, and that the outer membrane complements of H. neptunium and C. crescentus are remarkably similar. H. neptunium swarmer cells are highly motile via a single polar flagellum. With the exception of cheY and cheR, genes required for chemotaxis were absent in the H. neptunium genome. Consistent with this observation, H. neptunium swarmer cells did not respond to any chemotactic stimuli that were tested, which suggests that H. neptunium motility is a random dispersal mechanism for swarmer cells rather than a stimulus-controlled navigation system for locating specific environments. In addition to providing insights into bacterial development, the H. neptunium genome will provide an important resource for the study of other interesting biological processes including chromosome segregation, polar growth, and cell aging. PMID:16980487

  3. Comparative genomic evidence for a close relationship between the dimorphic prosthecate bacteria Hyphomonas neptunium and Caulobacter crescentus.

    PubMed

    Badger, Jonathan H; Hoover, Timothy R; Brun, Yves V; Weiner, Ronald M; Laub, Michael T; Alexandre, Gladys; Mrázek, Jan; Ren, Qinghu; Paulsen, Ian T; Nelson, Karen E; Khouri, Hoda M; Radune, Diana; Sosa, Julia; Dodson, Robert J; Sullivan, Steven A; Rosovitz, M J; Madupu, Ramana; Brinkac, Lauren M; Durkin, A Scott; Daugherty, Sean C; Kothari, Sagar P; Giglio, Michelle Gwinn; Zhou, Liwei; Haft, Daniel H; Selengut, Jeremy D; Davidsen, Tanja M; Yang, Qi; Zafar, Nikhat; Ward, Naomi L

    2006-10-01

    The dimorphic prosthecate bacteria (DPB) are alpha-proteobacteria that reproduce in an asymmetric manner rather than by binary fission and are of interest as simple models of development. Prior to this work, the only member of this group for which genome sequence was available was the model freshwater organism Caulobacter crescentus. Here we describe the genome sequence of Hyphomonas neptunium, a marine member of the DPB that differs from C. crescentus in that H. neptunium uses its stalk as a reproductive structure. Genome analysis indicates that this organism shares more genes with C. crescentus than it does with Silicibacter pomeroyi (a closer relative according to 16S rRNA phylogeny), that it relies upon a heterotrophic strategy utilizing a wide range of substrates, that its cell cycle is likely to be regulated in a similar manner to that of C. crescentus, and that the outer membrane complements of H. neptunium and C. crescentus are remarkably similar. H. neptunium swarmer cells are highly motile via a single polar flagellum. With the exception of cheY and cheR, genes required for chemotaxis were absent in the H. neptunium genome. Consistent with this observation, H. neptunium swarmer cells did not respond to any chemotactic stimuli that were tested, which suggests that H. neptunium motility is a random dispersal mechanism for swarmer cells rather than a stimulus-controlled navigation system for locating specific environments. In addition to providing insights into bacterial development, the H. neptunium genome will provide an important resource for the study of other interesting biological processes including chromosome segregation, polar growth, and cell aging.

  4. Comparative analysis of the mosaic genomes of tailed archaeal viruses and proviruses suggests common themes for virion architecture and assembly with tailed viruses of bacteria.

    PubMed

    Krupovic, Mart; Forterre, Patrick; Bamford, Dennis H

    2010-03-19

    Tailed double-stranded DNA viruses (order Caudovirales) represent the dominant morphotype among viruses infecting bacteria. Analysis and comparison of complete genome sequences of tailed bacterial viruses provided insights into their origin and evolution. Structural and genomic studies have unexpectedly revealed that tailed bacterial viruses are evolutionarily related to eukaryotic herpesviruses. Organisms from the third domain of life, Archaea, are also infected by viruses that, in their overall morphology, resemble tailed viruses of bacteria. However, high-resolution structural information is currently unavailable for any of these viruses, and only a few complete genomes have been sequenced so far. Here we identified nine proviruses that are clearly related to tailed bacterial viruses and integrated into chromosomes of species belonging to four different taxonomic orders of the Archaea. This more than doubled the number of genome sequences available for comparative studies. Our analyses indicate that highly mosaic tailed archaeal virus genomes evolve by homologous and illegitimate recombination with genomes of other viruses, by diversification, and by acquisition of cellular genes. Comparative genomics of these viruses and related proviruses revealed a set of conserved genes encoding putative proteins similar to virion assembly and maturation, as well as genome packaging proteins of tailed bacterial viruses and herpesviruses. Furthermore, fold prediction and structural modeling experiments suggest that the major capsid proteins of tailed archaeal viruses adopt the same topology as the corresponding proteins of tailed bacterial viruses and eukaryotic herpesviruses. Data presented in this study strongly support the hypothesis that tailed viruses infecting archaea share a common ancestry with tailed bacterial viruses and herpesviruses.

  5. Comparative genomics of the liberibacteral plant pathogens

    USDA-ARS's Scientific Manuscript database

    Comparative analyses of multiple Liberibacter genomes provide significant insights into the evolutionary history, genetic diversity, and phylogenetic and metabolomic capacities among pathogenic bacteria that have caused tremendous economic losses to agricultural crops. In addition, genomic analyses ...

  6. Freshwater bacterial lifestyles inferred from comparative genomics.

    PubMed

    Livermore, Joshua A; Emrich, Scott J; Tan, John; Jones, Stuart E

    2014-03-01

    While micro-organisms actively mediate and participate in freshwater ecosystem services, we know little about freshwater microbial genetic diversity. Genome sequences are available for many bacteria from the human microbiome and the ocean (over 800 and 200, respectively), but only two freshwater genomes are currently available: the streamlined genomes of Polynucleobacter necessarius ssp. asymbioticus and the Actinobacterium AcI-B1. Here, we sequenced and analysed draft genomes of eight phylogenetically diverse freshwater bacteria exhibiting a range of lifestyle characteristics. Comparative genomics of these bacteria reveals putative freshwater bacterial lifestyles based on differences in predicted growth rate, capability to respond to environmental stimuli and diversity of useable carbon substrates. Our conceptual model based on these genomic characteristics provides a foundation on which further ecophysiological and genomic studies can be built. In addition, these genomes greatly expand the diversity of existing genomic context for future studies on the ecology and genetics of freshwater bacteria.

  7. Comparative genomics of Lactobacillus

    PubMed Central

    Kant, Ravi; Blom, Jochen; Palva, Airi; Siezen, Roland J.; de Vos, Willem M.

    2011-01-01

    Summary The genus Lactobacillus includes a diverse group of bacteria consisting of many species that are associated with fermentations of plants, meat or milk. In addition, various lactobacilli are natural inhabitants of the intestinal tract of humans and other animals. Finally, several Lactobacillus strains are marketed as probiotics as their consumption can confer a health benefit to the host. Presently, 154 Lactobacillus species are known and a growing fraction of these are subject to draft genome sequencing. However, complete genome sequences are needed to provide a platform for detailed genomic comparisons. Therefore, we selected a total of 20 genomes of various Lactobacillus strains for which complete genomic sequences have been reported. These genomes had sizes varying from 1.8 to 3.3 Mb and other characteristic features, such as G+C content that ranged from 33% to 51%. The Lactobacillus pan genome was found to consist of approximately 14 000 protein‐encoding genes while all 20 genomes shared a total of 383 sets of orthologous genes that defined the Lactobacillus core genome (LCG). Based on advanced phylogeny of the proteins encoded by this LCG, we grouped the 20 strains into three main groups and defined core group genes present in all genomes of a single group, signature group genes shared in all genomes of one group but absent in all other Lactobacillus genomes, and Group‐specific ORFans present in core group genes of one group and absent in all other complete genomes. The latter are of specific value in defining the different groups of genomes. The study provides a platform for present individual comparisons as well as future analysis of new Lactobacillus genomes. PMID:21375712
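
    The pan genome, core genome and signature group genes described in this abstract can be illustrated with plain set operations. The following is only a minimal sketch, assuming each genome has already been reduced to a set of ortholog-group identifiers; the strain names, identifiers and function names are hypothetical and are not taken from the study, which does not describe its implementation here.

```python
from typing import Dict, Iterable, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan genome, core genome) as sets of ortholog-group IDs.

    Pan genome: every group seen in at least one genome (union).
    Core genome: groups present in every genome (intersection).
    """
    if not genomes:
        return set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


def signature_genes(genomes: Dict[str, Set[str]], group: Iterable[str]) -> Set[str]:
    """Groups shared by all genomes in `group` and absent from all others.

    `group` must name at least one genome.
    """
    members = set(group)
    inside = set.intersection(*(genomes[name] for name in members))
    outside = set().union(*(g for name, g in genomes.items() if name not in members))
    return inside - outside


# Hypothetical toy data: three strains, seven ortholog groups.
genomes = {
    "strain_A": {"og1", "og2", "og3", "og7"},
    "strain_B": {"og1", "og2", "og4", "og7"},
    "strain_C": {"og1", "og2", "og5", "og6"},
}

pan, core = pan_and_core(genomes)
signature_ab = signature_genes(genomes, ["strain_A", "strain_B"])
print(len(pan), sorted(core), sorted(signature_ab))
```

    In this toy example the core genome is the pair of groups found in all three strains, and the only signature gene of the A/B group is the one shared by both strains and missing from strain C.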

  8. Comparative genomics of nematodes.

    PubMed

    Mitreva, Makedonka; Blaxter, Mark L; Bird, David M; McCarter, James P

    2005-10-01

    Recent transcriptome and genome projects have dramatically expanded the biological data available across the phylum Nematoda. Here we summarize analyses of these sequences, which have revealed multiple unexpected results. Despite a uniform body plan, nematodes are more diverse at the molecular level than was previously recognized, with many species- and group-specific novel genes. In the genus Caenorhabditis, changes in chromosome arrangement, particularly local inversions, are also rapid, with breakpoints occurring at 50-fold the rate in vertebrates. Tylenchid plant parasitic nematode genomes contain several genes closely related to genes in bacteria, implicating horizontal gene transfer events in the origins of plant parasitism. Functional genomics techniques are also moving from Caenorhabditis elegans to application throughout the phylum. Soon, eight more draft nematode genome sequences will be available. This unique resource will underpin both molecular understanding of these most abundant metazoan organisms and aid in the examination of the dynamics of genome evolution in animals.

  9. Preparation of genomic DNA from bacteria.

    PubMed

    Andreou, Lefkothea-Vasiliki

    2013-01-01

    The purpose of this protocol is the isolation of bulk cellular DNA from bacteria (alternatively see Preparation of genomic DNA from Saccharomyces cerevisiae or Isolation of Genomic DNA from Mammalian Cells protocols). Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  11. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. © The Author(s) 2016. Published by Oxford University Press.

  12. Horizontal transfer of PAH catabolism genes in Mycobacterium: evidence from comparative genomics and isolated pyrene-degrading bacteria.

    PubMed

    DeBruyn, Jennifer M; Mead, Thomas J; Sayler, Gary S

    2012-01-03

    Biodegradation of high molecular weight polycyclic aromatic hydrocarbons (PAHs), such as pyrene and benzo[a]pyrene, has only been observed in a few genera, namely fast-growing Mycobacterium and Rhodococcus. In M. vanbaalenii PYR-1, multiple aromatic ring hydroxylating dioxygenase (ARHDOs) genes including pyrene dioxygenases nidAB and nidA3B3 are localized in one genomic region. Here we examine the homologous genomic regions in four other PAH-degrading Mycobacterium (strains JLS, KMS, and MCS, and M. gilvum PYR-GCK), presenting evidence for past horizontal gene transfer events. Seven distinct types of ARHDO genes are present in all five genomes, and display conserved syntenic architecture with respect to gene order, orientation, and association with other genes. Duplications and putative integrase and transposase genes suggest past gene shuffling. To corroborate these observations, pyrene-degrading strains were isolated from two PAH-contaminated sediments: Chattanooga Creek (Tennessee) and Lake Erie (western basin). Some were related to fast-growing Mycobacterium spp. and carried both nidA and nidA3 genes. Other isolates belonged to Microbacteriaceae and Intrasporangiaceae presenting the first evidence of pyrene degradation in these families. These isolates had nidA (and some, nidA3) genes that were homologous to Mycobacterial ARHDO genes, suggesting that horizontal gene transfer events have occurred.

  13. Units of plasticity in bacterial genomes: new insight from the comparative genomics of two bacteria interacting with invertebrates, Photorhabdus and Xenorhabdus

    PubMed Central

    2010-01-01

    Background Flexible genomes facilitate bacterial evolution and are classically organized into polymorphic strain-specific segments called regions of genomic plasticity (RGPs). Using a new web tool, RGPFinder, we investigated plasticity units in bacterial genomes, by exhaustive description of the RGPs in two Photorhabdus and two Xenorhabdus strains, belonging to the Enterobacteriaceae and interacting with invertebrates (insects and nematodes). Results RGPs account for about 60% of the genome in each of the four genomes studied. We classified RGPs into genomic islands (GIs), prophages and two new classes of RGP without the features of classical mobile genetic elements (MGEs) but harboring genes encoding enzymes catalyzing DNA recombination (RGPmob), or with no remarkable feature (RGPnone). These new classes accounted for most of the RGPs and are probably hypervariable regions, ancient MGEs with degraded mobilization machinery or non canonical MGEs for which the mobility mechanism has yet to be described. We provide evidence that not only the GIs and the prophages, but also RGPmob and RGPnone, have a mosaic structure consisting of modules. A module is a block of genes, 0.5 to 60 kb in length, displaying a conserved genomic organization among the different Enterobacteriaceae. Modules are functional units involved in host/environment interactions (22-31%), metabolism (22-27%), intracellular or intercellular DNA mobility (13-30%), drug resistance (4-5%) and antibiotic synthesis (3-6%). Finally, in silico comparisons and PCR multiplex analysis indicated that these modules served as plasticity units within the bacterial genome during genome speciation and as deletion units in clonal variants of Photorhabdus. Conclusions This led us to consider the modules, rather than the entire RGP, as the true unit of plasticity in bacterial genomes, during both short-term and long-term genome evolution. PMID:20950463

  14. Horizontal gene transfer and the rock record: comparative genomics of phylogenetically distant bacteria that induce wrinkle structure formation in modern sediments.

    PubMed

    Flood, B E; Bailey, J V; Biddle, J F

    2014-03-01

    Wrinkle structures are sedimentary features that are produced primarily through the trapping and binding of siliciclastic sediments by mat-forming micro-organisms. Wrinkle structures and related sedimentary structures in the rock record are commonly interpreted to represent the stabilizing influence of cyanobacteria on sediments because cyanobacteria are known to produce similar textures and structures in modern tidal flat settings. However, other extant bacteria such as filamentous representatives of the family Beggiatoaceae can also interact with sediments to produce sedimentary features that morphologically resemble many of those associated with cyanobacteria-dominated mats. While Beggiatoa spp. and cyanobacteria are metabolically and phylogenetically distant, genomic analyses show that the two groups share hundreds of homologous genes, likely as the result of horizontal gene transfer. The comparative genomics results described here suggest that some horizontally transferred genes may code for phenotypic traits such as filament formation, chemotaxis, and the production of extracellular polymeric substances that potentially underlie the similar biostabilizing influences of these organisms on sediments. We suggest that the ecological utility of certain basic life modes such as the construction of mats and biofilms, coupled with the lateral mobility of genes in the microbial world, introduces an element of uncertainty into the inference of specific phylogenetic origins from gross morphological features preserved in the ancient rock record.

  15. Ebolavirus comparative genomics

    DOE PAGES

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; ...

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.

  16. Ebolavirus comparative genomics

    PubMed Central

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  17. Ebolavirus comparative genomics.

    PubMed

    Jun, Se-Ran; Leuze, Michael R; Nookaew, Intawat; Uberbacher, Edward C; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S; Pedersen, Thomas D; Wassenaar, Trudy M; Ussery, David W

    2015-09-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

  18. Functional genomics of lactic acid bacteria: from food to health

    PubMed Central

    2014-01-01

    Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health. PMID:25186768

  19. Functional genomics of lactic acid bacteria: from food to health.

    PubMed

    Douillard, François P; de Vos, Willem M

    2014-08-29

    Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health.

  20. Functional genomics of pathogenic bacteria.

    PubMed Central

    Moxon, E R; Hood, D W; Saunders, N J; Schweda, E K H; Richards, J C

    2002-01-01

    Microbial diseases remain the commonest cause of global mortality and morbidity. Automated-DNA sequencing has revolutionized the investigation of pathogenic microbes by making the immense fund of information contained in their genomes available at reasonable cost. The challenge is how this information can be used to increase current understanding of the biology of commensal and virulence behaviour of pathogens with particular emphasis on in vivo function and novel approaches to prevention. One example of the application of whole-genome-sequence information is afforded by investigations of the pathogenic role of Haemophilus influenzae lipopolysaccharide and its candidacy as a vaccine. PMID:11839188

  1. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  2. A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea

    PubMed Central

    Wu, Dongying; Hugenholtz, Philip; Mavromatis, Konstantinos; Pukall, Rüdiger; Dalin, Eileen; Ivanova, Natalia N.; Kunin, Victor; Goodwin, Lynne; Wu, Martin; Tindall, Brian J.; Hooper, Sean D.; Pati, Amrita; Lykidis, Athanasios; Spring, Stefan; Anderson, Iain J.; D’haeseleer, Patrik; Zemla, Adam; Singer, Mitchell; Lapidus, Alla; Nolan, Matt; Copeland, Alex; Han, Cliff; Chen, Feng; Cheng, Jan-Fang; Lucas, Susan; Kerfeld, Cheryl; Lang, Elke; Gronow, Sabine; Chain, Patrick; Bruce, David; Rubin, Edward M.; Kyrpides, Nikos C.; Klenk, Hans-Peter; Eisen, Jonathan A.

    2011-01-01

    Sequencing of bacterial and archaeal genomes has revolutionized our understanding of the many roles played by microorganisms [1]. There are now nearly 1,000 completed bacterial and archaeal genomes available [2], most of which were chosen for sequencing on the basis of their physiology. As a result, the perspective provided by the currently available genomes is limited by a highly biased phylogenetic distribution [3–5]. To explore the value added by choosing microbial genomes for sequencing on the basis of their evolutionary relationships, we have sequenced and analysed the genomes of 56 culturable species of Bacteria and Archaea selected to maximize phylogenetic coverage. Analysis of these genomes demonstrated pronounced benefits (compared to an equivalent set of genomes randomly selected from the existing database) in diverse areas including the reconstruction of phylogenetic history, the discovery of new protein families and biological properties, and the prediction of functions for known genes from other organisms. Our results strongly support the need for systematic ‘phylogenomic’ efforts to compile a phylogeny-driven ‘Genomic Encyclopedia of Bacteria and Archaea’ in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come. PMID:20033048

  3. Comparative cytotoxicity of periodontal bacteria

    SciTech Connect

    Stevens, R.H.; Hammond, B.F.

    1988-11-01

    The direct cytotoxicity of sonic extracts (SE) from nine periodontal bacteria for human gingival fibroblasts (HGF) was compared. Equivalent dosages (in terms of protein concentration) of SE were used to challenge HGF cultures. The cytotoxic potential of each SE was assessed by its ability to (1) inhibit HGF proliferation, as measured by direct cell counts; (2) inhibit 3H-thymidine incorporation in HGF cultures; or (3) cause morphological alterations of the cells in challenged cultures. The highest concentration (500 micrograms SE protein/ml) of any of the SEs used to challenge the cells was found to be markedly inhibitory to the HGFs by all three of the criteria of cytotoxicity. At the lowest dosage tested (50 micrograms SE protein/ml), only SE from Actinobacillus actinomycetemcomitans, Bacteroides gingivalis, and Fusobacterium nucleatum caused a significant effect (greater than 90% inhibition or overt morphological abnormalities) in the HGFs as determined by any of the criteria employed. SE from Capnocytophaga sputigena, Eikenella corrodens, or Wolinella recta also inhibited cell proliferation and thymidine incorporation at this dosage; however, the degree of inhibition (5-50%) was consistently and clearly less than that of the first group of three organisms named above. The SE of the three other organisms tested (Actinomyces odontolyticus, Bacteroides intermedius, and Streptococcus sanguis) had little or no effect (0-10% inhibition) at this concentration. The data suggest that the outcome of the interaction between bacterial components and normal resident cells of the periodontium is, at least in part, a function of the bacterial species.

  4. Precision genome engineering in lactic acid bacteria

    PubMed Central

    2014-01-01

    Innovative new genome engineering technologies for manipulating chromosomes have appeared in the last decade. One of these technologies, recombination mediated genetic engineering (recombineering) allows for precision DNA engineering of chromosomes and plasmids in Escherichia coli. Single-stranded DNA recombineering (SSDR) allows for the generation of subtle mutations without the need for selection and without leaving behind any foreign DNA. In this review we discuss the application of SSDR technology in lactic acid bacteria, with an emphasis on key factors that were critical to move this technology from E. coli into Lactobacillus reuteri and Lactococcus lactis. We also provide a blueprint for how to proceed if one is attempting to establish SSDR technology in a lactic acid bacterium. The emergence of CRISPR-Cas technology in genome engineering and its potential application to enhancing SSDR in lactic acid bacteria is discussed. The ability to perform precision genome engineering in medically and industrially important lactic acid bacteria will allow for the genetic improvement of strains without compromising safety. PMID:25185700

  5. Comparative genomics of Brassicaceae crops

    PubMed Central

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-01-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  6. Comparative genomics unravels metabolic differences at the species and/or strain level and extremely acidic environmental adaptation of ten bacteria belonging to the genus Acidithiobacillus.

    PubMed

    Zhang, Xian; She, Siyuan; Dong, Weiling; Niu, Jiaojiao; Xiao, Yunhua; Liang, Yili; Liu, Xueduan; Zhang, Xiaoxia; Fan, Fenliang; Yin, Huaqun

    2016-12-01

    Members of the Acidithiobacillus genus are widely found in extreme environments characterized by low pH and high concentrations of toxic substances; it is therefore necessary to identify the cellular mechanisms needed to cope with these harsh conditions. Pan-genome analysis of ten bacteria belonging to the genus Acidithiobacillus suggested the existence of a core genome, most of which was assigned to metabolism-associated genes. Additionally, Acidithiobacillus ferrooxidans had far fewer unique genes than the other species. A large proportion of Acidithiobacillus ferrivorans-specific genes mapped to metabolism-related genes, indicating that diverse metabolic pathways might confer an advantage for adaptation to local environmental conditions. Analyses of functional metabolism revealed differences in carbon metabolism, nitrogen metabolism, and sulfur metabolism at the species and/or strain level. The findings also showed that Acidithiobacillus spp. harbored specific adaptive mechanisms for thriving under extreme environments. The genus Acidithiobacillus had the genetic potential to resist and metabolize toxic substances such as heavy metals and organic solvents. Comparison across species and/or strains of Acidithiobacillus populations provided a deeper appreciation of metabolic differences and environmental adaptation, and highlighted the importance of cellular mechanisms that maintain basal physiological functions under complex acidic environmental conditions. Copyright © 2016 Elsevier GmbH. All rights reserved.

  7. Isolation and characterization of a crude oil degrading bacteria from formation water: comparative genomic analysis of environmental Ochrobactrum intermedium isolate versus clinical strains

    PubMed Central

    CHAI, Lu-jun; JIANG, Xia-wei; ZHANG, Fan; ZHENG, Bei-wen; SHU, Fu-chang; WANG, Zheng-liang; CUI, Qing-feng; DONG, Han-ping; ZHANG, Zhong-zhi; HOU, Du-jie; SHE, Yue-hui

    2015-01-01

    In this study, we isolated an environmental clone of Ochrobactrum intermedium, strain 2745-2, from the formation water of Changqing oilfield in Shanxi, China, which can degrade crude oil. Strain 2745-2 is aerobic and rod-shaped with optimum growth at 42 °C and pH 5.5. We sequenced the genome and found a single chromosome of 4 800 175 bp, with a G+C content of 57.63%. Sixty RNAs and 4737 protein-coding genes were identified: many of the genes are responsible for the degradation, emulsification, and metabolizing of crude oil. A comparative genomic analysis with related clinical strains (M86, 229E, and LMG3301T) showed that genes involved in virulence, disease, defense, phages, prophages, transposable elements, plasmids, and antibiotic resistance are also present in strain 2745-2. PMID:26465134
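
    The G+C content quoted above is the share of G and C bases among all bases of the chromosome. A minimal sketch of that calculation follows. The function name `gc_content` and the choice to skip ambiguous bases such as N are assumptions made for illustration, not details taken from the study.

```python
def gc_content(seq):
    """Return G+C content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)


# Figures reported in the abstract above: 4 800 175 bp at 57.63% G+C.
chromosome_length_bp = 4_800_175
reported_gc_percent = 57.63

# Approximate number of G or C bases implied by those two figures.
implied_gc_bases = chromosome_length_bp * reported_gc_percent / 100.0
```

    With the reported values, `implied_gc_bases` comes to roughly 2.77 million bases.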

  8. Comparative Microbial Genomics and Forensics.

    PubMed

    Massey, Steven E

    2016-08-01

    Forensic science concerns the application of scientific techniques to questions of a legal nature and may also be used to address questions of historical importance. Forensic techniques are often used in legal cases that involve crimes against persons or property, and they increasingly may involve cases of bioterrorism, crimes against nature, medical negligence, or tracing the origin of food- and crop-borne disease. Given the rapid advance of genome sequencing and comparative genomics techniques, we ask how these might be used to address cases of a forensic nature, focusing on the use of microbial genome sequence analysis. Such analyses rely on the increasingly large numbers of microbial genomes present in public databases, the ability of individual investigators to rapidly sequence whole microbial genomes, and an increasing depth of understanding of their evolution and function. Suggestions are made as to how comparative microbial genomics might be applied forensically and may represent possibilities for the future development of forensic techniques. A particular emphasis is on the nascent field of genomic epidemiology, which utilizes rapid whole-genome sequencing to identify the source and spread of infectious outbreaks. Also discussed is the application of comparative microbial genomics to the study of historical epidemics and deaths and how the approaches developed may also be applicable to more recent and actionable cases.

  9. Cloud computing for comparative genomics.

    PubMed

    Wall, Dennis P; Kudtarkar, Parul; Fusaro, Vincent A; Pivovarov, Rimma; Patil, Prasad; Tonellato, Peter J

    2010-05-18

    Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.
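
    The run figures in this abstract allow a rough per-unit cost estimate. The short calculation below is only back-of-the-envelope arithmetic on the reported numbers. It assumes the full $6,302 can be spread evenly over all processes and node-hours, which the abstract does not state, and it treats "more than 300,000" and "just under 70 hours" as exact values, so the results are bounds and not measurements.

```python
# Figures reported in the abstract above.
total_cost_usd = 6302.0
processes = 300_000   # "more than 300,000", so this is a lower bound
nodes = 100
wall_hours = 70.0     # "just under 70 hours", so this is an upper bound

# Total compute consumed if all nodes ran for the whole wall-clock time.
node_hours = nodes * wall_hours

# Upper bound on cost per process (more processes would lower it).
cost_per_process = total_cost_usd / processes

# Lower bound on cost per node-hour (fewer hours would raise it).
cost_per_node_hour = total_cost_usd / node_hours

# Lower bound on throughput per node.
processes_per_node_hour = processes / node_hours

print(f"node-hours:             {node_hours:.0f}")
print(f"cost per process:       ${cost_per_process:.4f}")
print(f"cost per node-hour:     ${cost_per_node_hour:.2f}")
print(f"processes per node-hr:  {processes_per_node_hour:.1f}")
```

    On these assumptions the run used about 7,000 node-hours, at about two cents per RSD process and about ninety cents per node-hour.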

  10. Cloud computing for comparative genomics

    PubMed Central

    2010-01-01

    Background: Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results: We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions: The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems. PMID:20482786

  11. Genomic reconstruction of transcriptional regulatory networks in lactic acid bacteria

    PubMed Central

    2013-01-01

    Background: Genome scale annotation of regulatory interactions and reconstruction of regulatory networks are the crucial problems in bacterial genomics. The Lactobacillales order of bacteria collates various microorganisms having a large economic impact, including both human and animal pathogens and strains used in the food industry. Nonetheless, no systematic genome-wide analysis of transcriptional regulation has been previously made for this taxonomic group. Results: A comparative genomics approach was used for reconstruction of transcriptional regulatory networks in 30 selected genomes of lactic acid bacteria. The inferred networks comprise regulons for 102 orthologous transcription factors (TFs), including 47 novel regulons for previously uncharacterized TFs. Numerous differences between regulatory networks of the Streptococcaceae and Lactobacillaceae groups were described on several levels. The two groups are characterized by substantially different sets of TFs encoded in their genomes. Content of the inferred regulons and structure of their cognate TF binding motifs differ for many orthologous TFs between the two groups. Multiple cases of non-orthologous displacements of TFs that control specific metabolic pathways were reported. Conclusions: The reconstructed regulatory networks substantially expand the existing knowledge of transcriptional regulation in lactic acid bacteria. In each of 30 studied genomes the obtained regulatory network contains on average 36 TFs and 250 target genes that are mostly involved in carbohydrate metabolism, stress response, metal homeostasis and amino acids biosynthesis. The inferred networks can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. All reconstructed regulons are captured within the Streptococcaceae and Lactobacillaceae collections in the RegPrecise database (http://regprecise.lbl.gov). PMID:23398941

  12. Comparative Genomics of the Cucurbitaceae

    USDA-ARS's Scientific Manuscript database

    The genome size for watermelon, melon, cucumber, and pumpkin is 425, 454, 367, and 502 Mbp, respectively, and considered medium size as compared with most other crops. Whole-genome duplication is common in angiosperm plants. Research has revealed a paleohexaploidy (γ) event in the common ancestor of...

  13. The genomic basis of trophic strategy in marine bacteria.

    PubMed

    Lauro, Federico M; McDougald, Diane; Thomas, Torsten; Williams, Timothy J; Egan, Suhelen; Rice, Scott; DeMaere, Matthew Z; Ting, Lily; Ertan, Haluk; Johnson, Justin; Ferriera, Steven; Lapidus, Alla; Anderson, Iain; Kyrpides, Nikos; Munk, A Christine; Detter, Chris; Han, Cliff S; Brown, Mark V; Robb, Frank T; Kjelleberg, Staffan; Cavicchioli, Ricardo

    2009-09-15

    Many marine bacteria have evolved to grow optimally at either high (copiotrophic) or low (oligotrophic) nutrient concentrations, enabling different species to colonize distinct trophic habitats in the oceans. Here, we compare the genome sequences of two bacteria, Photobacterium angustum S14 and Sphingopyxis alaskensis RB2256, that serve as useful model organisms for copiotrophic and oligotrophic modes of life and specifically relate the genomic features to trophic strategy for these organisms and define their molecular mechanisms of adaptation. We developed a model for predicting trophic lifestyle from genome sequence data and tested >400,000 proteins representing >500 million nucleotides of sequence data from 126 genome sequences with metagenome data of whole environmental samples. When applied to available oceanic metagenome data (e.g., the Global Ocean Survey data) the model demonstrated that oligotrophs, and not the more readily isolatable copiotrophs, dominate the ocean's free-living microbial populations. Using our model, it is now possible to define the types of bacteria that specific ocean niches are capable of sustaining.

  14. Comparative Genome Analysis of Enterobacter cloacae

    PubMed Central

    Liu, Wing-Yee; Wong, Chi-Fat; Chung, Karl Ming-Kar; Jiang, Jing-Wei; Leung, Frederick Chi-Ching

    2013-01-01

    The Enterobacter cloacae species includes an extremely diverse group of bacteria that are associated with plants, soil and humans. Publication of the complete genome sequence of the plant growth-promoting endophytic E. cloacae subsp. cloacae ENHKU01 provided an opportunity to perform the first comparative genome analysis between strains of this dynamic species. Examination of the pan-genome of E. cloacae showed that the conserved core genome retains the general physiological and survival genes of the species, while genomic factors in plasmids and variable regions determine the virulence of the human pathogenic E. cloacae strain; additionally, the diversity of fimbriae contributes to variation in colonization and host determination of different E. cloacae strains. Comparative genome analysis further illustrated that E. cloacae strains possess multiple mechanisms for antagonistic action against other microorganisms, which involve the production of siderophores and various antimicrobial compounds, such as bacteriocins, chitinases and antibiotic resistance proteins. The presence of Type VI secretion systems is expected to provide further fitness advantages for E. cloacae in microbial competition, thus allowing it to survive in different environments. Competition assays were performed to support our observations in genomic analysis, where E. cloacae subsp. cloacae ENHKU01 demonstrated antagonistic activities against a wide range of plant pathogenic fungal and bacterial species. PMID:24069314

  15. Comparative genomic, proteomic and exoproteomic analyses of three Pseudomonas strains reveals novel insights into the phosphorus scavenging capabilities of soil bacteria

    PubMed Central

    Murphy, Andrew R. J.; Scanlan, David J.; Bending, Gary D.; Jones, Alexandra M. E.; Moore, Jonathan D.; Goodall, Andrew; Hammond, John P.; Wellington, Elizabeth M. H.

    2016-01-01

    Summary: Bacteria that inhabit the rhizosphere of agricultural crops can have a beneficial effect on crop growth. One such mechanism is the microbial‐driven solubilization and remineralization of complex forms of phosphorus (P). It is known that bacteria secrete various phosphatases in response to low P conditions. However, our understanding of their global proteomic response to P stress is limited. Here, exoproteomic analysis of Pseudomonas putida BIRD‐1 (BIRD‐1), Pseudomonas fluorescens SBW25 and Pseudomonas stutzeri DSM4166 was performed in unison with whole‐cell proteomic analysis of BIRD‐1 grown under phosphate (Pi) replete and Pi deplete conditions. Comparative exoproteomics revealed marked heterogeneity in the exoproteomes of each Pseudomonas strain in response to Pi depletion. In addition to well‐characterized members of the PHO regulon such as alkaline phosphatases, several proteins, previously not associated with the response to Pi depletion, were also identified. These included putative nucleases, phosphotriesterases, putative phosphonate transporters and outer membrane proteins. Moreover, in BIRD‐1, mutagenesis of the master regulator, phoBR, led us to confirm the addition of several novel PHO‐dependent proteins. Our data expands knowledge of the Pseudomonas PHO regulon, including species that are frequently used as bioinoculants, opening up the potential for more efficient and complete use of soil complexed P. PMID:27233093

  16. Culex genome is not just another genome for comparative genomics.

    PubMed

    Reddy, B P Niranjan; Labbé, Pierrick; Corbel, Vincent

    2012-03-30

    Formal publication of the Culex genome sequence has closed the human disease vector triangle by meeting the Anopheles gambiae and Aedes aegypti genome sequences. Compared to these other mosquitoes, Culex quinquefasciatus possesses many specific hallmark characteristics, and may thus provide different angles for research which ultimately leads to a practical solution for controlling the ever increasing burden of insect-vector-borne diseases around the globe. We argue for the special importance of the genome sequence of the cosmopolitan Culex species by raising many interesting questions and discussing the potential of the Culex genome to answer them.

  17. Culex genome is not just another genome for comparative genomics

    PubMed Central

    2012-01-01

    Formal publication of the Culex genome sequence has closed the human disease vector triangle by meeting the Anopheles gambiae and Aedes aegypti genome sequences. Compared to these other mosquitoes, Culex quinquefasciatus possesses many specific hallmark characteristics, and may thus provide different angles for research which ultimately leads to a practical solution for controlling the ever increasing burden of insect-vector-borne diseases around the globe. We argue for the special importance of the genome sequence of the cosmopolitan Culex species by raising many interesting questions and discussing the potential of the Culex genome to answer them. PMID:22463777

  18. Comparative genomic analysis of the genus Enterococcus.

    PubMed

    Zhong, Zhi; Zhang, Wenyi; Song, Yuqin; Liu, Wenjun; Xu, Haiyan; Xi, Xiaoxia; Menghe, Bilige; Zhang, Heping; Sun, Zhihong

    2017-03-01

    As important lactic acid bacteria, Enterococcus species are widely used in the production of fermented food. However, as some strains of Enterococcus are opportunistic pathogens, their safety has not been generally accepted. In recent years, a large number of new species have been described and classified within the genus Enterococcus, so a better understanding of the genetic relationships and evolution of Enterococcus species is needed. In this study, the genomes of 29 type strains of Enterococcus species were sequenced. In combination with eight complete genome sequences from the GenBank database, the whole genomes of 37 strains of Enterococcus were comparatively analyzed. The average length of Enterococcus genomes was 3.20 Mb and the average GC content was 37.99%. The core- and pan-genomes were defined based on the genomes of the 37 strains of Enterococcus. The core-genome contained 605 genes, a large proportion of which were associated with carbohydrate metabolism, protein metabolism, and DNA and RNA metabolism. The phylogenetic tree showed that habitat is very important in the evolution of Enterococcus. The genetic relationships were closer in strains that come from similar habitats. According to the topology of the time tree, we found that humans and mammals may be the original hosts of Enterococcus, and then species from humans and mammals made a host-shift to plants, birds, food and other environments. However, this is only one evolutionary scenario, and more data and effort are needed to test it. The comparative genomic analysis provided a snapshot of the evolution and genetic diversity of the genus Enterococcus, which paves the way for follow-up studies on its taxonomy and functional genomics. Copyright © 2017 Elsevier GmbH. All rights reserved.
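
    The core- and pan-genome idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene-family identifiers (the abstract does not say how genes were clustered), so the core genome is simply the intersection of those sets and the pan-genome is their union. The function name, strain labels and gene names below are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def core_and_pan_genome(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan) for a mapping of strain name -> set of gene families.

    Assumption: gene families were assigned beforehand (e.g. by ortholog
    clustering), so identical identifiers mean "the same gene".
    """
    if not gene_sets:
        return set(), set()
    sets = list(gene_sets.values())
    core = set.intersection(*sets)  # genes present in every strain
    pan = set.union(*sets)          # genes present in at least one strain
    return core, pan


# Hypothetical toy input: three strains and a handful of gene families.
strains = {
    "strain_A": {"gyrB", "recA", "rpoB", "vanA"},
    "strain_B": {"gyrB", "recA", "rpoB", "bsh"},
    "strain_C": {"gyrB", "recA", "rpoB"},
}
core, pan = core_and_pan_genome(strains)
print(sorted(core))
print(sorted(pan))
```

    In this toy input the three housekeeping-style genes form the core, while the two strain-specific genes appear only in the pan-genome.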

  19. Comparative genomics for biodiversity conservation

    PubMed Central

    Grueber, Catherine E.

    2015-01-01

    Genomic approaches are gathering momentum in biology and emerging opportunities lie in the creative use of comparative molecular methods for revealing the processes that influence diversity of wildlife. However, few comparative genomic studies are performed with explicit and specific objectives to aid conservation of wild populations. Here I provide a brief overview of comparative genomic approaches that offer specific benefits to biodiversity conservation. Because conservation examples are few, I draw on research from other areas to demonstrate how comparing genomic data across taxa may be used to inform the characterisation of conservation units and studies of hybridisation, as well as studies that provide conservation outcomes from a better understanding of the drivers of divergence. A comparative approach can also provide valuable insight into the threatening processes that impact rare species, such as emerging diseases and their management in conservation. In addition to these opportunities, I note areas where additional research is warranted. Overall, comparing and contrasting the genomic composition of threatened and other species provide several useful tools for helping to preserve the molecular biodiversity of the global ecosystem. PMID:26106461

  20. Comparative Genome Mapping in Brassica

    PubMed Central

    Lagercrantz, U.; Lydiate, D. J.

    1996-01-01

    A Brassica nigra genetic linkage map was developed from a highly polymorphic cross analyzed with a set of low copy number Brassica RFLP probes. The Brassica genome is extensively duplicated with eight distinct sets of chromosomal segments, each present in three copies, covering virtually the whole genome. Thus, B. nigra could be descended from a hexaploid ancestor. A comparative analysis of B. nigra, B. oleracea and B. rapa genomes, based on maps developed using a common set of RFLP probes, was also performed. The three genomes have distinct chromosomal structures differentiated by a large number of rearrangements, but collinear regions involving virtually the whole of each of the three genomes were identified. The genic contents of B. nigra, B. oleracea and B. rapa were basically equivalent and differences in chromosome number (8, 9 and 10, respectively) are probably the result of chromosome fusions and/or fissions. The strong conservation of overall genic content across the three Brassica genomes mirrors the conservation of genic content observed over a much longer evolutionary span in cereals. However, the rate of chromosomal rearrangement in crucifers is much higher than that observed in cereal genomes. PMID:8978073

  1. Direct transfer of whole genomes from bacteria to yeast

    PubMed Central

    Karas, Bogumil J; Jablanovic, Jelena; Sun, Lijie; Ma, Li; Goldgof, Gregory M; Stam, Jason; Ramon, Adi; Manary, Micah J; Winzeler, Elizabeth A; Venter, J Craig; Weyman, Philip D; Gibson, Daniel G; Glass, John I; Hutchison, Clyde A; Smith, Hamilton O; Suzuki, Yo

    2014-01-01

    Transfer of genomes into yeast facilitates genome engineering for genetically intractable organisms, but this process has been hampered by the need for cumbersome isolation of intact genomes before transfer. Here we demonstrate direct cell-to-cell transfer of bacterial genomes as large as 1.8 megabases (Mb) into yeast under conditions that promote cell fusion. Moreover, we discovered that removal of restriction endonucleases from donor bacteria resulted in the enhancement of genome transfer. PMID:23542886

  2. Direct transfer of whole genomes from bacteria to yeast.

    PubMed

    Karas, Bogumil J; Jablanovic, Jelena; Sun, Lijie; Ma, Li; Goldgof, Gregory M; Stam, Jason; Ramon, Adi; Manary, Micah J; Winzeler, Elizabeth A; Venter, J Craig; Weyman, Philip D; Gibson, Daniel G; Glass, John I; Hutchison, Clyde A; Smith, Hamilton O; Suzuki, Yo

    2013-05-01

    Transfer of genomes into yeast facilitates genome engineering for genetically intractable organisms, but this process has been hampered by the need for cumbersome isolation of intact genomes before transfer. Here we demonstrate direct cell-to-cell transfer of bacterial genomes as large as 1.8 megabases (Mb) into yeast under conditions that promote cell fusion. Moreover, we discovered that removal of restriction endonucleases from donor bacteria resulted in the enhancement of genome transfer.

  3. Comparative genomic analyses in Asparagus.

    PubMed

    Kuhl, Joseph C; Havey, Michael J; Martin, William J; Cheung, Foo; Yuan, Qiaoping; Landherr, Lena; Hu, Yi; Leebens-Mack, James; Town, Christopher D; Sink, Kenneth C

    2005-12-01

    Garden asparagus (Asparagus officinalis L.) belongs to the monocot family Asparagaceae in the order Asparagales. Onion (Allium cepa L.) and Asparagus officinalis are 2 of the most economically important plants of the core Asparagales, a well supported monophyletic group within the Asparagales. Coding regions in onion have lower GC contents than the grasses. We compared the GC content of 3374 unique expressed sequence tags (ESTs) from A. officinalis with Lycoris longituba and onion (both members of the core Asparagales), Acorus americanus (sister to all other monocots), the grasses, and Arabidopsis. Although ESTs in A. officinalis and Acorus had a higher average GC content than Arabidopsis, Lycoris, and onion, all were clearly lower than the grasses. The Asparagaceae have the smallest nuclear genomes among all plants in the core Asparagales, which typically have huge genomes. Within the Asparagaceae, European Asparagus species have approximately twice the nuclear DNA of that of southern African Asparagus species. We cloned and sequenced 20 genomic amplicons from European A. officinalis and the southern African species Asparagus plumosus and observed no clear evidence for a recent genome doubling in A. officinalis relative to A. plumosus. These results indicate that members of the genus Asparagus with smaller genomes may be useful genomic models for plants in the core Asparagales.
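
    The GC-content comparison described in this abstract rests on a simple calculation, sketched below. This is only an illustration under stated assumptions: sequences are plain DNA strings, ambiguous bases such as N are ignored, and the function names and example sequences are hypothetical rather than taken from the study.

```python
from typing import Iterable


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0  # no countable bases; avoid division by zero
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def mean_gc_content(seqs: Iterable[str]) -> float:
    """Unweighted mean of per-sequence GC percentages (e.g. over a set of ESTs)."""
    values = [gc_content(s) for s in seqs]
    return sum(values) / len(values) if values else 0.0


# Hypothetical toy sequences, not real ESTs.
print(gc_content("ATGC"))
print(mean_gc_content(["ATGC", "GGCC"]))
```

    Averaging per sequence, as here, weights every EST equally; pooling all bases before computing the percentage would instead weight longer sequences more heavily, and the abstract does not state which convention was used.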

  4. Enhancer Identification through Comparative Genomics

    SciTech Connect

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2006-10-01

    With the availability of genomic sequence from numerous vertebrates, a paradigm shift has occurred in the identification of distant-acting gene regulatory elements. In contrast to traditional gene-centric studies in which investigators randomly scanned genomic fragments that flank genes of interest in functional assays, the modern approach begins electronically with publicly available comparative sequence datasets that provide investigators with prioritized lists of putative functional sequences based on their evolutionary conservation. However, although a large number of tools and resources are now available, application of comparative genomic approaches remains far from trivial. In particular, it requires users to dynamically consider the species and methods for comparison depending on the specific biological question under investigation. While there is currently no single general rule to this end, it is clear that when applied appropriately, comparative genomic approaches exponentially increase our power in generating biological hypotheses for subsequent experimental testing.

  5. Genome-Assisted Analysis of Dissimilatory Metal-Reducing Bacteria

    SciTech Connect

    Fredrickson, Jim K.; Romine, Margaret F.

    2005-06-01

    Whole genome sequence for Shewanella oneidensis and Geobacter sulfurreducens has provided numerous new biological insights into the function of these model dissimilatory metal-reducing bacteria. Many of the discoveries, including the identification of a high number of c-type cytochromes in both organisms, have been the result of comparative genomic analyses including several that were experimentally confirmed. Genome sequence has also aided the identification of genes important for the reduction of metal ions and other electron acceptors utilized by these organisms during anaerobic growth by facilitating the identification of genes disrupted by random insertions. Technologies for assaying global expression patterns for genes (mRNA) and proteins have also been enabled by the availability of genome sequence but their application has been limited mainly to the analysis of the role of global regulatory genes and to identifying genes expressed or repressed in response to specific electron acceptors. It is anticipated that details regarding the mechanisms of metal ion respiration, and metabolism in general, will eventually be revealed by comprehensive, systems-level analyses enabled by functional genomic analyses.

  6. Plant Comparative and Functional Genomics

    DOE PAGES

    Yang, Xiaohan; Leebens-Mack, Jim; Chen, Feng; ...

    2015-01-01

    Plants form the foundation for our global ecosystem and are essential for environmental and human health. With an increasing number of available plant genomes and tractable experimental systems, comparative and functional plant genomics research is greatly expanding our knowledge of the molecular basis of economically and nutritionally important traits in crop plants. Inferences drawn from comparative genomics are motivating experimental investigations of gene function and gene interactions. This special issue aims to highlight recent advances made in comparative and functional genomics research in plants. Nine original research articles in this special issue cover five important topics: (1) transcription factor gene families relevant to abiotic stress tolerance; (2) plant secondary metabolism; (3) transcriptome-based markers for quantitative trait loci; (4) epigenetic modifications in plant-microbe interactions; and (5) computational prediction of protein-protein interactions. Finally, the plant species studied in these articles include model species as well as nonmodel plant species of economic importance (e.g., food crops and medicinal plants).

  7. Analysis of the Core Genome and Pan-Genome of Autotrophic Acetogenic Bacteria

    PubMed Central

    Shin, Jongoh; Song, Yoseb; Jeong, Yujin; Cho, Byung-Kwan

    2016-01-01

    Acetogens are obligate anaerobic bacteria capable of reducing carbon dioxide (CO2) to multicarbon compounds coupled to the oxidation of inorganic substrates, such as hydrogen (H2) or carbon monoxide (CO), via the Wood-Ljungdahl pathway. Owing to the metabolic capability of CO2 fixation, much attention has been focused on understanding the unique pathways associated with acetogens, particularly their metabolic coupling of CO2 fixation to energy conservation. Most known acetogens are phylogenetically and metabolically diverse bacteria present in 23 different bacterial genera. With the increased volume of available genome information, acetogenic bacterial genomes can be analyzed by comparative genome analysis. Even with the genetic diversity that exists among acetogens, the Wood-Ljungdahl pathway, a central metabolic pathway, and cofactor biosynthetic pathways are highly conserved for autotrophic growth. Additionally, comparative genome analysis revealed that most genes in the acetogen-specific core genome were associated with the Wood-Ljungdahl pathway. The conserved enzymes and those predicted as missing can provide insight into biological differences between acetogens and allow for the discovery of promising candidates for industrial applications. PMID:27733845

  8. Comparative Genomics of multiple Candidatus Liberibacter asiaticus isolates reveals genetic diversity in Florida and provides clues to the evolution of the bacteria in citrus

    USDA-ARS's Scientific Manuscript database

    Understanding genetic diversity within and among the populations of an organism provides information about the potential diversity in pathogenicity and susceptibility to host defenses as well as sustainable effectiveness of control treatments. A near whole genome sequencing strategy was used to c...

  9. GenomeFingerprinter: the genome fingerprint and the universal genome fingerprint analysis for systematic comparative genomics.

    PubMed

    Ai, Yuncan; Ai, Hannan; Meng, Fanmei; Zhao, Lei

    2013-01-01

    No attention has been paid to comparing a set of genome sequences crossing genetic components and biological categories with far divergence over large size range. We define it as the systematic comparative genomics and aim to develop the methodology. First, we create a method, GenomeFingerprinter, to unambiguously produce a set of three-dimensional coordinates from a sequence, followed by one three-dimensional plot and six two-dimensional trajectory projections, to illustrate the genome fingerprint of a given genome sequence. Second, we develop a set of concepts and tools, and thereby establish a method called the universal genome fingerprint analysis (UGFA). Particularly, we define the total genetic component configuration (TGCC) (including chromosome, plasmid, and phage) for describing a strain as a systematic unit, the universal genome fingerprint map (UGFM) of TGCC for differentiating strains as a universal system, and the systematic comparative genomics (SCG) for comparing a set of genomes crossing genetic components and biological categories. Third, we construct a method of quantitative analysis to compare two genomes by using the outcome dataset of genome fingerprint analysis. Specifically, we define the geometric center and its geometric mean for a given genome fingerprint map, followed by the Euclidean distance, the differentiate rate, and the weighted differentiate rate to quantitatively describe the difference between two genomes of comparison. Moreover, we demonstrate the applications through case studies on various genome sequences, giving tremendous insights into the critical issues in microbial genomics and taxonomy. We have created a method, GenomeFingerprinter, for rapidly computing, geometrically visualizing, intuitively comparing a set of genomes at genome fingerprint level, and hence established a method called the universal genome fingerprint analysis, as well as developed a method of quantitative analysis of the outcome dataset. These have set

  10. GenomeFingerprinter: The Genome Fingerprint and the Universal Genome Fingerprint Analysis for Systematic Comparative Genomics

    PubMed Central

    Ai, Yuncan; Ai, Hannan; Meng, Fanmei; Zhao, Lei

    2013-01-01

    Background: No attention has been paid to comparing a set of genome sequences crossing genetic components and biological categories with far divergence over large size range. We define it as the systematic comparative genomics and aim to develop the methodology. Results: First, we create a method, GenomeFingerprinter, to unambiguously produce a set of three-dimensional coordinates from a sequence, followed by one three-dimensional plot and six two-dimensional trajectory projections, to illustrate the genome fingerprint of a given genome sequence. Second, we develop a set of concepts and tools, and thereby establish a method called the universal genome fingerprint analysis (UGFA). Particularly, we define the total genetic component configuration (TGCC) (including chromosome, plasmid, and phage) for describing a strain as a systematic unit, the universal genome fingerprint map (UGFM) of TGCC for differentiating strains as a universal system, and the systematic comparative genomics (SCG) for comparing a set of genomes crossing genetic components and biological categories. Third, we construct a method of quantitative analysis to compare two genomes by using the outcome dataset of genome fingerprint analysis. Specifically, we define the geometric center and its geometric mean for a given genome fingerprint map, followed by the Euclidean distance, the differentiate rate, and the weighted differentiate rate to quantitatively describe the difference between two genomes of comparison. Moreover, we demonstrate the applications through case studies on various genome sequences, giving tremendous insights into the critical issues in microbial genomics and taxonomy. Conclusions: We have created a method, GenomeFingerprinter, for rapidly computing, geometrically visualizing, intuitively comparing a set of genomes at genome fingerprint level, and hence established a method called the universal genome fingerprint analysis, as well as developed a method of quantitative analysis of the

  11. Transcription Factors Exhibit Differential Conservation in Bacteria with Reduced Genomes

    PubMed Central

    Galán-Vásquez, Edgardo; Sánchez-Osorio, Ismael; Martínez-Antonio, Agustino

    2016-01-01

    The description of transcriptional regulatory networks has been pivotal in the understanding of operating principles under which organisms respond and adapt to varying conditions. While the study of the topology and dynamics of these networks has been the subject of considerable work, the investigation of the evolution of their topology, as a result of the adaptation of organisms to different environmental conditions, has received little attention. In this work, we study the evolution of transcriptional regulatory networks in bacteria from a genome reduction perspective, which manifests itself as the loss of genes at different degrees. We used the transcriptional regulatory network of Escherichia coli as a reference to compare 113 smaller, phylogenetically-related γ-proteobacteria, including 19 genomes of symbionts. We found that the type of regulatory action exerted by transcription factors, as genomes get progressively smaller, correlates well with their degree of conservation, with dual regulators being more conserved than repressors and activators in conditions of extreme reduction. In addition, we found that the preponderant conservation of dual regulators might be due to their role as both global regulators and nucleoid-associated proteins. We summarize our results in a conceptual model of how each TF type is gradually lost as genomes become smaller and give a rationale for the order in which this phenomenon occurs. PMID:26766575

  12. Comparative Pathogenomics of Bacteria Causing Infectious Diseases in Fish

    PubMed Central

    Sudheesh, Ponnerassery S.; Al-Ghabshi, Aliya; Al-Mazrooei, Nashwa; Al-Habsi, Saoud

    2012-01-01

    Fish living in the wild as well as reared in the aquaculture facilities are susceptible to infectious diseases caused by a phylogenetically diverse collection of bacterial pathogens. Control and treatment options using vaccines and drugs are either inadequate, inefficient, or impracticable. The classical approach in studying fish bacterial pathogens has been looking at individual or few virulence factors. Recently, genome sequencing of a number of bacterial fish pathogens has tremendously increased our understanding of the biology, host adaptation, and virulence factors of these important pathogens. This paper attempts to compile the scattered literature on genome sequence information of fish pathogenic bacteria published and available to date. The genome sequencing has uncovered several complex adaptive evolutionary strategies mediated by horizontal gene transfer, insertion sequence elements, mutations and prophage sequences operating in fish pathogens, and how their genomes evolved from generalist environmental strains to highly virulent obligatory pathogens. In addition, the comparative genomics has allowed the identification of unique pathogen-specific gene clusters. The paper focuses on the comparative analysis of the virulogenomes of important fish bacterial pathogens, and the genes involved in their evolutionary adaptation to different ecological niches. The paper also proposes some new directions on finding novel vaccine and chemotherapeutic targets in the genomes of bacterial pathogens of fish. PMID:22675651

  13. Genome-scale rates of evolutionary change in bacteria

    PubMed Central

    Duchêne, Sebastian; Holt, Kathryn E.; Weill, François-Xavier; Le Hello, Simon; Hawkey, Jane; Edwards, David J.; Fourment, Mathieu

    2016-01-01

    Estimating the rates at which bacterial genomes evolve is critical to understanding major evolutionary and ecological processes such as disease emergence, long-term host–pathogen associations and short-term transmission patterns. The surge in bacterial genomic data sets provides a new opportunity to estimate these rates and reveal the factors that shape bacterial evolutionary dynamics. For many organisms estimates of evolutionary rate display an inverse association with the time-scale over which the data are sampled. However, this relationship remains unexplored in bacteria due to the difficulty in estimating genome-wide evolutionary rates, which are impacted by the extent of temporal structure in the data and the prevalence of recombination. We collected 36 whole genome sequence data sets from 16 species of bacterial pathogens to systematically estimate and compare their evolutionary rates and assess the extent of temporal structure in the absence of recombination. The majority (28/36) of data sets possessed sufficient clock-like structure to robustly estimate evolutionary rates. However, in some species reliable estimates were not possible even with ‘ancient DNA’ data sampled over many centuries, suggesting that they evolve very slowly or that they display extensive rate variation among lineages. The robustly estimated evolutionary rates spanned several orders of magnitude, from approximately 10^-5 to 10^-8 nucleotide substitutions per site per year. This variation was negatively associated with sampling time, with this relationship best described by an exponential decay curve. To avoid potential estimation biases, such time-dependency should be considered when inferring evolutionary time-scales in bacteria. PMID:28348834
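
    The exponential decay relationship between estimated rate and sampling time mentioned in this abstract can be illustrated with a small sketch. It assumes a model of the form rate = a * exp(-b * t) and fits it by ordinary least squares on the logarithm of the rate; this is one common way to fit such a curve and is not necessarily the procedure the authors used. The function name is hypothetical, and the data points are synthetic values generated from the model itself, not results from the study.

```python
import math
from typing import Sequence, Tuple


def fit_exponential_decay(times: Sequence[float], rates: Sequence[float]) -> Tuple[float, float]:
    """Fit rate = a * exp(-b * t) by linear regression of ln(rate) on t.

    Returns (a, b). Requires positive rates and at least two distinct times.
    """
    n = len(times)
    logs = [math.log(r) for r in rates]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx                      # equals -b under the model
    intercept = mean_y - slope * mean_t    # equals ln(a) under the model
    return math.exp(intercept), -slope


# Synthetic, noise-free example (NOT data from the paper): a = 1e-5, b = 0.005.
true_a, true_b = 1e-5, 0.005
sampling_times = [1.0, 10.0, 100.0, 1000.0]  # years, hypothetical
rates = [true_a * math.exp(-true_b * t) for t in sampling_times]

a_hat, b_hat = fit_exponential_decay(sampling_times, rates)
print(a_hat, b_hat)
```

    Fitting on the log scale keeps rates that differ by orders of magnitude from being dominated by the largest values, although it implicitly assumes multiplicative rather than additive error.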

  14. Comparative Genomics of the Eukaryotes

    PubMed Central

    Rubin, Gerald M.; Yandell, Mark D.; Wortman, Jennifer R.; Gabor Miklos, George L.; Nelson, Catherine R.; Hariharan, Iswar K.; Fortini, Mark E.; Li, Peter W.; Apweiler, Rolf; Fleischmann, Wolfgang; Cherry, J. Michael; Henikoff, Steven; Skupski, Marian P.; Misra, Sima; Ashburner, Michael; Birney, Ewan; Boguski, Mark S.; Brody, Thomas; Brokstein, Peter; Celniker, Susan E.; Chervitz, Stephen A.; Coates, David; Cravchik, Anibal; Gabrielian, Andrei; Galle, Richard F.; Gelbart, William M.; George, Reed A.; Goldstein, Lawrence S. B.; Gong, Fangcheng; Guan, Ping; Harris, Nomi L.; Hay, Bruce A.; Hoskins, Roger A.; Li, Jiayin; Li, Zhenya; Hynes, Richard O.; Jones, S. J. M.; Kuehl, Peter M.; Lemaitre, Bruno; Littleton, J. Troy; Morrison, Deborah K.; Mungall, Chris; O'Farrell, Patrick H.; Pickeral, Oxana K.; Shue, Chris; Vosshall, Leslie B.; Zhang, Jiong; Zhao, Qi; Zheng, Xiangqun H.; Zhong, Fei; Zhong, Wenyan; Gibbs, Richard; Venter, J. Craig; Adams, Mark D.; Lewis, Suzanna

    2009-01-01

    A comparative analysis of the genomes of Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae—and the proteins they are predicted to encode—was undertaken in the context of cellular, developmental, and evolutionary processes. The nonredundant protein sets of flies and worms are similar in size and are only twice that of yeast, but different gene families are expanded in each genome, and the multidomain proteins and signaling pathways of the fly and worm are far more complex than those of yeast. The fly has orthologs to 177 of the 289 human disease genes examined and provides the foundation for rapid analysis of some of the basic processes involved in human disease. PMID:10731134

  15. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    PubMed Central

    Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-01-01

    While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490

  16. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    SciTech Connect

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database developed over a six-year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  17. Microeconomic principles explain an optimal genome size in bacteria.

    PubMed

    Ranea, Juan A G; Grant, Alastair; Thornton, Janet M; Orengo, Christine A

    2005-01-01

    Bacteria can clearly enhance their survival by expanding their genetic repertoire. However, the tight packing of the bacterial genome and the fact that the most evolved species do not necessarily have the biggest genomes suggest there are other evolutionary factors limiting their genome expansion. To clarify these restrictions on size, we studied those protein families contributing most significantly to bacterial-genome complexity. We found that all bacteria apply the same basic and ancestral 'molecular technology' to optimize their reproductive efficiency. The same microeconomics principles that define the optimum size in a factory can also explain the existence of a statistical optimum in bacterial genome size. This optimum is reached when the bacterial genome obtains the maximum metabolic complexity (revenue) for minimal regulatory genes (logistic cost).

  18. Comparative primate genomics: emerging patterns of genome content and dynamics.

    PubMed

    Rogers, Jeffrey; Gibbs, Richard A

    2014-05-01

    Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for various primate species, and analyses of several others are underway. Whole-genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other non-human primates offer valuable insights into genetic similarities and differences among species that are used as models for disease-related research. This Review summarizes current knowledge regarding primate genome content and dynamics, and proposes a series of goals for the near future.

  19. Comparative primate genomics: emerging patterns of genome content and dynamics

    PubMed Central

    Rogers, Jeffrey; Gibbs, Richard A.

    2014-01-01

    Preface Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for several primates, with analyses of several others underway. Whole genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other nonhuman primates provide valuable insight into genetic similarities and differences among species used as models for disease-related research. This review summarizes current knowledge regarding primate genome content and dynamics and offers a series of goals for the near future. PMID:24709753

  20. Comparative genomics tools applied to bioterrorism defence.

    PubMed

    Slezak, Tom; Kuczmarski, Tom; Ott, Linda; Torres, Clinton; Medeiros, Dan; Smith, Jason; Truitt, Brian; Mulakken, Nisha; Lam, Marisa; Vitalis, Elizabeth; Zemla, Adam; Zhou, Carol Ecale; Gardner, Shea

    2003-06-01

    Rapid advances in the genomic sequencing of bacteria and viruses over the past few years have made it possible to consider sequencing the genomes of all pathogens that affect humans and the crops and livestock upon which our lives depend. Recent events make it imperative that full genome sequencing be accomplished as soon as possible for pathogens that could be used as weapons of mass destruction or disruption. This sequence information must be exploited to provide rapid and accurate diagnostics to identify pathogens and distinguish them from harmless near-neighbours and hoaxes. The Chem-Bio Non-Proliferation (CBNP) programme of the US Department of Energy (DOE) began a large-scale effort of pathogen detection in early 2000 when it was announced that the DOE would be providing bio-security at the 2002 Winter Olympic Games in Salt Lake City, Utah. Our team at the Lawrence Livermore National Lab (LLNL) was given the task of developing reliable and validated assays for a number of the most likely bioterrorist agents. The short timeline led us to devise a novel system that utilised whole-genome comparison methods to rapidly focus on parts of the pathogen genomes that had a high probability of being unique. Assays developed with this approach have been validated by the Centers for Disease Control (CDC). They were used at the 2002 Winter Olympics, have entered the public health system, and have been in continual use for non-publicised aspects of homeland defence since autumn 2001. Assays have been developed for all major threat list agents for which adequate genomic sequence is available, as well as for other pathogens requested by various government agencies. Collaborations with comparative genomics algorithm developers have enabled our LLNL team to make major advances in pathogen detection, since many of the existing tools simply did not scale well enough to be of practical use for this application. 
    It is hoped that a discussion of a real-life practical application of ...

  1. Expansion of the Genomic Encyclopedia of Bacteria and Archaea

    SciTech Connect

    Rinke, Christian; Sczyrba, Alex; Malfatti, Stephanie; Lee, Janey; Cheng, Jan-Fang; Stepanauskas, Ramunas; Eisen, Jonathan A.; Hallam, Steven; Inskeep, William P.; Hedlund, Brian P.; Sievert, Stefan M.; Liu, Wen-Tso; Tsiamis, George; Hugenholtz, Philip; Woyke, Tanja

    2011-06-02

    To date the vast majority of bacterial and archaeal genomes sequenced are of rather limited phylogenetic diversity as they were chosen based on their physiology and/or medical importance. The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project (Wu et al. 2009) is aimed at systematically filling the gaps of the tree of life with phylogenetically diverse reference genomes. However, more than 99 percent of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes of these largely mysterious species. These limitations gave rise to the GEBA uncultured project. Here we propose to use single cell genomics to massively expand the Genomic Encyclopedia of Bacteria and Archaea by targeting 80 single cell representatives of uncultured candidate phyla which have no or very few cultured representatives. Generating these reference genomes of uncultured microbes will dramatically increase the discovery rate of novel protein families and biological functions, shed light on the numerous underrepresented phyla that likely play important roles in the environment, and will assist in improving the reconstruction of the evolutionary history of Bacteria and Archaea. Moreover, these data will improve our ability to interpret metagenomics sequence data from diverse environments, which will be of tremendous value for microbial ecology and evolutionary studies to come.

  2. Expansion of the Genomic Encyclopedia of Bacteria and Archaea

    SciTech Connect

    Rinke, Christian; Sczyrba, Alex; Malfatti, Stephanie; Lee, Janey; Cheng, Jan-Fang; Stepanauskas, Ramunas; Eisen, Jonathan A.; Hallam, Steven; Inskeep, William P.; Hedlund, Brian P.; Sievert, Stefan M.; Liu, Wen-Tso; Tsiamis, George; Hugenholtz, Philip; Woyke, Tanja

    2011-03-20

    To date the vast majority of bacterial and archaeal genomes sequenced are of rather limited phylogenetic diversity as they were chosen based on their physiology and/or medical importance. The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project (Wu et al. 2009) is aimed at systematically filling the gaps of the tree of life with phylogenetically diverse reference genomes. However, more than 99 percent of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes of these largely mysterious species. These limitations gave rise to the GEBA uncultured project. Here we propose to use single cell genomics to massively expand the Genomic Encyclopedia of Bacteria and Archaea by targeting 80 single cell representatives of uncultured candidate phyla which have no or very few cultured representatives. Generating these reference genomes of uncultured microbes will dramatically increase the discovery rate of novel protein families and biological functions, shed light on the numerous underrepresented phyla that likely play important roles in the environment, and will assist in improving the reconstruction of the evolutionary history of Bacteria and Archaea. Moreover, these data will improve our ability to interpret metagenomics sequence data from diverse environments, which will be of tremendous value for microbial ecology and evolutionary studies to come.

  3. Comparative mitochondrial genomics in zygomycetes: bacteria-like RNase P RNAs, mobile elements and a close source of the group I intron invasion in angiosperms

    PubMed Central

    Seif, Elias; Leigh, Jessica; Liu, Yu; Roewer, Ingeborg; Forget, Lise; Lang, B. Franz

    2005-01-01

    To generate data for comparative analyses of zygomycete mitochondrial gene expression, we sequenced mtDNAs of three distantly related zygomycetes, Rhizopus oryzae, Mortierella verticillata and Smittium culisetae. They all contain the standard fungal mitochondrial gene set, plus rnpB, the gene encoding the RNA subunit of the mitochondrial RNase P (mtP-RNA) and rps3, encoding ribosomal protein S3 (the latter lacking in R.oryzae). The mtP-RNAs of R.oryzae and of additional zygomycete relatives have the most eubacteria-like RNA structures among fungi. Precise mapping of the 5′ and 3′ termini of the R.oryzae and M.verticillata mtP-RNAs confirms their expression and processing at the exact sites predicted by secondary structure modeling. The 3′ RNA processing of zygomycete mitochondrial mRNAs, SSU-rRNA and mtP-RNA occurs at the C-rich sequence motifs similar to those identified in fission yeast and basidiomycete mtDNAs. The C-rich motifs are included in the mature transcripts, and are likely generated by exonucleolytic trimming of RNA 3′ termini. Zygomycete mtDNAs feature a variety of insertion elements: (i) mtDNAs of R.oryzae and M.verticillata were subject to invasions by double hairpin elements; (ii) genes of all three species contain numerous mobile group I introns, including one that is closest to an intron that invaded angiosperm mtDNAs; and (iii) at least one additional case of a mobile element, characterized by a homing endonuclease insertion between partially duplicated genes [Paquin,B., Laforest,M.J., Forget,L., Roewer,I., Wang,Z., Longcore,J. and Lang,B.F. (1997) Curr. Genet., 31, 380–395]. The combined mtDNA-encoded proteins contain insufficient phylogenetic signal to demonstrate monophyly of zygomycetes. PMID:15689432

  4. Comparative genomics of Mortierella elongata and its bacterial endosymbiont Mycoavidus cysteinexigens: Comparative genomics of Mortierella elongata

    DOE PAGES

    Uehling, J.; Gryganskyi, A.; Hameed, K.; ...

    2017-01-01

    Endosymbiosis of bacteria by eukaryotes is a defining feature of cellular evolution. In addition to well-known bacterial origins for mitochondria and chloroplasts, multiple origins of bacterial endosymbiosis are known within the cells of diverse animals, plants and fungi. Early-diverging lineages of terrestrial fungi harbor endosymbiotic bacteria belonging to the Burkholderiaceae. Furthermore, we sequenced the metagenome of the soil-inhabiting fungus Mortierella elongata and assembled the complete circular chromosome of its endosymbiont, Mycoavidus cysteinexigens, which we place within a lineage of endofungal symbionts that are sister clade to Burkholderia. The genome of M. elongata strain AG77 features a core set of primary metabolic pathways for degradation of simple carbohydrates and lipid biosynthesis, while the M. cysteinexigens (AG77) genome is reduced in size and function. Experiments using antibiotics to cure the endobacterium from the host demonstrate that the fungal host metabolism is highly modulated by presence/absence of M. cysteinexigens. In independent comparative phylogenomic analyses of fungal and bacterial genomes we find that they are consistent with an ancient origin for M. elongata-M. cysteinexigens symbiosis, most likely over 350 million years ago and concomitant with the terrestrialization of Earth and diversification of land fungi and plants.

  5. Datasets for evolutionary comparative genomics

    PubMed Central

    Liberles, David A

    2005-01-01

    Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species. PMID:16086856

  6. Gramene database: navigating plant comparative genomics resources

    USDA-ARS's Scientific Manuscript database

    Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationship...

  7. Elevated Rate of Genome Rearrangements in Radiation-Resistant Bacteria

    PubMed Central

    Repar, Jelena; Supek, Fran; Klanjscek, Tin; Warnecke, Tobias; Zahradka, Ksenija; Zahradka, Davor

    2017-01-01

    A number of bacterial, archaeal, and eukaryotic species are known for their resistance to ionizing radiation. One of the challenges these species face is a potent environmental source of DNA double-strand breaks, potential drivers of genome structure evolution. Efficient and accurate DNA double-strand break repair systems have been demonstrated in several unrelated radiation-resistant species and are putative adaptations to the DNA damaging environment. Such adaptations are expected to compensate for the genome-destabilizing effect of environmental DNA damage and may be expected to result in a more conserved gene order in radiation-resistant species. However, here we show that rates of genome rearrangements, measured as loss of gene order conservation with time, are higher in radiation-resistant species in multiple, phylogenetically independent groups of bacteria. Comparison of indicators of selection for genome organization between radiation-resistant and phylogenetically matched, nonresistant species argues against tolerance to disruption of genome structure as a strategy for radiation resistance. Interestingly, an important mechanism affecting genome rearrangements in prokaryotes, the symmetrical inversions around the origin of DNA replication, shapes genome structure of both radiation-resistant and nonresistant species. In conclusion, the opposing effects of environmental DNA damage and DNA repair result in elevated rates of genome rearrangements in radiation-resistant bacteria. PMID:28188144

  8. Cocoa/Cotton Comparative Genomics

    USDA-ARS's Scientific Manuscript database

    With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...

  9. Whole genome sequence of Desulfovibrio magneticus strain RS-1 revealed common gene clusters in magnetotactic bacteria

    PubMed Central

    Nakazawa, Hidekazu; Arakaki, Atsushi; Narita-Yamada, Sachiko; Yashiro, Isao; Jinno, Koji; Aoki, Natsuko; Tsuruyama, Ai; Okamura, Yoshiko; Tanikawa, Satoshi; Fujita, Nobuyuki; Takeyama, Haruko; Matsunaga, Tadashi

    2009-01-01

    Magnetotactic bacteria are ubiquitous microorganisms that synthesize intracellular magnetite particles (magnetosomes) by accumulating Fe ions from aquatic environments. Recent molecular studies, including comprehensive proteomic, transcriptomic, and genomic analyses, have considerably improved our hypotheses of the magnetosome-formation mechanism. However, most of these studies have been conducted using pure-cultured bacterial strains of α-proteobacteria. Here, we report the whole-genome sequence of Desulfovibrio magneticus strain RS-1, the only isolate of magnetotactic microorganisms classified under δ-proteobacteria. Comparative genomics of the RS-1 and four α-proteobacterial strains revealed the presence of three separate gene regions (nuo and mamAB-like gene clusters, and gene region of a cryptic plasmid) conserved in all magnetotactic bacteria. The nuo gene cluster, encoding NADH dehydrogenase (complex I), was also common to the genomes of three iron-reducing bacteria exhibiting uncontrolled extracellular and/or intracellular magnetite synthesis. A cryptic plasmid, pDMC1, encodes three homologous genes that exhibit high similarities with those of other magnetotactic bacterial strains. In addition, the mamAB-like gene cluster, encoding the key components for magnetosome formation such as iron transport and magnetosome alignment, was conserved only in the genomes of magnetotactic bacteria as a similar genomic island-like structure. Our findings suggest the presence of core genetic components for magnetosome biosynthesis; these genes may have been acquired into the magnetotactic bacterial genomes by multiple gene-transfer events during proteobacterial evolution. PMID:19675025

  10. Haemonchus contortus: Genome Structure, Organization and Comparative Genomics.

    PubMed

    Laing, R; Martinelli, A; Tracey, A; Holroyd, N; Gilleard, J S; Cotton, J A

    2016-01-01

    One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    PubMed Central

    Gil, Rosario; Silva, Francisco J.; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C. H. J.; Gross, Roy; Moya, Andrés

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019

  12. Enrichment of Root Endophytic Bacteria from Populus deltoides and Single-Cell-Genomics Analysis

    PubMed Central

    Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Michael S.; Yang, Zamin K.; Klingeman, Dawn M.; Land, Miriam L.; Allman, Steve L.; Lu, Tse-Yuan S.; Brown, Steven D.; Schadt, Christopher W.; Podar, Mircea; Doktycz, Mitchel J.

    2016-01-01

    ABSTRACT Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. Here, we describe a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria. IMPORTANCE Plant roots harbor a diverse collection of microbes that live within host tissues. To gain a comprehensive understanding of microbial adaptations to this endophytic lifestyle from strains that cannot be cultivated, it is necessary to separate bacterial cells from the predominance of plant tissue. This study provides a valuable approach for the separation and isolation of endophytic bacteria from plant root tissue. 
    Isolated live bacteria provide material for microbiome sequencing, single-cell genomics, and analyses ...

  13. Taxonomy of lice and their endosymbiotic bacteria in the post-genomic era.

    PubMed

    Boyd, B M; Reed, D L

    2012-04-01

    Recent studies of molecular and genomic data from the parasitic lice of birds and mammals, as well as their mutualistic endosymbiotic bacteria, are changing the phylogenetic relationships and taxonomy of these organisms. Phylogenetic studies of lice suggest that vertebrate parasitism arose multiple times from free-living book and bark lice. Molecular clocks show that the major families of lice arose in the late Mesozoic and radiated in the early Cenozoic, following the radiation of mammals and birds. The recent release of the human louse genome has provided new opportunities for research. The genome is being used to find new genetic markers for phylogenetics and population genetics, to understand the complex evolutionary relationships of mitochondrial genes, and to study genome evolution. Genomes are informing us not only about lice, but also about their obligate endosymbiotic bacteria. In contrast to lice and their hosts, lice and their endosymbionts do not share common evolutionary histories, suggesting that endosymbionts are either replaced over time or that there are multiple independent origins of symbiosis in lice. Molecular phylogenetics and whole genome sequencing have recently provided the first insights into the phylogenetic placement and metabolic characteristics of these distantly related bacteria. Comparative genomics between distantly related louse symbionts can provide insights into conserved metabolic functions and can help to explain how distantly related species are fulfilling their role as mutualistic symbionts. In lice and their endosymbionts, molecular data and genome sequencing are driving our understanding of evolutionary relationships and classification, and will for the foreseeable future.

  14. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    PubMed

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
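
    The pan- and core-genome terms used in the abstract above can be illustrated with a minimal sketch. It assumes each strain has already been reduced to a set of gene-family identifiers; the strain names, family names, and the function pan_and_core are hypothetical and are not taken from the study, whose actual clustering pipeline is not described here.

```python
def pan_and_core(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (pan-genome, core-genome) for a mapping of strain -> gene families.

    The pan-genome is the union of all gene families seen in any strain;
    the core-genome is the intersection, i.e. families present in every strain.
    """
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


# Toy, made-up input: three strains and a handful of gene families.
strains = {
    "strain_A": {"famA", "famB", "famC", "famX"},
    "strain_B": {"famA", "famB", "famC", "famY"},
    "strain_C": {"famA", "famB", "famZ"},
}
pan, core = pan_and_core(strains)
core_fraction = len(core) / len(pan)  # share of the pan-genome that is core
```

    In this toy input the core is the two families shared by all three strains, and the pan-genome grows with each strain-specific family, which is the behaviour usually described as an "open" pan-genome.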

  15. Comparative genomics of protoploid Saccharomycetaceae

    PubMed Central

    Souciet, Jean-Luc; Dujon, Bernard; Gaillardin, Claude; Johnston, Mark; Baret, Philippe V.; Cliften, Paul; Sherman, David J.; Weissenbach, Jean; Westhof, Eric; Wincker, Patrick; Jubin, Claire; Poulain, Julie; Barbe, Valérie; Ségurens, Béatrice; Artiguenave, François; Anthouard, Véronique; Vacherie, Benoit; Val, Marie-Eve; Fulton, Robert S.; Minx, Patrick; Wilson, Richard; Durrens, Pascal; Jean, Géraldine; Marck, Christian; Martin, Tiphaine; Nikolski, Macha; Rolland, Thomas; Seret, Marie-Line; Casarégola, Serge; Despons, Laurence; Fairhead, Cécile; Fischer, Gilles; Lafontaine, Ingrid; Leh, Véronique; Lemaire, Marc; de Montigny, Jacky; Neuvéglise, Cécile; Thierry, Agnès; Blanc-Lenfle, Isabelle; Bleykasten, Claudine; Diffels, Julie; Fritsch, Emilie; Frangeul, Lionel; Goëffon, Adrien; Jauniaux, Nicolas; Kachouri-Lafond, Rym; Payen, Célia; Potier, Serge; Pribylova, Lenka; Ozanne, Christophe; Richard, Guy-Franck; Sacerdot, Christine; Straub, Marie-Laure; Talla, Emmanuel

    2009-01-01

    Our knowledge of yeast genomes remains largely dominated by the extensive studies on Saccharomyces cerevisiae and the consequences of its ancestral duplication, leaving the evolution of the entire class of hemiascomycetes only partly explored. We concentrate here on five species of Saccharomycetaceae, a large subdivision of hemiascomycetes, that we call “protoploid” because they diverged from the S. cerevisiae lineage prior to its genome duplication. We determined the complete genome sequences of three of these species: Kluyveromyces (Lachancea) thermotolerans and Saccharomyces (Lachancea) kluyveri (two members of the newly described Lachancea clade), and Zygosaccharomyces rouxii. We included in our comparisons the previously available sequences of Kluyveromyces lactis and Ashbya (Eremothecium) gossypii. Despite their broad evolutionary range and significant individual variations in each lineage, the five protoploid Saccharomycetaceae share a core repertoire of approximately 3300 protein families and a high degree of conserved synteny. Synteny blocks were used to define gene orthology and to infer ancestors. Far from representing minimal genomes without redundancy, the five protoploid yeasts contain numerous copies of paralogous genes, either dispersed or in tandem arrays, that, altogether, constitute a third of each genome. Ancient, conserved paralogs as well as novel, lineage-specific paralogs were identified. PMID:19525356

  16. By their genes ye shall know them: genomic signatures of predatory bacteria

    PubMed Central

    Pasternak, Zohar; Pietrokovski, Shmuel; Rotem, Or; Gophna, Uri; Lurie-Weinberger, Mor N; Jurkevitch, Edouard

    2013-01-01

    Predatory bacteria are taxonomically disparate, exhibit diverse predatory strategies and are widely distributed in varied environments. To date, their predatory phenotypes cannot be discerned in genome sequence data thereby limiting our understanding of bacterial predation, and of its impact in nature. Here, we define the ‘predatome’, that is, sets of protein families that reflect the phenotypes of predatory bacteria. The proteomes of all 11 sequenced predatory bacteria, including two de novo sequenced genomes, and 19 non-predatory bacteria from across the phylogenetic and ecological landscapes were compared. Protein families discriminating between the two groups were identified and quantified, demonstrating that differences in the proteomes of predatory and non-predatory bacteria are large and significant. This analysis allows predictions to be made, as we show by confirming from genome data an over-looked bacterial predator. The predatome exhibits deficiencies in riboflavin and amino acid biosynthesis, suggesting that predators obtain them from their prey. In contrast, these genomes are highly enriched in adhesins, proteases and particular metabolic proteins, used for binding to, processing and consuming prey, respectively. Strikingly, predators and non-predators differ in isoprenoid biosynthesis: predators use the mevalonate pathway, whereas non-predators, like almost all bacteria, use the DOXP pathway. By defining predatory signatures in bacterial genomes, the predatory potential they encode can be uncovered, filling an essential gap for measuring bacterial predation in nature. Moreover, we suggest that full-genome proteomic comparisons are applicable to other ecological interactions between microbes, and provide a convenient and rational tool for the functional classification of bacteria. PMID:23190728

  17. Absence of genome reduction in diverse, facultative endohyphal bacteria

    PubMed Central

    Dougherty, Kevin; Arendt, Kayla R.; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T. B. K.; Ngan, Chew Yee; Daum, Chris; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Woyke, Tanja; Arnold, A. Elizabeth

    2017-01-01

    Fungi interact closely with bacteria, both on the surfaces of the hyphae and within their living tissues (i.e. endohyphal bacteria, EHB). These EHB can be obligate or facultative symbionts and can mediate diverse phenotypic traits in their hosts. Although EHB have been observed in many lineages of fungi, it remains unclear how widespread and general these associations are, and whether unifying ecological and genomic features can be found across EHB strains as a whole. We cultured 11 bacterial strains after they emerged from the hyphae of diverse Ascomycota that were isolated as foliar endophytes of cupressaceous trees, and generated nearly complete genome sequences for all. Unlike the genomes of largely obligate EHB, the genomes of these facultative EHB resembled those of closely related strains isolated from environmental sources. Although all analysed genomes encoded structures that could be used to interact with eukaryotic hosts, pathways previously implicated in maintenance and establishment of EHB symbiosis were not universally present across all strains. Independent isolation of two nearly identical pairs of strains from different classes of fungi, coupled with recent experimental evidence, suggests horizontal transfer of EHB across endophytic hosts. Given the potential for EHB to influence fungal phenotypes, these genomes could shed light on the mechanisms of plant growth promotion or stress mitigation by fungal endophytes during the symbiotic phase, as well as degradation of plant material during the saprotrophic phase. As such, these findings contribute to the illumination of a new dimension of functional biodiversity in fungi. PMID:28348879

  18. Genomics of lactic acid bacteria: Current status and potential applications.

    PubMed

    Wu, Chongde; Huang, Jun; Zhou, Rongqing

    2017-08-01

    Lactic acid bacteria (LAB) are widely used for the production of a variety of foods and feed raw materials, where they contribute to the flavor and texture of the fermented products. In addition, specific LAB strains are considered probiotic due to their health-promoting effects in consumers. Recently, genome sequencing of LAB has boomed, and the increased amount of published genomics data brings an unprecedented opportunity to reveal the important traits of LAB. This review describes recent progress in LAB genomics, with special emphasis on understanding industry-related physiological features based on genomics analysis. Moreover, strategies to engineer the metabolic capacity and stress tolerance of LAB for improved industrial performance are also discussed.

  19. Comparative Reannotation of 21 Aspergillus Genomes

    SciTech Connect

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain errors of various kinds, e.g. missing genes, incorrect exon-intron structures, 'chimeras' that fuse 2 or more real genes, or, conversely, real genes split into 2 or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models (~2 percent per genome), supported by comparative analysis, additionally correcting ~18 percent of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~3 percent increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.
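
    The select-and-iterate step described in the abstract above can be sketched in a toy form. This is only an illustration under strong assumptions, not the authors' implementation: a gene model is reduced to a tuple of exon lengths, the similarity score is invented for the example, and the names Model, similarity, and refine are hypothetical.

```python
Model = tuple[int, ...]  # toy gene model: exon lengths in order (an assumption)


def similarity(a: Model, b: Model) -> int:
    """Toy score (assumed): penalise differing exon counts heavily,
    then differences in total coding length. Higher is more similar."""
    return -(abs(len(a) - len(b)) * 1000 + abs(sum(a) - sum(b)))


def refine(candidates: dict[str, list[Model]], max_rounds: int = 100) -> dict[str, Model]:
    """For one orthologous locus, pick per genome the candidate model that best
    matches the models currently chosen in the other genomes; repeat until
    nothing changes (or max_rounds is reached, as a safety cap)."""
    # Start from the first candidate in each genome (the "initial annotation").
    chosen = {genome: models[0] for genome, models in candidates.items()}
    for _ in range(max_rounds):
        changed = False
        for genome, models in candidates.items():
            others = [m for g, m in chosen.items() if g != genome]

            def score(model: Model) -> int:
                return sum(similarity(model, other) for other in others)

            best = max(models, key=score)
            # Switch only on strict improvement, so ties keep the current model.
            if score(best) > score(chosen[genome]):
                chosen[genome] = best
                changed = True
        if not changed:
            break
    return chosen


# Made-up locus: genome_A starts with a single-exon model that disagrees
# with the two-exon structure seen in the other genomes.
locus = {
    "genome_A": [(300,), (100, 200)],
    "genome_B": [(100, 210)],
    "genome_C": [(90, 200), (290,)],
}
result = refine(locus)
```

    With this input the single-exon model in genome_A is replaced by its two-exon alternative after one pass, and a second pass finds no further change, which is the stopping condition the abstract describes.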

  20. Inference of self-regulated transcriptional networks by comparative genomics.

    PubMed

    Cornish, Joseph P; Matthews, Fialelei; Thomas, Julien R; Erill, Ivan

    2012-01-01

    The assumption of basic properties, like self-regulation, in simple transcriptional regulatory networks can be exploited to infer regulatory motifs from the growing amounts of genomic and meta-genomic data. These motifs can in principle be used to elucidate the nature and scope of transcriptional networks through comparative genomics. Here we assess the feasibility of this approach using the SOS regulatory network of Gram-positive bacteria as a test case. Using experimentally validated data, we show that the known regulatory motif can be inferred through the assumption of self-regulation. Furthermore, the inferred motif provides a more robust search pattern for comparative genomics than the experimental motifs defined in reference organisms. We take advantage of this robustness to generate a functional map of the SOS response in Gram-positive bacteria. Our results reveal definite differences in the composition of the LexA regulon between Firmicutes and Actinobacteria, and confirm that regulation of cell-division inhibition is a widespread characteristic of this network among Gram-positive bacteria.

  1. Polynucleobacter necessarius, a model for genome reduction in both free-living and symbiotic bacteria

    PubMed Central

    Boscaro, Vittorio; Felletti, Michele; Vannini, Claudia; Ackerman, Matthew S.; Chain, Patrick S. G.; Malfatti, Stephanie; Vergez, Lisa M.; Shin, Maria; Doak, Thomas G.; Lynch, Michael; Petroni, Giulio

    2013-01-01

    We present the complete genomic sequence of the essential symbiont Polynucleobacter necessarius (Betaproteobacteria), which is a valuable case study for several reasons. First, it is hosted by a ciliated protist, Euplotes; bacterial symbionts of ciliates are still poorly known because of a lack of extensive molecular data. Second, the single species P. necessarius contains both symbiotic and free-living strains, allowing for a comparison between closely related organisms with different ecologies. Third, free-living P. necessarius strains are exceptional by themselves because of their small genome size, reduced metabolic flexibility, and high worldwide abundance in freshwater systems. We provide a comparative analysis of P. necessarius metabolism and explore the peculiar features of a genome reduction that occurred on an already streamlined genome. We compare this unusual system with current hypotheses for genome erosion in symbionts and free-living bacteria, propose modifications to the presently accepted model, and discuss the potential consequences of translesion DNA polymerase loss. PMID:24167248

  2. Orthology for comparative genomics in the mouse genome database.

    PubMed

    Dolan, Mary E; Baldarelli, Richard M; Bello, Susan M; Ni, Li; McAndrews, Monica S; Bult, Carol J; Kadin, James A; Richardson, Joel E; Ringwald, Martin; Eppig, Janan T; Blake, Judith A

    2015-08-01

    The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.

  3. Gramene 2013: Comparative plant genomics resources

    USDA-ARS's Scientific Manuscript database

    Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework fo...

  4. Gramene: a growing plant comparative genomics resource

    USDA-ARS's Scientific Manuscript database

    Gramene (www.gramene.org) is a curated genetic, genomic and comparative genome analysis resource for the major crop species, such as rice, maize, wheat and many other plant (mainly grass) species. Gramene is an open-source project, with all data and software freely downloadable through the ftp site ...

  5. Comparative genomic analysis of esophageal cancers.

    PubMed

    Caygill, Christine P J; Gatenby, Piers A C; Herceg, Zdenko; Lima, Sheila C S; Pinto, Luis F R; Watson, Anthony; Wu, Ming-Shiang

    2014-09-01

    The following, from the 12th OESO World Conference: Cancers of the Esophagus, includes commentaries on comparative genomic analysis of esophageal cancers: genomic polymorphisms, the genetic and epigenetic drivers in esophageal cancers, and the collection of data in the UK Barrett's Oesophagus Registry.

  6. Absence of genome reduction in diverse, facultative endohyphal bacteria

    DOE PAGES

    Baltrus, David A.; Dougherty, Kevin; Arendt, Kayla R.; ...

    2017-02-28

    Fungi interact closely with bacteria, both on the surfaces of the hyphae and within their living tissues (i.e. endohyphal bacteria, EHB). These EHB can be obligate or facultative symbionts and can mediate diverse phenotypic traits in their hosts. Although EHB have been observed in many lineages of fungi, it remains unclear how widespread and general these associations are, and whether unifying ecological and genomic features can be found across EHB strains as a whole. We cultured 11 bacterial strains after they emerged from the hyphae of diverse Ascomycota that were isolated as foliar endophytes of cupressaceous trees, and generated nearly complete genome sequences for all. Unlike the genomes of largely obligate EHB, the genomes of these facultative EHB resembled those of closely related strains isolated from environmental sources. Although all analysed genomes encoded structures that could be used to interact with eukaryotic hosts, pathways previously implicated in maintenance and establishment of EHB symbiosis were not universally present across all strains. Independent isolation of two nearly identical pairs of strains from different classes of fungi, coupled with recent experimental evidence, suggests horizontal transfer of EHB across endophytic hosts. Given the potential for EHB to influence fungal phenotypes, these genomes could shed light on the mechanisms of plant growth promotion or stress mitigation by fungal endophytes during the symbiotic phase, as well as degradation of plant material during the saprotrophic phase. As such, these findings contribute to the illumination of a new dimension of functional biodiversity in fungi.

  7. Genome informatics and vaccine targets in Corynebacterium urealyticum using two whole genomes, comparative genomics, and reverse vaccinology.

    PubMed

    Guimarães, Luis; Soares, Siomar; Trost, Eva; Blom, Jochen; Ramos, Rommel; Silva, Artur; Barh, Debmalya; Azevedo, Vasco

    2015-01-01

    Corynebacterium urealyticum is an opportunistic pathogen that normally lives on skin and mucous membranes in humans. This high-GC Gram-positive bacterium can cause acute or encrusted cystitis, encrusted pyelitis, and pyelonephritis in immunocompromised patients. The bacterium is multidrug resistant, and knowledge about the genes that contribute to its virulence is very limited. Two complete genome sequences were used in this comparative genomic study: C. urealyticum DSM 7109 and C. urealyticum DSM 7111. We used comparative genomics strategies to compare the two strains, to analyze their metabolic pathways and genome plasticity, and to predict putative antigenic targets. The genomes of these two strains together encode 2,115 non-redundant coding sequences, 1,823 of which are common to both genomes. We identified 188 strain-specific genes in DSM 7109 and 104 strain-specific genes in DSM 7111. The high number of strain-specific genes may be a result of horizontal gene transfer triggered by the large number of transposons in the genomes of these two strains. Screening for virulence factors revealed the presence of the spaDEF operon, which encodes pili-forming proteins. Therefore, spaDEF may play a pivotal role in facilitating the adhesion of the pathogen to the host tissue. Application of the reverse vaccinology method revealed 19 putative antigenic proteins that may be used in future studies as candidate drug or vaccine targets. The genome features and the presence of virulence factors in genomic islands in the two strains of C. urealyticum provide insights into the lifestyle of this opportunistic pathogen and may be useful in developing future therapeutic strategies.
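
    The counts reported in this abstract are internally consistent (1,823 + 188 + 104 = 2,115), which a short set-based sketch makes explicit. The gene identifiers below are placeholders sized to match the reported counts, not real annotations, and the function name partition_gene_sets is hypothetical; the sketch also assumes that orthologous genes from both strains have already been collapsed to a shared identifier.

```python
def partition_gene_sets(strain_a, strain_b):
    """Split two sets of ortholog IDs into shared and strain-specific parts."""
    return {
        "core": strain_a & strain_b,    # present in both strains
        "only_a": strain_a - strain_b,  # specific to the first strain
        "only_b": strain_b - strain_a,  # specific to the second strain
        "pan": strain_a | strain_b,     # non-redundant union
    }


# Placeholder IDs (assumption): only the set sizes come from the abstract.
shared = {f"shared_{i}" for i in range(1823)}
dsm7109 = shared | {f"dsm7109_{i}" for i in range(188)}
dsm7111 = shared | {f"dsm7111_{i}" for i in range(104)}

parts = partition_gene_sets(dsm7109, dsm7111)
counts = {name: len(genes) for name, genes in parts.items()}
print(counts)  # {'core': 1823, 'only_a': 188, 'only_b': 104, 'pan': 2115}
```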

  8. Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont

    PubMed Central

    Lindsey, Amelia R. I.; Werren, John H.; Richards, Stephen; Stouthamer, Richard

    2016-01-01

    Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. PMID:27194801

  9. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world

    PubMed Central

    Koonin, Eugene V.; Wolf, Yuri I.

    2008-01-01

    The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often, distant organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution. PMID:18948295
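
    The doubling times quoted in this abstract imply simple exponential growth in the number of sequenced genomes, N(t) = N0 * 2^(t/T), with T the doubling time. The following is a minimal sketch of that arithmetic; the function name fold_increase and the 60-month horizon are chosen here for illustration, and only the 20- and 34-month doubling times come from the abstract.

```python
def fold_increase(elapsed_months, doubling_time_months):
    """Fold growth after elapsed_months given a constant doubling time."""
    if doubling_time_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (elapsed_months / doubling_time_months)


# Doubling times as stated in the abstract (approximate values).
BACTERIA_DOUBLING_MONTHS = 20
ARCHAEA_DOUBLING_MONTHS = 34

# Illustrative horizon (assumption): five years.
for label, doubling in (("bacteria", BACTERIA_DOUBLING_MONTHS),
                        ("archaea", ARCHAEA_DOUBLING_MONTHS)):
    print(label, round(fold_increase(60, doubling), 2))
```

    Under these assumptions, five years corresponds to three bacterial doublings (an 8-fold increase) but fewer than two archaeal doublings.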

  10. Mycobacterial species as case-study of comparative genome analysis.

    PubMed

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-02-08

    The genus Mycobacterium comprises more than 120 species, including important human pathogens that cause major public health problems and illnesses. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostic tools for controlling Mycobacterial diseases. In the present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K-10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435, M. ulcerans Agy99, and M. vanbaalenii PYR-1. For this purpose, the genomes were compared by length, GC content, and number of genes in different databases (GenBank, RefSeq, and Prodigal). A BLAST matrix of these genomes was computed to summarize the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genomes have been defined for twelve Mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv, which gives a good overview of this genome. To examine the phylogenetic relationships among these bacteria, a phylogenetic tree has been constructed from the 16S rRNA gene for tuberculosis and non-tuberculosis Mycobacteria to understand the evolutionary events of these species.

  11. Comparative assembly hubs: Web-accessible browsers for comparative genomics

    PubMed Central

    Nguyen, Ngan; Hickey, Glenn; Raney, Brian J.; Armstrong, Joel; Clawson, Hiram; Zweig, Ann; Karolchik, Donna; Kent, William James; Haussler, David; Paten, Benedict

    2014-01-01

    Motivation: Researchers now have access to large volumes of genome sequences for comparative analysis, some generated by the plethora of public sequencing projects and, increasingly, from individual efforts. It is not possible, or necessarily desirable, that the public genome browsers attempt to curate all these data. Instead, a wealth of powerful tools is emerging to empower users to create their own visualizations and browsers. Results: We introduce a pipeline to easily generate collections of Web-accessible UCSC Genome Browsers interrelated by an alignment. It is intended to democratize our comparative genomic browser resources, serving the broad and growing community of evolutionary genomicists and facilitating easy public sharing via the Internet. Using the alignment, all annotations and the alignment itself can be efficiently viewed with reference to any genome in the collection, symmetrically. A new, intelligently scaled alignment display makes it simple to view all changes between the genomes at all levels of resolution, from substitutions to complex structural rearrangements, including duplications. To demonstrate this work, we create a comparative assembly hub containing 57 Escherichia coli and 9 Shigella genomes and show examples that highlight their unique biology. Availability and implementation: The source code is available as open source at: https://github.com/glennhickey/progressiveCactus The E.coli and Shigella genome hub is now a public hub listed on the UCSC browser public hubs Web page. Contact: benedict@soe.ucsc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25138168

  12. Homology-independent metrics for comparative genomics.

    PubMed

    Coutinho, Tarcisio José Domingos; Franco, Glória Regina; Lobo, Francisco Pereira

    2015-01-01

    A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of "genomic dark matter" with no significant similarity - and, consequently, no inferred homology to any other known sequence - from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They were also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference.
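
    Two of the compositional metrics named in this abstract, GC content and the dinucleotide odds ratio, can be computed from sequence alone. The sketch below is an illustration, not code from the review: it uses the plain single-strand form rho_xy = f_xy / (f_x * f_y), whereas published variants often pool the sequence with its reverse complement first. The function names and the handling of non-ACGT characters are assumptions.

```python
from collections import Counter

BASES = "ACGT"


def gc_content(seq):
    """Fraction of G and C among the unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in BASES)
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (seq.count("G") + seq.count("C")) / total


def dinucleotide_odds_ratio(seq):
    """Observed/expected ratio for each dinucleotide (single-strand form).

    rho_xy = f_xy / (f_x * f_y), using overlapping dinucleotides.
    Dinucleotides whose component bases never occur are omitted.
    """
    seq = seq.upper()
    base_counts = Counter(base for base in seq if base in BASES)
    n_bases = sum(base_counts.values())
    # Overlapping pairs; pairs containing ambiguous characters are skipped.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pair_counts = Counter(p for p in pairs if p[0] in BASES and p[1] in BASES)
    n_pairs = sum(pair_counts.values())
    if n_bases == 0 or n_pairs == 0:
        raise ValueError("sequence too short or has no A/C/G/T bases")
    ratios = {}
    for x in BASES:
        for y in BASES:
            expected = (base_counts[x] / n_bases) * (base_counts[y] / n_bases)
            if expected > 0:
                ratios[x + y] = (pair_counts[x + y] / n_pairs) / expected
    return ratios


print(gc_content("ATGC"))  # 0.5
rho = dinucleotide_odds_ratio("ACACACAC")
print(round(rho["AC"], 3), round(rho["CA"], 3), rho["AA"])
```

    In the toy sequence "ACACACAC", AC and CA are over-represented relative to base composition (ratios above 1), while AA never occurs (ratio 0). Values near 1 indicate that a dinucleotide occurs about as often as the base frequencies predict.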

  13. Enrichment of Root Endophytic Bacteria from Populus deltoides and Single-Cell-Genomics Analysis

    DOE PAGES

    Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Jr., Michael S.; ...

    2016-07-15

    Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. In this paper, we present a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Finally, comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria.

  14. Enrichment of Root Endophytic Bacteria from Populus deltoides and Single-Cell-Genomics Analysis

    SciTech Connect

    Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Jr., Michael S.; Yang, Zamin Koo; Klingeman, Dawn Marie; Land, Miriam L.; Allman, Steve L.; Lu, Tse-Yuan S.; Brown, Steven D.; Schadt, Christopher Warren; Podar, Mircea; Doktycz, Mitchel J.; Pelletier, Dale A.

    2016-07-15

    Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. In this paper, we present a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Finally, comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria.

  15. Prevalent genome streamlining and latitudinal divergence of planktonic bacteria in the surface ocean.

    PubMed

    Swan, Brandon K; Tupper, Ben; Sczyrba, Alexander; Lauro, Federico M; Martinez-Garcia, Manuel; González, José M; Luo, Haiwei; Wright, Jody J; Landry, Zachary C; Hanson, Niels W; Thompson, Brian P; Poulton, Nicole J; Schwientek, Patrick; Acinas, Silvia G; Giovannoni, Stephen J; Moran, Mary Ann; Hallam, Steven J; Cavicchioli, Ricardo; Woyke, Tanja; Stepanauskas, Ramunas

    2013-07-09

    Planktonic bacteria dominate surface ocean biomass and influence global biogeochemical processes, but remain poorly characterized owing to difficulties in cultivation. Using large-scale single cell genomics, we obtained insight into the genome content and biogeography of many bacterial lineages inhabiting the surface ocean. We found that, compared with existing cultures, natural bacterioplankton have smaller genomes, fewer gene duplications, and are depleted in guanine and cytosine, noncoding nucleotides, and genes encoding transcription, signal transduction, and noncytoplasmic proteins. These findings provide strong evidence that genome streamlining and oligotrophy are prevalent features among diverse, free-living bacterioplankton, whereas existing laboratory cultures consist primarily of copiotrophs. The apparent ubiquity of metabolic specialization and mixotrophy, as predicted from single cell genomes, also may contribute to the difficulty in bacterioplankton cultivation. Using metagenome fragment recruitment against single cell genomes, we show that the global distribution of surface ocean bacterioplankton correlates with temperature and latitude and is not limited by dispersal at the time scales required for nucleotide substitution to exceed the current operational definition of bacterial species. Single cell genomes with highly similar small subunit rRNA gene sequences exhibited significant genomic and biogeographic variability, highlighting challenges in the interpretation of individual gene surveys and metagenome assemblies in environmental microbiology. Our study demonstrates the utility of single cell genomics for gaining an improved understanding of the composition and dynamics of natural microbial assemblages.

  16. Prevalent genome streamlining and latitudinal divergence of planktonic bacteria in the surface ocean

    PubMed Central

    Swan, Brandon K.; Tupper, Ben; Sczyrba, Alexander; Lauro, Federico M.; Martinez-Garcia, Manuel; González, José M.; Luo, Haiwei; Wright, Jody J.; Landry, Zachary C.; Hanson, Niels W.; Thompson, Brian P.; Poulton, Nicole J.; Schwientek, Patrick; Acinas, Silvia G.; Giovannoni, Stephen J.; Moran, Mary Ann; Hallam, Steven J.; Cavicchioli, Ricardo; Woyke, Tanja; Stepanauskas, Ramunas

    2013-01-01

    Planktonic bacteria dominate surface ocean biomass and influence global biogeochemical processes, but remain poorly characterized owing to difficulties in cultivation. Using large-scale single cell genomics, we obtained insight into the genome content and biogeography of many bacterial lineages inhabiting the surface ocean. We found that, compared with existing cultures, natural bacterioplankton have smaller genomes, fewer gene duplications, and are depleted in guanine and cytosine, noncoding nucleotides, and genes encoding transcription, signal transduction, and noncytoplasmic proteins. These findings provide strong evidence that genome streamlining and oligotrophy are prevalent features among diverse, free-living bacterioplankton, whereas existing laboratory cultures consist primarily of copiotrophs. The apparent ubiquity of metabolic specialization and mixotrophy, as predicted from single cell genomes, also may contribute to the difficulty in bacterioplankton cultivation. Using metagenome fragment recruitment against single cell genomes, we show that the global distribution of surface ocean bacterioplankton correlates with temperature and latitude and is not limited by dispersal at the time scales required for nucleotide substitution to exceed the current operational definition of bacterial species. Single cell genomes with highly similar small subunit rRNA gene sequences exhibited significant genomic and biogeographic variability, highlighting challenges in the interpretation of individual gene surveys and metagenome assemblies in environmental microbiology. Our study demonstrates the utility of single cell genomics for gaining an improved understanding of the composition and dynamics of natural microbial assemblages. PMID:23801761

  17. A White Paper on Nematode Comparative Genomics

    PubMed Central

    Bird, David McK.; Blaxter, Mark L.; McCarter, James P.; Mitreva, Makedonka; Sternberg, Paul W.; Thomas, W. Kelley

    2005-01-01

    In response to the new opportunities for genome sequencing and comparative genomics, the Society of Nematology (SON) formed a committee to develop a white paper in support of the broad scientific needs associated with this phylum and interests of SON members. Although genome sequencing is expensive, the data generated are unique in biological systems in that genomes have the potential to be complete (every base of the genome can be accounted for), accurate (the data are digital and not subject to stochastic variation), and permanent (once obtained, the genome of a species does not need to be experimentally re-sampled). The availability of complete, accurate, and permanent genome sequences from diverse nematode species will underpin future studies into the biology and evolution of this phylum and the ecological associations (particularly parasitic) nematodes have with other organisms. We anticipate that upwards of 100 nematode genomes will be solved to varying levels of completion in the coming decade and suggest biological and practical considerations to guide the selection of the most informative taxa for sequencing. PMID:19262884

  18. A white paper on nematode comparative genomics.

    PubMed

    Bird, David McK; Blaxter, Mark L; McCarter, James P; Mitreva, Makedonka; Sternberg, Paul W; Thomas, W Kelley

    2005-12-01

    In response to the new opportunities for genome sequencing and comparative genomics, the Society of Nematology (SON) formed a committee to develop a white paper in support of the broad scientific needs associated with this phylum and interests of SON members. Although genome sequencing is expensive, the data generated are unique in biological systems in that genomes have the potential to be complete (every base of the genome can be accounted for), accurate (the data are digital and not subject to stochastic variation), and permanent (once obtained, the genome of a species does not need to be experimentally re-sampled). The availability of complete, accurate, and permanent genome sequences from diverse nematode species will underpin future studies into the biology and evolution of this phylum and the ecological associations (particularly parasitic) nematodes have with other organisms. We anticipate that upwards of 100 nematode genomes will be solved to varying levels of completion in the coming decade and suggest biological and practical considerations to guide the selection of the most informative taxa for sequencing.

  19. Genomes at the interface between bacteria and organelles.

    PubMed Central

    Douglas, Angela E; Raven, John A

    2003-01-01

    The topic of the transition of the genome of a free-living bacterial organism to that of an organelle is addressed by considering three cases. Two of these are relatively clear-cut as involving respectively organisms (cyanobacteria) and organelles (plastids). Cyanobacteria are usually free-living but some are involved in symbioses with a range of eukaryotes in which the cyanobacterial partner contributes photosynthesis, nitrogen fixation, or both of these. In several of these symbioses the cyanobacterium is vertically transmitted, and in a few instances, sufficient unsuccessful attempts have been made to culture the cyanobiont independently for the association to be considered obligate for the cyanobacterium. Plastids clearly had a cyanobacterial ancestor but cannot grow independently of the host eukaryote. Plastid genomes have at most 15% of the number of genes encoded by the cyanobacterium with the smallest number of genes; more genes than are retained in the plastid genome have been transferred to the eukaryote nuclear genome, while the rest of the cyanobacterial genes have been lost. Even the most cyanobacteria-like plastids, for example the "cyanelles" of glaucocystophyte algae, are functionally and genetically very similar to other plastids and give little help in indicating intermediates in the evolution of plastids. The third case considered is the vertically transmitted intracellular bacterial symbionts of insects where the symbiosis is usually obligate for both partners. The number of genes encoded by the genomes of these obligate symbionts is intermediate between that of organelles and that of free-living bacteria, and the genomes of the insect symbionts also show rapid rates of sequence evolution and AT (adenine, thymine) bias. Genetically and functionally, these insect symbionts show considerable similarity to organelles. PMID:12594915

  20. The plant growth-promoting bacteria Azospirillum amazonense: genomic versatility and phytohormone pathway.

    PubMed

    Cecagno, Ricardo; Fritsch, Tiago Ebert; Schrank, Irene Silveira

    2015-01-01

    The rhizosphere bacterium Azospirillum amazonense associates with plant roots to promote plant growth. Variation in replicon numbers and rearrangements is common among Azospirillum strains, and characterization of these naturally occurring differences can improve our understanding of genome evolution. We performed an in silico comparative genomic analysis to understand the genomic plasticity of A. amazonense. The number of A. amazonense-specific coding sequences was similar when compared with the six closely related bacteria, whether or not they belong to the Azospirillum genus. Our results suggest that the versatile gene repertoire found in the A. amazonense genome could have been acquired from distantly related bacteria through horizontal transfer. Furthermore, the identification of coding sequences related to phytohormone production, such as flavin-monooxygenase and aldehyde oxidase, is likely to represent the tryptophan-dependent TAM pathway for auxin production in this bacterium. Moreover, the presence of the coding sequence for nitrilase indicates the presence of the alternative route that uses IAN as an intermediate for auxin synthesis, but it remains to be established whether the IAN pathway is the Trp-independent route. Future investigations are necessary to support the hypothesis that its genomic structure has evolved to meet the requirement for adaptation to the rhizosphere and interaction with host plants.

  1. The Plant Growth-Promoting Bacteria Azospirillum amazonense: Genomic Versatility and Phytohormone Pathway

    PubMed Central

    Cecagno, Ricardo; Fritsch, Tiago Ebert; Schrank, Irene Silveira

    2015-01-01

    The rhizosphere bacterium Azospirillum amazonense associates with plant roots to promote plant growth. Variation in replicon numbers and rearrangements is common among Azospirillum strains, and characterization of these naturally occurring differences can improve our understanding of genome evolution. We performed an in silico comparative genomic analysis to understand the genomic plasticity of A. amazonense. The number of A. amazonense-specific coding sequences was similar when compared with the six closely related bacteria, whether or not they belong to the Azospirillum genus. Our results suggest that the versatile gene repertoire found in the A. amazonense genome could have been acquired from distantly related bacteria through horizontal transfer. Furthermore, the identification of coding sequences related to phytohormone production, such as flavin-monooxygenase and aldehyde oxidase, is likely to represent the tryptophan-dependent TAM pathway for auxin production in this bacterium. Moreover, the presence of the coding sequence for nitrilase indicates the presence of the alternative route that uses IAN as an intermediate for auxin synthesis, but it remains to be established whether the IAN pathway is the Trp-independent route. Future investigations are necessary to support the hypothesis that its genomic structure has evolved to meet the requirement for adaptation to the rhizosphere and interaction with host plants. PMID:25866821

  2. Comparative genomic hybridization with single cells after whole genome amplification

    SciTech Connect

    Haddad, B.R.; Baldini, A.; Hughes, M.R.

    1994-09-01

    Conventional karyotype analysis is the ideal way to diagnose chromosomal imbalances. However, it requires cell culture and chromosome preparation. There are instances where only a very small number of cells is available for cytogenetic evaluation and chromosomes cannot be obtained. Comparative genomic hybridization (CGH) is a novel molecular cytogenetic technique that provides information about genetic imbalances affecting the genome. The power of this technique lies in its ability to detect genetic imbalances using total genomic DNA. We have previously demonstrated the feasibility of whole genome amplification from single cells for subsequent analysis of multiple genetic loci by PCR. In the present work, we combine whole genome amplification with CGH to detect chromosomal imbalances from small numbers of cells. Both cytogenetically normal and abnormal cells were individually picked by micromanipulation and subjected to whole genome amplification using random oligonucleotide primers. Amplified test and control DNA were differentially labeled by incorporation of digoxigenin or biotin, mixed together and hybridized to normal male metaphase spreads. Hybridization was detected with two fluorochromes, rhodamine-anti-digoxigenin and FITC-avidin. The ratio of intensities of the two fluorochromes along the target chromosomes was analyzed using locally developed computer imaging software. Using the combination of whole genome amplification and CGH, we were able to detect different chromosomal aneuploidies from 30, 20, and 10 cells. It can also be applied to the analysis of fetal cells sorted from maternal circulation, or to tumor cells obtained from needle biopsies or from different body fluids and effusions. Finally, its successful application to single cells will have a great impact on preimplantation diagnosis.
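
    The ratio analysis described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' imaging software: the function names, the pseudocount, and the gain/loss thresholds are assumptions chosen for illustration, and the intensity values are toy numbers.

```python
import math

def cgh_log2_ratios(test, control, pseudocount=1.0):
    """Return log2(test/control) for paired fluorescence intensities.

    The pseudocount (an assumed value) avoids division by zero.
    """
    if len(test) != len(control):
        raise ValueError("test and control profiles must have equal length")
    return [math.log2((t + pseudocount) / (c + pseudocount))
            for t, c in zip(test, control)]

def call_imbalance(log2_ratio, gain=0.25, loss=-0.25):
    """Label one position; the thresholds are illustrative assumptions."""
    if log2_ratio >= gain:
        return "gain"
    if log2_ratio <= loss:
        return "loss"
    return "balanced"

# Toy intensities at three positions along a target chromosome.
test_signal = [100.0, 150.0, 50.0]
control_signal = [100.0, 100.0, 100.0]
ratios = cgh_log2_ratios(test_signal, control_signal)
calls = [call_imbalance(r) for r in ratios]
```

    With these toy values, equal intensities are labeled balanced, a roughly 1.5-fold test signal is labeled a gain, and a roughly 0.5-fold test signal is labeled a loss.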

  3. Sequencing and comparing whole mitochondrial genomes of animals

    SciTech Connect

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  4. Genome level analysis of bacteriocins of lactic acid bacteria.

    PubMed

    Singh, Neetigyata Pratap; Tiwari, Abhay; Bansal, Ankiti; Thakur, Shruti; Sharma, Garima; Gabrani, Reema

    2015-06-01

    Bacteriocins are antimicrobial peptides that are ribosomally synthesized by nearly all bacterial species. LABs (lactic acid bacteria) are a diverse group of bacteria that includes around 20 genera of various species. Though LABs have a tremendous potential for production of antimicrobial peptides, this group of bacteria is still underexplored for bacteriocins. To study the diversity among bacteriocin-encoding clusters and the putative bacteriocin precursors, genome mining was performed on 20 different species of LAB not reported to be bacteriocin producers. Phylogenetic trees of gyrB, rpoB, and 16S rRNA were constructed using MEGA6 software to analyze the diversity among strains. The putative bacteriocin operons identified were found to be diverse and were further characterized on the basis of physicochemical properties and secondary structure. The presence of at least two cysteine residues in most of the observed putative bacteriocins leads to disulphide bond formation and provides stability. Our data suggest that LABs are a prolific source of low molecular weight non-modified peptides. Copyright © 2015. Published by Elsevier Ltd.

  5. Comparative genomics and evolution of transcriptional regulons in Proteobacteria

    PubMed Central

    Kazakov, Alexey E.; Ravcheev, Dmitry A.; Stepanova, Vita V.; Novichkov, Pavel S.

    2016-01-01

    Comparative genomics approaches are broadly used for analysis of transcriptional regulation in bacterial genomes. In this work, we identified binding sites and reconstructed regulons for 33 orthologous groups of transcription factors (TFs) in 196 reference genomes from 21 taxonomic groups of Proteobacteria. Overall, we predicted over 10 600 TF binding sites and identified more than 15 600 target genes for 1896 TFs constituting the studied orthologous groups of regulators. These include a set of orthologues for 21 metabolism-associated TFs from Escherichia coli and/or Shewanella that are conserved in five or more taxonomic groups and several additional TFs that represent non-orthologous substitutions of the metabolic regulators in some lineages of Proteobacteria. By comparing gene contents of the reconstructed regulons, we identified the core, taxonomy-specific and genome-specific TF regulon members and classified them by their metabolic functions. Detailed analysis of ArgR, TyrR, TrpR, HutC, HypR and other amino-acid-specific regulons demonstrated remarkable differences in regulatory strategies used by various lineages of Proteobacteria. The obtained genomic collection of in silico reconstructed TF regulons contains a large number of new regulatory interactions that await future experimental validation. The collection provides a framework for future evolutionary studies of transcriptional regulatory networks in Bacteria. It can also be used for functional annotation of putative metabolic transporters and enzymes that are abundant in the reconstructed regulons. PMID:28348857
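
    The comparison of regulon gene contents mentioned in this abstract can be sketched with plain set operations. The abstract does not give the authors' criteria, so the rules below (core = present in every genome, genome-specific = present in exactly one genome, taxonomy-specific = present in several genomes of a single taxonomic group) are assumptions, and the genome, taxon, and gene names are hypothetical.

```python
from collections import defaultdict

def classify_regulon_members(regulons, taxon_of):
    """Label each gene by how widely it occurs across reconstructed regulons.

    regulons: dict mapping genome name -> set of regulated genes
    taxon_of: dict mapping genome name -> taxonomic group
    The category rules are illustrative assumptions, not the paper's.
    """
    genomes_with = defaultdict(set)
    for genome, genes in regulons.items():
        for gene in genes:
            genomes_with[gene].add(genome)

    all_genomes = set(regulons)
    labels = {}
    for gene, genomes in genomes_with.items():
        taxa = {taxon_of[g] for g in genomes}
        if genomes == all_genomes:
            labels[gene] = "core"
        elif len(genomes) == 1:
            labels[gene] = "genome-specific"
        elif len(taxa) == 1:
            labels[gene] = "taxonomy-specific"
        else:
            labels[gene] = "other"
    return labels

# Hypothetical toy input: three genomes in two taxonomic groups.
regulons = {
    "g1": {"geneA", "geneB", "geneC"},
    "g2": {"geneA", "geneB", "geneE"},
    "g3": {"geneA", "geneD", "geneE"},
}
taxon_of = {"g1": "T1", "g2": "T1", "g3": "T2"}
labels = classify_regulon_members(regulons, taxon_of)
```

    In this toy input, geneA is labeled core, geneB taxonomy-specific, geneC and geneD genome-specific, and geneE falls into the catch-all "other" category because it spans two taxonomic groups without being universal.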

  6. Comparative genomics and evolution of transcriptional regulons in Proteobacteria.

    PubMed

    Leyn, Semen A; Suvorova, Inna A; Kazakov, Alexey E; Ravcheev, Dmitry A; Stepanova, Vita V; Novichkov, Pavel S; Rodionov, Dmitry A

    2016-07-01

    Comparative genomics approaches are broadly used for analysis of transcriptional regulation in bacterial genomes. In this work, we identified binding sites and reconstructed regulons for 33 orthologous groups of transcription factors (TFs) in 196 reference genomes from 21 taxonomic groups of Proteobacteria. Overall, we predicted over 10 600 TF binding sites and identified more than 15 600 target genes for 1896 TFs constituting the studied orthologous groups of regulators. These include a set of orthologues for 21 metabolism-associated TFs from Escherichia coli and/or Shewanella that are conserved in five or more taxonomic groups and several additional TFs that represent non-orthologous substitutions of the metabolic regulators in some lineages of Proteobacteria. By comparing gene contents of the reconstructed regulons, we identified the core, taxonomy-specific and genome-specific TF regulon members and classified them by their metabolic functions. Detailed analysis of ArgR, TyrR, TrpR, HutC, HypR and other amino-acid-specific regulons demonstrated remarkable differences in regulatory strategies used by various lineages of Proteobacteria. The obtained genomic collection of in silico reconstructed TF regulons contains a large number of new regulatory interactions that await future experimental validation. The collection provides a framework for future evolutionary studies of transcriptional regulatory networks in Bacteria. It can also be used for functional annotation of putative metabolic transporters and enzymes that are abundant in the reconstructed regulons.

  7. VISTA - computational tools for comparative genomics

    SciTech Connect

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin, Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes the kinesin family member 3A (KIF3A) protein.

  8. Ebolavirus comparative genomics

    SciTech Connect

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Ussery, David W.

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
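
    The oligomer frequency analysis mentioned in this abstract can be sketched as follows. This is a generic k-mer comparison, not the authors' pipeline: the choice of k, the DNA alphabet (sequences written with T rather than U), and the Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math

def kmer_frequencies(seq, k=3):
    """Return relative frequencies of all 4**k DNA k-mers, in a fixed order.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]

def euclidean_distance(p, q):
    """Distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Toy sequences; real genomes would be read from sequence files.
freqs = kmer_frequencies("ACGTACGT", k=2)
d_same = euclidean_distance(freqs, freqs)
d_diff = euclidean_distance(kmer_frequencies("AAAA", k=2),
                            kmer_frequencies("CCCC", k=2))
```

    Genomes with similar oligomer usage yield small distances, so a matrix of pairwise distances could then be clustered to look for groups such as the one the abstract reports for Filoviridae.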

  9. Gramene 2013: comparative plant genomics resources

    PubMed Central

    Monaco, Marcela K.; Stein, Joshua; Naithani, Sushma; Wei, Sharon; Dharmawardhana, Palitha; Kumari, Sunita; Amarasinghe, Vindhya; Youens-Clark, Ken; Thomason, James; Preece, Justin; Pasternak, Shiran; Olson, Andrew; Jiao, Yinping; Lu, Zhenyuan; Bolser, Dan; Kerhornou, Arnaud; Staines, Dan; Walts, Brandon; Wu, Guanming; D’Eustachio, Peter; Haw, Robin; Croft, David; Kersey, Paul J.; Stein, Lincoln; Jaiswal, Pankaj; Ware, Doreen

    2014-01-01

    Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. Whole-genome alignments complemented by phylogenetic gene family trees help infer syntenic and orthologous relationships. Genetic variation data, sequences and genome mappings available for 10 species, including Arabidopsis, rice and maize, help infer putative variant effects on genes and transcripts. The pathways section also hosts 10 species-specific metabolic pathways databases developed in-house or by our collaborators using Pathway Tools software, which facilitates searches for pathway, reaction and metabolite annotations, and allows analyses of user-defined expression datasets. Recently, we released a Plant Reactome portal featuring 133 curated rice pathways. This portal will be expanded for Arabidopsis, maize and other plant species. We continue to provide genetic and QTL maps and marker datasets developed by crop researchers. The project provides a unique community platform to support scientific research in plant genomics including studies in evolution, genetics, plant breeding, molecular biology, biochemistry and systems biology. PMID:24217918

  10. Comparative and demographic analysis of orangutan genomes

    PubMed Central

    Locke, Devin P.; Hillier, LaDeana W.; Warren, Wesley C.; Worley, Kim C.; Nazareth, Lynne V.; Muzny, Donna M.; Yang, Shiaw-Pyng; Wang, Zhengyuan; Chinwalla, Asif T.; Minx, Pat; Mitreva, Makedonka; Cook, Lisa; Delehaunty, Kim D.; Fronick, Catrina; Schmidt, Heather; Fulton, Lucinda A.; Fulton, Robert S.; Nelson, Joanne O.; Magrini, Vincent; Pohl, Craig; Graves, Tina A.; Markovic, Chris; Cree, Andy; Dinh, Huyen H.; Hume, Jennifer; Kovar, Christie L.; Fowler, Gerald R.; Lunter, Gerton; Meader, Stephen; Heger, Andreas; Ponting, Chris P.; Marques-Bonet, Tomas; Alkan, Can; Chen, Lin; Cheng, Ze; Kidd, Jeffrey M.; Eichler, Evan E.; White, Simon; Searle, Stephen; Vilella, Albert J.; Chen, Yuan; Flicek, Paul; Ma, Jian; Raney, Brian; Suh, Bernard; Burhans, Richard; Herrero, Javier; Haussler, David; Faria, Rui; Fernando, Olga; Darré, Fleur; Farré, Domènec; Gazave, Elodie; Oliva, Meritxell; Navarro, Arcadi; Roberto, Roberta; Capozzi, Oronzo; Archidiacono, Nicoletta; Valle, Giuliano Della; Purgato, Stefania; Rocchi, Mariano; Konkel, Miriam K.; Walker, Jerilyn A.; Ullmer, Brygg; Batzer, Mark A.; Smit, Arian F. A.; Hubley, Robert; Casola, Claudio; Schrider, Daniel R.; Hahn, Matthew W.; Quesada, Victor; Puente, Xose S.; Ordoñez, Gonzalo R.; López-Otín, Carlos; Vinar, Tomas; Brejova, Brona; Ratan, Aakrosh; Harris, Robert S.; Miller, Webb; Kosiol, Carolin; Lawson, Heather A.; Taliwal, Vikas; Martins, André L.; Siepel, Adam; RoyChoudhury, Arindam; Ma, Xin; Degenhardt, Jeremiah; Bustamante, Carlos D.; Gutenkunst, Ryan N.; Mailund, Thomas; Dutheil, Julien Y.; Hobolth, Asger; Schierup, Mikkel H.; Chemnick, Leona; Ryder, Oliver A.; Yoshinaga, Yuko; de Jong, Pieter J.; Weinstock, George M.; Rogers, Jeffrey; Mardis, Elaine R.; Gibbs, Richard A.; Wilson, Richard K.

    2011-01-01

    “Orangutan” is derived from the Malay term “man of the forest” and aptly describes the Southeast Asian great apes native to Sumatra and Borneo. The orangutan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orangutan draft genome assembly and short read sequence data from five Sumatran and five Bornean orangutan genomes. Our analyses reveal that, compared to other primates, the orangutan genome has many unique features. Structural evolution of the orangutan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe the first primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orangutan genome structure. Orangutans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400k years ago (ya), is more recent than most previous studies and underscores the complexity of the orangutan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (Ne) expanded exponentially relative to the ancestral Ne after the split, while Bornean Ne declined over the same period. Overall, the resources and analyses presented here offer new opportunities

  11. Integrating genomics into the taxonomy and systematics of the Bacteria and Archaea.

    PubMed

    Chun, Jongsik; Rainey, Fred A

    2014-02-01

    The polyphasic approach used today in the taxonomy and systematics of the Bacteria and Archaea includes the use of phenotypic, chemotaxonomic and genotypic data. The use of 16S rRNA gene sequence data has revolutionized our understanding of the microbial world and led to a rapid increase in the number of descriptions of novel taxa, especially at the species level. It has allowed in many cases for the demarcation of taxa into distinct species, but its limitations in a number of groups have resulted in the continued use of DNA-DNA hybridization. As technology has improved, next-generation sequencing (NGS) has provided a rapid and cost-effective approach to obtaining whole-genome sequences of microbial strains. Although some 12,000 bacterial or archaeal genome sequences are available for comparison, only 1725 of these are of actual type strains, limiting the use of genomic data in comparative taxonomic studies when there are nearly 11,000 type strains. Efforts to obtain complete genome sequences of all type strains are critical to the future of microbial systematics. The incorporation of genomics into the taxonomy and systematics of the Bacteria and Archaea coupled with computational advances will boost the credibility of taxonomy in the genomic era. This special issue of International Journal of Systematic and Evolutionary Microbiology contains both original research and review articles covering the use of genomic sequence data in microbial taxonomy and systematics. It includes contributions on specific taxa as well as outlines of approaches for incorporating genomics into new strain isolation to new taxon description workflows.

  12. Comparative genomics of Shiga toxin encoding bacteriophages

    PubMed Central

    2012-01-01

    Background Stx bacteriophages are responsible for driving the dissemination of Stx toxin genes (stx) across their bacterial host range. Lysogens carrying Stx phages can cause severe, life-threatening disease and Stx toxin is an integral virulence factor. The Stx-bacteriophage vB_EcoP-24B, commonly referred to as Ф24B, is capable of multiply infecting a single bacterial host cell at a high frequency, with secondary infection increasing the rate at which subsequent bacteriophage infections can occur. This is biologically unusual; therefore, determining the genomic content and context of Ф24B compared to other lambdoid Stx phages is important to understanding the factors controlling this phenomenon and determining whether they occur in other Stx phages. Results The genome of the Stx2-encoding phage Ф24B was sequenced and annotated. The genomic organisation and general features are similar to other sequenced Stx bacteriophages induced from Enterohaemorrhagic Escherichia coli (EHEC); however, Ф24B possesses significant regions of heterogeneity, with implications for phage biology and behaviour. The Ф24B genome was compared to other sequenced Stx phages and the archetypal lambdoid phage, lambda, using the Circos genome comparison tool and a PCR-based multi-loci comparison system. Conclusions The data support the hypothesis that Stx phages are mosaic, and recombination events between the host, phages and their remnants within the same infected bacterial cell will continue to drive the evolution of Stx phage variants and the subsequent dissemination of shigatoxigenic potential. PMID:22799768

  13. Genomics Encyclopedia of Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB): a resource for microsymbiont genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    SciTech Connect

    Reeve, Wayne

    2013-03-01

    Wayne Reeve of Murdoch University on "Genomics Encyclopedia of Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB): a resource for microsymbiont genomes" at the 8th Annual Genomics of Energy & Environment Meeting on March 27, 2013 in Walnut Creek, Calif.

  14. Comparative genomics of biotechnologically important yeasts

    USDA-ARS's Scientific Manuscript database

    Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the...

  15. Comparative Genomics and the Gene Complement of a Minimal Cell

    NASA Astrophysics Data System (ADS)

    Islas, Sara; Becerra, Arturo; Luisi, P. Luigi; Lazcano, Antonio

    2004-02-01

    The concept of a minimal cell is discussed from the viewpoint of comparative genomics. Analysis of published DNA content values determined for 641 different archaeal and bacterial species by pulsed field gel electrophoresis has led to a more precise definition of the genome size ranges of free-living and host-associated organisms. DNA content is not an indicator of phylogenetic position. However, the smallest genomes in our sample do not have a random distribution in rRNA-based evolutionary trees, and are found mostly in (a) the basal branches of the tree where thermophiles are located; and (b) in late clades, such as those of Gram positive bacteria. While the smallest-known genome size for an endosymbiont is only 450 kb, no free-living prokaryote has been described to have a genome <1450 kb. Estimates of the size of the minimal gene complement can provide important insights into the primary biological functions required for a sustainable, reproducing cell nowadays and throughout evolutionary times, but definitions of the minimal cell depend on specific environments.

  16. Comparative genomics of Blattabacterium cuenoti: the frozen legacy of an ancient endosymbiont genome.

    PubMed

    Patiño-Navarrete, Rafael; Moya, Andrés; Latorre, Amparo; Peretó, Juli

    2013-01-01

    Many insect species have established long-term symbiotic relationships with intracellular bacteria. Symbiosis with bacteria has provided insects with novel ecological capabilities, which have allowed them to colonize previously unexplored niches. Despite its importance to the understanding of the emergence of biological complexity, the evolution of symbiotic relationships remains hitherto a mystery in evolutionary biology. In this study, we contribute to the investigation of the evolutionary leaps enabled by mutualistic symbioses by sequencing the genome of Blattabacterium cuenoti, primary endosymbiont of the omnivorous cockroach Blatta orientalis, and one of the most ancient symbiotic associations. We perform comparative analyses between the Blattabacterium cuenoti genome and those of previously sequenced endosymbionts, namely those from the omnivorous hosts Blattella germanica (Blattelidae) and Periplaneta americana (Blattidae), and the endosymbionts harbored by two wood-feeding hosts, the subsocial cockroach Cryptocercus punctulatus (Cryptocercidae) and the termite Mastotermes darwiniensis (Termitidae). Our study shows a remarkable evolutionary stasis of this symbiotic system throughout the evolutionary history of cockroaches and the deepest branching termite M. darwiniensis, in terms of not only chromosome architecture but also gene content, as revealed by the striking conservation of the Blattabacterium core genome. Importantly, the architecture of the central metabolic network inferred from the endosymbiont genomes was established very early in Blattabacterium evolutionary history and could be an outcome of the essential role played by this endosymbiont in the host's nitrogen economy.

  17. Comparative Genomics of Blattabacterium cuenoti: The Frozen Legacy of an Ancient Endosymbiont Genome

    PubMed Central

    Patiño-Navarrete, Rafael; Moya, Andrés; Latorre, Amparo; Peretó, Juli

    2013-01-01

    Many insect species have established long-term symbiotic relationships with intracellular bacteria. Symbiosis with bacteria has provided insects with novel ecological capabilities, which have allowed them to colonize previously unexplored niches. Despite its importance to the understanding of the emergence of biological complexity, the evolution of symbiotic relationships remains hitherto a mystery in evolutionary biology. In this study, we contribute to the investigation of the evolutionary leaps enabled by mutualistic symbioses by sequencing the genome of Blattabacterium cuenoti, primary endosymbiont of the omnivorous cockroach Blatta orientalis, and one of the most ancient symbiotic associations. We perform comparative analyses between the Blattabacterium cuenoti genome and those of previously sequenced endosymbionts, namely those from the omnivorous hosts Blattella germanica (Blattelidae) and Periplaneta americana (Blattidae), and the endosymbionts harbored by two wood-feeding hosts, the subsocial cockroach Cryptocercus punctulatus (Cryptocercidae) and the termite Mastotermes darwiniensis (Termitidae). Our study shows a remarkable evolutionary stasis of this symbiotic system throughout the evolutionary history of cockroaches and the deepest branching termite M. darwiniensis, in terms of not only chromosome architecture but also gene content, as revealed by the striking conservation of the Blattabacterium core genome. Importantly, the architecture of the central metabolic network inferred from the endosymbiont genomes was established very early in Blattabacterium evolutionary history and could be an outcome of the essential role played by this endosymbiont in the host’s nitrogen economy. PMID:23355305

  18. Comparative genomics of chondrichthyan Hoxa clusters.

    PubMed

    Mulley, John F; Zhong, Ying-Fu; Holland, Peter Wh

    2009-09-02

    The chondrichthyan or cartilaginous fish (chimeras, sharks, skates and rays) occupy an important phylogenetic position as the sister group to all other jawed vertebrates and as an early lineage to diverge from the vertebrate lineage following two whole genome duplication events in vertebrate evolution. There have been few comparative genomic analyses incorporating data from chondrichthyan fish and none comparing genomic information from within the group. We have sequenced the complete Hoxa cluster of the Little Skate (Leucoraja erinacea) and compared it to the published Hoxa cluster of the Horn Shark (Heterodontus francisci) and to available data from the Elephant Shark (Callorhinchus milii) genome project. A BAC clone containing the full Little Skate Hoxa cluster was fully sequenced and assembled. Analyses of coding sequences and conserved non-coding elements reveal a strikingly high level of conservation across the cartilaginous fish, with twenty ultraconserved elements (100%,100 bp) found between Skate and Horn Shark, compared to three between human and marsupials. We have also identified novel potential non-coding RNAs in the Skate BAC clone, some of which are conserved in other species. We find that the Little Skate Hoxa cluster is remarkably similar to the previously published Horn Shark Hoxa cluster with respect to sequence identity, gene size and intergenic distance despite over 180 million years of separation between the two lineages. We suggest that the genomes of cartilaginous fish are more highly conserved than those of tetrapods or teleost fish and so are more likely to have retained ancestral non-coding elements. While useful for isolating homologous DNA, this complicates bioinformatic approaches to identify chondrichthyan-specific non-coding DNA elements.
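
    The search for ultraconserved elements can be sketched as a scan for long runs of identical columns in a pairwise alignment. The sketch reads the parenthetical "(100%,100 bp)" as 100% identity over at least 100 bp, which is an assumption, as are the function name and the gap character; it is not the authors' method.

```python
def ultraconserved_runs(aligned_a, aligned_b, min_len=100):
    """Return (start, end) alignment-column intervals, end exclusive, where
    the two aligned sequences are identical and gap-free for >= min_len columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    a_up, b_up = aligned_a.upper(), aligned_b.upper()
    runs = []
    start = None  # start column of the current identical run, if any
    for i, (a, b) in enumerate(zip(a_up, b_up)):
        if a == b and a != "-":
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(a_up) - start >= min_len:
        runs.append((start, len(a_up)))
    return runs

# Toy alignment with a small threshold so the behavior is easy to check.
runs = ultraconserved_runs("ACGTACGTAA", "ACGTTCGTAA", min_len=4)
gapped = ultraconserved_runs("AC-GT", "AC-GT", min_len=2)
```

    The toy call uses min_len=4 only to keep the example short; with the default of 100, it would report runs comparable in length to the elements counted in the abstract.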

  19. Comparative genomics of chondrichthyan Hoxa clusters

    PubMed Central

    Mulley, John F; Zhong, Ying-Fu; Holland, Peter WH

    2009-01-01

    Background The chondrichthyan or cartilaginous fish (chimeras, sharks, skates and rays) occupy an important phylogenetic position as the sister group to all other jawed vertebrates and as an early lineage to diverge from the vertebrate lineage following two whole genome duplication events in vertebrate evolution. There have been few comparative genomic analyses incorporating data from chondrichthyan fish and none comparing genomic information from within the group. We have sequenced the complete Hoxa cluster of the Little Skate (Leucoraja erinacea) and compared it to the published Hoxa cluster of the Horn Shark (Heterodontus francisci) and to available data from the Elephant Shark (Callorhinchus milii) genome project. Results A BAC clone containing the full Little Skate Hoxa cluster was fully sequenced and assembled. Analyses of coding sequences and conserved non-coding elements reveal a strikingly high level of conservation across the cartilaginous fish, with twenty ultraconserved elements (100%,100 bp) found between Skate and Horn Shark, compared to three between human and marsupials. We have also identified novel potential non-coding RNAs in the Skate BAC clone, some of which are conserved in other species. Conclusion We find that the Little Skate Hoxa cluster is remarkably similar to the previously published Horn Shark Hoxa cluster with respect to sequence identity, gene size and intergenic distance despite over 180 million years of separation between the two lineages. We suggest that the genomes of cartilaginous fish are more highly conserved than those of tetrapods or teleost fish and so are more likely to have retained ancestral non-coding elements. While useful for isolating homologous DNA, this complicates bioinformatic approaches to identify chondrichthyan-specific non-coding DNA elements. PMID:19725973

  20. Comparative genomics of Wolbachia and the bacterial species concept.

    PubMed

    Ellegaard, Kirsten Maren; Klasson, Lisa; Näslund, Kristina; Bourtzis, Kostas; Andersson, Siv G E

    2013-04-01

    The importance of host-specialization to speciation processes in obligate host-associated bacteria is well known, as is the ability of recombination to generate cohesion in bacterial populations. However, whether divergent strains of highly recombining intracellular bacteria, such as Wolbachia, can maintain their genetic distinctness when infecting the same host is not known. We first developed a protocol for the genome sequencing of uncultivable endosymbionts. Using this method, we have sequenced the complete genomes of the Wolbachia strains wHa and wNo, which occur as natural double infections in Drosophila simulans populations on the Seychelles and in New Caledonia. Taxonomically, wHa belongs to supergroup A and wNo to supergroup B. A comparative genomics study including additional strains supported the supergroup classification scheme and revealed 24 and 33 group-specific genes, putatively involved in host-adaptation processes. Recombination frequencies were high for strains of the same supergroup despite different host-preference patterns, leading to genomic cohesion. The inferred recombination fragments for strains of different supergroups were of short sizes, and the genomes of the co-infecting Wolbachia strains wHa and wNo were not more similar to each other and did not share more genes than other A- and B-group strains that infect different hosts. We conclude that Wolbachia strains of supergroup A and B represent genetically distinct clades, and that strains of different supergroups can co-exist in the same arthropod host without converging into the same species. This suggests that the supergroups are irreversibly separated and that barriers other than host-specialization are able to maintain distinct clades in recombining endosymbiont populations. Acquiring a good knowledge of the barriers to genetic exchange in Wolbachia will advance our understanding of how endosymbiont communities are constructed from vertically and horizontally transmitted genes.

  1. Comparative Genomics of Wolbachia and the Bacterial Species Concept

    PubMed Central

    Näslund, Kristina; Bourtzis, Kostas; Andersson, Siv G. E.

    2013-01-01

    The importance of host-specialization to speciation processes in obligate host-associated bacteria is well known, as is the ability of recombination to generate cohesion in bacterial populations. However, whether divergent strains of highly recombining intracellular bacteria, such as Wolbachia, can maintain their genetic distinctness when infecting the same host is not known. We first developed a protocol for the genome sequencing of uncultivable endosymbionts. Using this method, we have sequenced the complete genomes of the Wolbachia strains wHa and wNo, which occur as natural double infections in Drosophila simulans populations on the Seychelles and in New Caledonia. Taxonomically, wHa belongs to supergroup A and wNo to supergroup B. A comparative genomics study including additional strains supported the supergroup classification scheme and revealed 24 and 33 group-specific genes, putatively involved in host-adaptation processes. Recombination frequencies were high for strains of the same supergroup despite different host-preference patterns, leading to genomic cohesion. The inferred recombination fragments for strains of different supergroups were short, and the genomes of the co-infecting Wolbachia strains wHa and wNo were not more similar to each other and did not share more genes than other A- and B-group strains that infect different hosts. We conclude that Wolbachia strains of supergroups A and B represent genetically distinct clades, and that strains of different supergroups can co-exist in the same arthropod host without converging into the same species. This suggests that the supergroups are irreversibly separated and that barriers other than host-specialization are able to maintain distinct clades in recombining endosymbiont populations. Acquiring a good knowledge of the barriers to genetic exchange in Wolbachia will advance our understanding of how endosymbiont communities are constructed from vertically and horizontally transmitted genes.

  2. A Comparative Map of the Zebrafish Genome

    PubMed Central

    Woods, Ian G.; Kelly, Peter D.; Chu, Felicia; Ngo-Hazelett, Phuong; Yan, Yi-Lin; Huang, Hui; Postlethwait, John H.; Talbot, William S.

    2000-01-01

    Zebrafish mutations define the functions of hundreds of essential genes in the vertebrate genome. To accelerate the molecular analysis of zebrafish mutations and to facilitate comparisons among the genomes of zebrafish and other vertebrates, we used a homozygous diploid meiotic mapping panel to localize polymorphisms in 691 previously unmapped genes and expressed sequence tags (ESTs). Together with earlier efforts, this work raises the total number of markers scored in the mapping panel to 2119, including 1503 genes and ESTs and 616 previously characterized simple-sequence length polymorphisms. Sequence analysis of zebrafish genes mapped in this study and in prior work identified putative human orthologs for 804 zebrafish genes and ESTs. Map comparisons revealed 139 new conserved syntenies, in which two or more genes are on the same chromosome in zebrafish and human. Although some conserved syntenies are quite large, there were changes in gene order within conserved groups, apparently reflecting the relatively frequent occurrence of inversions and other intrachromosomal rearrangements since the divergence of teleost and tetrapod ancestors. Comparative mapping also shows that there is not a one-to-one correspondence between zebrafish and human chromosomes. Mapping of duplicate gene pairs identified segments of 20 linkage groups that may have arisen during a genome duplication that occurred early in the evolution of teleosts after the divergence of teleost and mammalian ancestors. This comparative map will accelerate the molecular analysis of zebrafish mutations and enhance the understanding of the evolution of the vertebrate genome. PMID:11116086
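
    The abstract defines a conserved synteny as two or more genes that lie on the same chromosome in both zebrafish and human. A minimal Python sketch of that counting rule follows. The function name, the tuple layout, and the toy gene and chromosome labels are all hypothetical and are not taken from the paper's mapping data.

```python
from collections import defaultdict


def conserved_syntenies(orthologs):
    """Group mapped ortholog pairs by (zebrafish linkage group, human chromosome).

    `orthologs` is an iterable of (gene, zebrafish_lg, human_chr) tuples.
    A pair of chromosomes counts as a conserved synteny when it is shared
    by two or more genes, following the definition in the abstract.
    """
    groups = defaultdict(set)
    for gene, zebrafish_lg, human_chr in orthologs:
        groups[(zebrafish_lg, human_chr)].add(gene)
    return {pair: genes for pair, genes in groups.items() if len(genes) >= 2}


# Hypothetical toy data, for illustration only.
toy = [
    ("geneA", "LG1", "Hsa4"),
    ("geneB", "LG1", "Hsa4"),
    ("geneC", "LG1", "Hsa2"),   # a single gene: not a conserved synteny
    ("geneD", "LG7", "Hsa11"),
    ("geneE", "LG7", "Hsa11"),
    ("geneF", "LG7", "Hsa11"),
]
syntenies = conserved_syntenies(toy)
```

    The sketch ignores gene order inside each group, so it cannot show the intrachromosomal rearrangements the abstract mentions. Detecting those would need map positions as well.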

  3. African relapsing Fever borreliae genomospecies revealed by comparative genomics.

    PubMed

    Elbir, Haitham; Abi-Rached, Laurent; Pontarotti, Pierre; Yoosuf, Niyaz; Drancourt, Michel

    2014-01-01

    Relapsing fever borreliae are vector-borne bacteria responsible for febrile infection in humans in North America, Africa, Asia, and in the Iberian Peninsula in Europe. Relapsing fever borreliae are phylogenetically closely related, yet they differ in pathogenicity and vectors. Their long-term taxonomy, based on geography and vector grouping, needs to be reappraised in a genomic context. We therefore embarked on genomic analyses of relapsing fever borreliae, focusing on species found in Africa. Genome-wide phylogenetic analyses group Old World Borrelia crocidurae, Borrelia hispanica, B. duttonii, and B. recurrentis in one clade, and New World Borrelia turicatae and Borrelia hermsii in a second clade. Accordingly, average nucleotide identity is 99% among B. duttonii, B. recurrentis, and B. crocidurae and 96% between the latter borreliae and B. hispanica, while the similarity is 86% between Old World and New World borreliae. Comparative genomics indicates that the Old World relapsing fever B. duttonii, B. recurrentis, B. crocidurae, and B. hispanica have a 2,514-gene pan genome and a 933-gene core genome that includes 788 chromosomal and 145 plasmidic genes. Analyzing the role that natural selection has played in the evolution of Old World borreliae species revealed that 55 loci were under positive diversifying selection, including loci coding for membrane, flagellar, and chemotaxis proteins, three categories associated with adaptation to specific niches. Genomic analyses led to a reappraisal of the taxonomy of relapsing fever borreliae in Africa. These analyses suggest that B. crocidurae, B. duttonii, and B. recurrentis are ecotypes of a unique genomospecies, while B. hispanica is a distinct species.

  4. African Relapsing Fever Borreliae Genomospecies Revealed by Comparative Genomics

    PubMed Central

    Elbir, Haitham; Abi-Rached, Laurent; Pontarotti, Pierre; Yoosuf, Niyaz; Drancourt, Michel

    2014-01-01

    Background: Relapsing fever borreliae are vector-borne bacteria responsible for febrile infection in humans in North America, Africa, Asia, and in the Iberian Peninsula in Europe. Relapsing fever borreliae are phylogenetically closely related, yet they differ in pathogenicity and vectors. Their long-term taxonomy, based on geography and vector grouping, needs to be reappraised in a genomic context. We therefore embarked on genomic analyses of relapsing fever borreliae, focusing on species found in Africa. Results: Genome-wide phylogenetic analyses group Old World Borrelia crocidurae, Borrelia hispanica, B. duttonii, and B. recurrentis in one clade, and New World Borrelia turicatae and Borrelia hermsii in a second clade. Accordingly, average nucleotide identity is 99% among B. duttonii, B. recurrentis, and B. crocidurae and 96% between the latter borreliae and B. hispanica, while the similarity is 86% between Old World and New World borreliae. Comparative genomics indicates that the Old World relapsing fever B. duttonii, B. recurrentis, B. crocidurae, and B. hispanica have a 2,514-gene pan genome and a 933-gene core genome that includes 788 chromosomal and 145 plasmidic genes. Analyzing the role that natural selection has played in the evolution of Old World borreliae species revealed that 55 loci were under positive diversifying selection, including loci coding for membrane, flagellar, and chemotaxis proteins, three categories associated with adaptation to specific niches. Conclusion: Genomic analyses led to a reappraisal of the taxonomy of relapsing fever borreliae in Africa. These analyses suggest that B. crocidurae, B. duttonii, and B. recurrentis are ecotypes of a unique genomospecies, while B. hispanica is a distinct species. PMID:25229054
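
    The pan genome and core genome figures quoted above are, in the usual sense of the terms, the union and the intersection of the gene families found in each genome. A minimal Python sketch of that set arithmetic follows. The function name, the strain labels, and the gene-family identifiers are hypothetical, and the sketch assumes orthologous genes have already been clustered into families, which is the hard part of a real analysis.

```python
def pan_and_core(genomes):
    """Return (pan genome, core genome) for a dict of genome -> gene families.

    The pan genome is every family seen in at least one genome; the core
    genome is every family present in all genomes.
    """
    if not genomes:
        raise ValueError("need at least one genome")
    gene_sets = [set(families) for families in genomes.values()]
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


# Hypothetical toy data: family IDs are placeholders, not real annotations.
toy_genomes = {
    "strain_1": {"fam1", "fam2", "fam3"},
    "strain_2": {"fam1", "fam2", "fam4"},
    "strain_3": {"fam1", "fam2", "fam3", "fam5"},
}
pan, core = pan_and_core(toy_genomes)
accessory = pan - core  # families missing from at least one genome
```

    In this toy case the pan genome has five families and the core genome two. The 2,514 and 933 reported in the abstract are the analogous counts for the four Old World species.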

  5. Whole-genome sequencing for comparative genomics and de novo genome assembly.

    PubMed

    Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

    2015-01-01

    Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular, this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing, however, generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).

  6. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  7. Comparative genome analysis of Basidiomycete fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  8. Comparative genomics of biotechnologically important yeasts

    PubMed Central

    Riley, Robert; Haridas, Sajeet; Wolfe, Kenneth H.; Lopes, Mariana R.; Hittinger, Chris Todd; Göker, Markus; Salamov, Asaf A.; Wisecaver, Jennifer H.; Long, Tanya M.; Aerts, Andrea L.; Barry, Kerrie W.; Choi, Cindy; Clum, Alicia; Coughlan, Aisling Y.; Deshpande, Shweta; Douglass, Alexander P.; Hanson, Sara J.; Klenk, Hans-Peter; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lipzen, Anna M.; Meier-Kolthoff, Jan P.; Ohm, Robin A.; Otillar, Robert P.; Pangilinan, Jasmyn L.; Peng, Yi; Rosa, Carlos A.; Scheuner, Carmen; Sibirny, Andriy A.; Slot, Jason C.; Stielow, J. Benjamin; Sun, Hui; Kurtzman, Cletus P.; Blackwell, Meredith; Grigoriev, Igor V.

    2016-01-01

    Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the clade sister to the known CUG-Ser clade. Our well-resolved yeast phylogeny shows that some traits, such as methylotrophy, are restricted to single clades, whereas others, such as l-rhamnose utilization, have patchy phylogenetic distributions. Gene clusters, with variable organization and distribution, encode many pathways of interest. Genomics can predict some biochemical traits precisely, but the genomic basis of others, such as xylose utilization, remains unresolved. Our data also provide insight into early evolution of ascomycetes. We document the loss of H3K9me2/3 heterochromatin, the origin of ascomycete mating-type switching, and panascomycete synteny at the MAT locus. These data and analyses will facilitate the engineering of efficient biosynthetic and degradative pathways and gateways for genomic manipulation. PMID:27535936

  9. Comparative genomics reveals evidence of marine adaptation in Salinispora species.

    PubMed

    Penn, Kevin; Jensen, Paul R

    2012-03-08

    Actinobacteria represent a consistent component of most marine bacterial communities yet little is known about the mechanisms by which these Gram-positive bacteria adapt to life in the marine environment. Here we employed a phylogenomic approach to identify marine adaptation genes in marine Actinobacteria. The focus was on the obligate marine actinomycete genus Salinispora and the identification of marine adaptation genes that have been acquired from other marine bacteria. Functional annotation, comparative genomics, and evidence of a shared evolutionary history with bacteria from hyperosmotic environments were used to identify a pool of more than 50 marine adaptation genes. An Actinobacterial species tree was used to infer the likelihood of gene gain or loss in accounting for the distribution of each gene. Acquired marine adaptation genes were associated with electron transport, sodium and ABC transporters, and channels and pores. In addition, the loss of a mechanosensitive channel gene appears to have played a major role in the inability of Salinispora strains to grow following transfer to low osmotic strength media. The marine Actinobacteria for which genome sequences are available are broadly distributed throughout the Actinobacterial phylogenetic tree and closely related to non-marine forms suggesting they have been independently introduced relatively recently into the marine environment. It appears that the acquisition of transporters in Salinispora spp. represents a major marine adaptation while gene loss is proposed to play a role in the inability of this genus to survive outside of the marine environment. This study reveals fundamental differences between marine adaptations in Gram-positive and Gram-negative bacteria and no common genetic basis for marine adaptation among the Actinobacteria analyzed.

  10. Comparative genomics reveals evidence of marine adaptation in Salinispora species

    PubMed Central

    2012-01-01

    Background: Actinobacteria represent a consistent component of most marine bacterial communities yet little is known about the mechanisms by which these Gram-positive bacteria adapt to life in the marine environment. Here we employed a phylogenomic approach to identify marine adaptation genes in marine Actinobacteria. The focus was on the obligate marine actinomycete genus Salinispora and the identification of marine adaptation genes that have been acquired from other marine bacteria. Results: Functional annotation, comparative genomics, and evidence of a shared evolutionary history with bacteria from hyperosmotic environments were used to identify a pool of more than 50 marine adaptation genes. An Actinobacterial species tree was used to infer the likelihood of gene gain or loss in accounting for the distribution of each gene. Acquired marine adaptation genes were associated with electron transport, sodium and ABC transporters, and channels and pores. In addition, the loss of a mechanosensitive channel gene appears to have played a major role in the inability of Salinispora strains to grow following transfer to low osmotic strength media. Conclusions: The marine Actinobacteria for which genome sequences are available are broadly distributed throughout the Actinobacterial phylogenetic tree and closely related to non-marine forms suggesting they have been independently introduced relatively recently into the marine environment. It appears that the acquisition of transporters in Salinispora spp. represents a major marine adaptation while gene loss is proposed to play a role in the inability of this genus to survive outside of the marine environment. This study reveals fundamental differences between marine adaptations in Gram-positive and Gram-negative bacteria and no common genetic basis for marine adaptation among the Actinobacteria analyzed. PMID:22401625

  11. COMPARISON OF COMPARATIVE GENOMIC HYBRIDIZATIONS TECHNOLOGIES ACROSS MICROARRAY PLATFORMS

    EPA Science Inventory

    Comparative Genomic Hybridization (CGH) measures DNA copy number differences between a reference genome and a test genome. The DNA samples are differentially labeled and hybridized to an immobilized substrate. In early CGH experiments, the DNA targets were hybridized to metaphase...

  13. Ecology of marine Bacteroidetes: a comparative genomics approach

    PubMed Central

    Fernández-Gómez, Beatriz; Richter, Michael; Schüler, Margarete; Pinhassi, Jarone; Acinas, Silvia G; González, José M; Pedrós-Alió, Carlos

    2013-01-01

    Bacteroidetes are commonly assumed to be specialized in degrading high molecular weight (HMW) compounds and to have a preference for growth attached to particles, surfaces or algal cells. The first sequenced genomes of marine Bacteroidetes seemed to confirm this assumption. Many more genomes have been sequenced recently. Here, a comparative analysis of marine Bacteroidetes genomes revealed a life strategy different from those of other important phyla of marine bacterioplankton such as Cyanobacteria and Proteobacteria. Bacteroidetes have many adaptations to grow attached to particles, have the capacity to degrade polymers, including a large number of peptidases, glycoside hydrolases (GHs), glycosyl transferases, adhesion proteins, as well as the genes for gliding motility. Several of the polymer degradation genes are located in close association with genes for TonB-dependent receptors and transducers, suggesting an integrated regulation of adhesion and degradation of polymers. This confirmed the role of this abundant group of marine bacteria as degraders of particulate matter. Marine Bacteroidetes had a significantly larger number of proteases than GHs, while non-marine Bacteroidetes had equal numbers of both. Proteorhodopsin containing Bacteroidetes shared two characteristics: small genome size and a higher number of genes involved in CO2 fixation per Mb. The latter may be important in order to survive when floating freely in the illuminated, but nutrient-poor, ocean surface. PMID:23303374

  14. Image analysis in comparative genomic hybridization

    SciTech Connect

    Lundsteen, C.; Maahr, J.; Christensen, B.

    1995-01-01

    Comparative genomic hybridization (CGH) is a new technique by which genomic imbalances can be detected by combining in situ suppression hybridization of whole genomic DNA and image analysis. We have developed software for rapid, quantitative CGH image analysis by a modification and extension of the standard software used for routine karyotyping of G-banded metaphase spreads in the Magiscan chromosome analysis system. The DAPI-counterstained metaphase spread is karyotyped interactively. Corrections for image shifts between the DAPI, FITC, and TRITC images are done manually by moving the three images relative to each other. The fluorescence background is subtracted. A mean filter is applied to smooth the FITC and TRITC images before the fluorescence ratio between the individual FITC and TRITC-stained chromosomes is computed pixel by pixel inside the area of the chromosomes determined by the DAPI boundaries. Fluorescence intensity ratio profiles are generated, and peaks and valleys indicating possible gains and losses of test DNA are marked if the ratio rises above 1.25 or falls below 0.75. By combining the analysis of several metaphase spreads, consistent findings of gains and losses in all or almost all spreads indicate chromosomal imbalance. Chromosomal imbalances are detected either by visual inspection of fluorescence ratio (FR) profiles or by a statistical approach that compares FR measurements of the individual case with measurements of normal chromosomes. The complete analysis of one metaphase can be carried out in approximately 10 minutes. 8 refs., 7 figs., 1 tab.
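
    The processing order described above (background subtraction, mean filtering, ratio computation, thresholding at 0.75 and 1.25) can be sketched on a one-dimensional intensity profile. This is a simplified illustration in Python, not the Magiscan software. The function names, the window width of 3, and the toy intensities are assumptions, as is the choice of which channel carries the test DNA, which the abstract does not state.

```python
GAIN_THRESHOLD = 1.25  # ratio above this marks a possible gain (from the abstract)
LOSS_THRESHOLD = 0.75  # ratio below this marks a possible loss (from the abstract)


def mean_filter(values, width=3):
    """Smooth a profile with a moving average; the window shrinks at the ends."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def classify_profile(test, reference, background=0.0):
    """Label each position along a chromosome as 'gain', 'loss', or 'balanced'."""
    test_s = mean_filter([v - background for v in test])
    ref_s = mean_filter([v - background for v in reference])
    calls = []
    for t, r in zip(test_s, ref_s):
        if r <= 0:
            calls.append("undefined")  # no usable reference signal here
            continue
        ratio = t / r
        if ratio > GAIN_THRESHOLD:
            calls.append("gain")
        elif ratio < LOSS_THRESHOLD:
            calls.append("loss")
        else:
            calls.append("balanced")
    return calls


# Hypothetical toy intensities along one chromosome axis.
test_channel = [100, 100, 100, 200, 200, 200, 50, 50, 50]
reference_channel = [100] * 9
calls = classify_profile(test_channel, reference_channel)
```

    Smoothing blurs the edges of each segment, so positions next to a breakpoint can be called differently from the raw values. This is one reason the abstract relies on consistent findings across several metaphase spreads.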

  15. Comparative genomics and evolution of eukaryotic phospholipid biosynthesis

    SciTech Connect

    Lykidis, Athanasios

    2006-12-01

    Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage-specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage-specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.

  16. Comparative genome map of human and cattle

    SciTech Connect

    Solinas-Toldo, S.; Fries, R.; Lengauer, C.

    1995-06-10

    Chromosomal homologies between individual human chromosomes and the bovine karyotype have been established by using a new approach termed Zoo-FISH. Labeled DNA libraries from flow-sorted human chromosomes were used as probes for fluorescence in situ hybridization on cattle chromosomes. All human DNA libraries, except the Y chromosome library, hybridized to one or more cattle chromosomes, identifying and delineating 50 segments of homology, most of them corresponding to the regions of homology as identified by the previous mapping of individual conserved loci. However, Zoo-FISH refines the comparative maps constructed by molecular gene mapping of individual loci by providing information on the boundaries of conserved regions in the absence of obvious cytogenetic homologies of human and bovine chromosomes. It allows study of karyotypic evolution and opens new avenues for genomic analysis by facilitating the extrapolation of results from the human genome initiative. 50 refs., 3 figs., 1 tab.

  17. Multiple Genome Sequences of Important Beer-Spoiling Lactic Acid Bacteria.

    PubMed

    Geissler, Andreas J; Behr, Jürgen; Vogel, Rudi F

    2016-10-06

    Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability.

  18. Multiple Genome Sequences of Important Beer-Spoiling Lactic Acid Bacteria

    PubMed Central

    Geissler, Andreas J.; Vogel, Rudi F.

    2016-01-01

    Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability. PMID:27795248

  19. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis

    PubMed Central

    Bengelsdorf, Frank R.; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood–Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains were analyzed in a comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) between C. ljungdahlii and both C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of the thlA promoter (PthlA) from C. acetobutylicum or the native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  20. Comparative proteogenomics: Combining mass spectrometry and comparative genomics to analyze multiple genomes

    PubMed Central

    Gupta, Nitin; Benhamida, Jamal; Bhargava, Vipul; Goodman, Daniel; Kain, Elisabeth; Kerman, Ian; Nguyen, Ngan; Ollikainen, Noah; Rodriguez, Jesse; Wang, Jian; Lipton, Mary S.; Romine, Margaret; Bafna, Vineet; Smith, Richard D.; Pevzner, Pavel A.

    2008-01-01

    Recent proliferation of low-cost DNA sequencing techniques will soon lead to an explosive growth in the number of sequenced genomes and will turn manual annotations into a luxury. Mass spectrometry recently emerged as a valuable technique for proteogenomic annotations that improves on the state-of-the-art in predicting genes and other features. However, previous proteogenomic approaches were limited to a single genome and did not take advantage of analyzing mass spectrometry data from multiple genomes at once. We show that such a comparative proteogenomics approach (like comparative genomics) allows one to address the problems that remained beyond the reach of the traditional “single proteome” approach in mass spectrometry. In particular, we show how comparative proteogenomics addresses the notoriously difficult problem of “one-hit-wonders” in proteomics, improves on the existing gene prediction tools in genomics, and allows identification of rare post-translational modifications. We therefore argue that complementing DNA sequencing projects by comparative proteogenomics projects can be a viable approach to improve both genomic and proteomic annotations. PMID:18426904

  1. Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes.

    PubMed

    Gupta, Nitin; Benhamida, Jamal; Bhargava, Vipul; Goodman, Daniel; Kain, Elisabeth; Kerman, Ian; Nguyen, Ngan; Ollikainen, Noah; Rodriguez, Jesse; Wang, Jian; Lipton, Mary S; Romine, Margaret; Bafna, Vineet; Smith, Richard D; Pevzner, Pavel A

    2008-07-01

    Recent proliferation of low-cost DNA sequencing techniques will soon lead to an explosive growth in the number of sequenced genomes and will turn manual annotations into a luxury. Mass spectrometry recently emerged as a valuable technique for proteogenomic annotations that improves on the state-of-the-art in predicting genes and other features. However, previous proteogenomic approaches were limited to a single genome and did not take advantage of analyzing mass spectrometry data from multiple genomes at once. We show that such a comparative proteogenomics approach (like comparative genomics) allows one to address the problems that remained beyond the reach of the traditional "single proteome" approach in mass spectrometry. In particular, we show how comparative proteogenomics addresses the notoriously difficult problem of "one-hit-wonders" in proteomics, improves on the existing gene prediction tools in genomics, and allows identification of rare post-translational modifications. We therefore argue that complementing DNA sequencing projects by comparative proteogenomics projects can be a viable approach to improve both genomic and proteomic annotations.

  2. A process for analysis of microarray comparative genomics hybridisation studies for bacterial genomes

    PubMed Central

    Carter, Ben; Wu, Guanghui; Woodward, Martin J; Anjum, Muna F

    2008-01-01

    Background Microarray based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although there are some programmes available for interpretation of bacterial transcriptomics data and CGH microarray data for looking at genetic stability in oncogenes, there are none specifically to understand the mosaic nature of bacterial genomes. Consequently a bottleneck still persists in accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process that may be automated in the future to understand bacterial genomic diversity. Results The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test strains against three reference strains simultaneously. Each stage of the process is described and we have compared a number of methods available for characterising bacterial genomic diversity, for calculating the cut-off between gene presence and absence or divergence, and shown that a simple dynamic approach using a kernel density estimator performed better than both established methods and a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes. PMID:18230148
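
    The abstract above mentions using a kernel density estimator to set the cut-off between gene presence and absence or divergence. A minimal sketch of that general idea follows; it is not the authors' published procedure. The function names, the Gaussian kernel, the fixed bandwidth and the "lowest density between the two main peaks" rule are all assumptions made for illustration, and the log-ratio values are invented toy data.

```python
import math


def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density of `values` at a point x."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


def presence_cutoff(log_ratios, bandwidth=0.3, grid_points=200):
    """Hypothetical rule: cut at the density minimum between the two main peaks."""
    density = gaussian_kde(log_ratios, bandwidth)
    lo, hi = min(log_ratios), max(log_ratios)
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]
    dens = [density(x) for x in grid]

    # Interior local maxima of the estimated density.
    peaks = [
        i
        for i in range(1, grid_points - 1)
        if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("density is not bimodal at this bandwidth")

    # Keep the two tallest peaks, then order them left to right.
    left, right = sorted(sorted(peaks, key=lambda i: dens[i], reverse=True)[:2])
    valley = min(range(left, right + 1), key=lambda i: dens[i])
    return grid[valley]


def call_genes(log_ratio_by_gene, cutoff):
    """Label each gene by comparing its log-ratio with the cut-off."""
    return {
        gene: "present" if ratio >= cutoff else "absent_or_divergent"
        for gene, ratio in log_ratio_by_gene.items()
    }


# Toy test/reference log-ratios (invented): one group near 0, one near -3.
toy_ratios = {
    "geneA": 0.10, "geneB": -0.20, "geneC": 0.05, "geneD": -2.90,
    "geneE": -3.20, "geneF": 0.30, "geneG": -3.00, "geneH": -0.10,
}
cutoff = presence_cutoff(list(toy_ratios.values()))
calls = call_genes(toy_ratios, cutoff)
```

    Because the cut-off is read from the data of each hybridisation rather than fixed in advance, it moves with the signal distribution, which is one plausible reading of the "dynamic" approach the abstract describes.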

  3. Genome sequence of the β-rhizobium Cupriavidus taiwanensis and comparative genomics of rhizobia

    PubMed Central

    Amadou, Claire; Pascal, Géraldine; Mangenot, Sophie; Glew, Michelle; Bontemps, Cyril; Capela, Delphine; Carrère, Sébastien; Cruveiller, Stéphane; Dossat, Carole; Lajus, Aurélie; Marchetti, Marta; Poinsot, Véréna; Rouy, Zoé; Servin, Bertrand; Saad, Maged; Schenowitz, Chantal; Barbe, Valérie; Batut, Jacques; Médigue, Claudine; Masson-Boivin, Catherine

    2008-01-01

    We report the first complete genome sequence of a β-proteobacterial nitrogen-fixing symbiont of legumes, Cupriavidus taiwanensis LMG19424. The genome consists of two chromosomes of size 3.42 Mb and 2.50 Mb, and a large symbiotic plasmid of 0.56 Mb. The C. taiwanensis genome displays an unexpectedly high similarity with the genome of the saprophytic bacterium C. eutrophus H16, despite being 0.94 Mb smaller. Both organisms harbor two chromosomes with large regions of synteny interspersed by specific regions. In contrast, the two species host highly divergent plasmids, with the consequence that C. taiwanensis is symbiotically proficient and less metabolically versatile. Altogether, specific regions in C. taiwanensis compared with C. eutrophus cover 1.02 Mb and are enriched in genes associated with symbiosis or virulence in other bacteria. C. taiwanensis reveals characteristics of a minimal rhizobium, including the most compact (35-kb) symbiotic island (nod and nif) identified so far in any rhizobium. The atypical phylogenetic position of C. taiwanensis allowed insightful comparative genomics of all available rhizobium genomes. We did not find any gene that was both common and specific to all rhizobia, thus suggesting that a unique shared genetic strategy does not support symbiosis of rhizobia with legumes. Instead, phylodistribution analysis of more than 200 Sinorhizobium meliloti known symbiotic genes indicated large and complex variations of their occurrence in rhizobia and non-rhizobia. This led us to devise an in silico method to extract genes preferentially associated with rhizobia. We discuss how the novel genes we have identified may contribute to symbiotic adaptation. PMID:18490699

  4. MGcV: the microbial genomic context viewer for comparative genome analysis

    PubMed Central

    2013-01-01

    Background Conserved gene context is used in many types of comparative genome analyses. It is used to provide leads on gene function, to guide the discovery of regulatory sequences, but also to aid in the reconstruction of metabolic networks. We present the Microbial Genomic context Viewer (MGcV), an interactive, web-based application tailored to strengthen the practice of manual comparative genome context analysis for bacteria. Results MGcV is a versatile, easy-to-use tool that renders a visualization of the genomic context of any set of selected genes, genes within a phylogenetic tree, genomic segments, or regulatory elements. It is tailored to facilitate laborious tasks such as the interactive annotation of gene function, the discovery of regulatory elements, or the sequence-based reconstruction of gene regulatory networks. We illustrate that MGcV can be used in gene function annotation by visually integrating information on prokaryotic genes, like their annotation as available from NCBI with other annotation data such as Pfam domains, sub-cellular location predictions and gene-sequence characteristics such as GC content. We also illustrate the usefulness of the interactive features that allow the graphical selection of genes to facilitate data gathering (e.g. upstream regions, IDs or annotation), in the analysis and reconstruction of transcription regulation. Moreover, putative regulatory elements and their corresponding scores or data from RNA-seq and microarray experiments can be uploaded, visualized and interpreted in (ranked-) comparative context maps. The ranked maps allow the interpretation of predicted regulatory elements and experimental data in light of each other. Conclusion MGcV advances the manual comparative analysis of genes and regulatory elements by providing fast and flexible integration of gene related data combined with straightforward data retrieval. MGcV is available at http://mgcv.cmbi.ru.nl. PMID:23547764

  5. Genome transplantation in bacteria: changing one species to another.

    PubMed

    Lartigue, Carole; Glass, John I; Alperovich, Nina; Pieper, Rembert; Parmar, Prashanth P; Hutchison, Clyde A; Smith, Hamilton O; Venter, J Craig

    2007-08-03

    As a step toward propagation of synthetic genomes, we completely replaced the genome of a bacterial cell with one from another species by transplanting a whole genome as naked DNA. Intact genomic DNA from Mycoplasma mycoides large colony (LC), virtually free of protein, was transplanted into Mycoplasma capricolum cells by polyethylene glycol-mediated transformation. Cells selected for tetracycline resistance, carried by the M. mycoides LC chromosome, contain the complete donor genome and are free of detectable recipient genomic sequences. These cells that result from genome transplantation are phenotypically identical to the M. mycoides LC donor strain as judged by several criteria.

  6. Comparative genomics of parasitic silkworm microsporidia reveal an association between genome expansion and host adaptation

    PubMed Central

    2013-01-01

    Background Microsporidian Nosema bombycis has received much attention because the pébrine disease of domesticated silkworms results in great economic losses in the silkworm industry. So far, no effective treatment has been found for pébrine. Compared to other known Nosema parasites, N. bombycis can unusually parasitize a broad range of hosts. To gain some insights into the underlying genetic mechanism of pathological ability and host range expansion in this parasite, a comparative genomic approach was conducted. The genomes of two Nosema parasites, N. bombycis and N. antheraeae (an obligatory parasite to undomesticated silkworms Antheraea pernyi), were sequenced and compared with their distantly related species, N. ceranae (an obligatory parasite to honey bees). Results Our comparative genomics analysis shows that the N. bombycis genome has greatly expanded due to the following three molecular mechanisms: 1) the proliferation of host-derived transposable elements, 2) the acquisition of many horizontally transferred genes from bacteria, and 3) the production of abundant gene duplications. To our knowledge, duplicated genes derived not only from small-scale events (e.g., tandem duplications) but also from large-scale events (e.g., segmental duplications) have never been seen so abundant in any reported microsporidia genomes. Our relative dating analysis further indicated that these duplication events have arisen recently over very short evolutionary time. Furthermore, several duplicated genes involved in the cytotoxic metabolic pathway were found to undergo positive selection, suggestive of the role of duplicated genes on the adaptive evolution of pathogenic ability. Conclusions Genome expansion is rarely considered as the evolutionary outcome acting on those highly reduced and compact parasitic microsporidian genomes. This study, for the first time, demonstrates that the parasitic genomes can expand, instead of shrink, through several common molecular mechanisms

  7. The proteolytic system of lactic acid bacteria revisited: a genomic comparison

    PubMed Central

    2010-01-01

    Background Lactic acid bacteria (LAB) are a group of gram-positive, lactic acid producing Firmicutes. They have been extensively used in food fermentations, including the production of various dairy products. The proteolytic system of LAB converts proteins to peptides and then to amino acids, which is essential for bacterial growth and also contributes significantly to flavor compounds as end-products. Recent developments in high-throughput genome sequencing and comparative genomics hybridization arrays provide us with opportunities to explore the diversity of the proteolytic system in various LAB strains. Results We performed a genome-wide comparative genomics analysis of proteolytic system components, including cell-wall bound proteinase, peptide transporters and peptidases, in 22 sequenced LAB strains. The peptidase families PepP/PepQ/PepM, PepD and PepI/PepR/PepL are described as examples of our in silico approach to refine the distinction of subfamilies with different enzymatic activities. Comparison of protein 3D structures of proline peptidases PepI/PepR/PepL and esterase A allowed identification of a conserved core structure, which was then used to improve phylogenetic analysis and functional annotation within this protein superfamily. The diversity of proteolytic system components in 39 Lactococcus lactis strains was explored using pangenome comparative genome hybridization analysis. Variations were observed in the proteinase PrtP and its maturation protein PrtM, in one of the Opp transport systems and in several peptidases between strains from different Lactococcus subspecies or from different origin. Conclusions The improved functional annotation of the proteolytic system components provides an excellent framework for future experimental validations of predicted enzymatic activities. The genome sequence data can be coupled to other "omics" data e.g. transcriptomics and metabolomics for prediction of proteolytic and flavor-forming potential of LAB strains

  8. Complete Genome Sequence and Comparative Genomics of Shigella flexneri Serotype 2a Strain 2457T

    PubMed Central

    Wei, J.; Goldberg, M. B.; Burland, V.; Venkatesan, M. M.; Deng, W.; Fournier, G.; Mayhew, G. F.; Plunkett, G.; Rose, D. J.; Darling, A.; Mau, B.; Perna, N. T.; Payne, S. M.; Runyen-Janecky, L. J.; Zhou, S.; Schwartz, D. C.; Blattner, F. R.

    2003-01-01

    We determined the complete genome sequence of Shigella flexneri serotype 2a strain 2457T (4,599,354 bp). Shigella species cause >1 million deaths per year from dysentery and diarrhea and have a lifestyle that is markedly different from those of closely related bacteria, including Escherichia coli. The genome exhibits the backbone and island mosaic structure of E. coli pathogens, albeit with much less horizontally transferred DNA and lacking 357 genes present in E. coli. The strain is distinctive in its large complement of insertion sequences, with several genomic rearrangements mediated by insertion sequences, 12 cryptic prophages, 372 pseudogenes, and 195 S. flexneri-specific genes. The 2457T genome was also compared with that of a recently sequenced S. flexneri 2a strain, 301. Our data are consistent with Shigella being phylogenetically indistinguishable from E. coli. The S. flexneri-specific regions contain many genes that could encode proteins with roles in virulence. Analysis of these will reveal the genetic basis for aspects of this pathogenic organism's distinctive lifestyle that have yet to be explained. PMID:12704152

  9. Comparative genomics approaches to study organism similarities and differences

    SciTech Connect

    Wei, Liping; Liu, Yueyi; Dubchak, Inna; Shon, John; Park, John

    2002-06-01

    Comparative genomics is a large-scale, holistic approach that compares two or more genomes to discover the similarities and differences between the genomes and to study the biology of the individual genomes. Comparative studies can be performed at different levels of the genomes to obtain multiple perspectives about the organisms. We discuss in detail the type of analyses that offer significant biological insights in the comparisons of (1) genome structure including overall genome statistics, repeats, genome rearrangement at both DNA and gene level, synteny, and breakpoints; (2) coding regions including gene content, protein content, orthologs, and paralogs; and (3) noncoding regions including the prediction of regulatory elements. We also briefly review the currently available computational tools in comparative genomics such as algorithms for genome-scale sequence alignment, gene identification, and nonhomology-based function prediction.

  10. [Sotos syndrome diagnosed by comparative genomic hybridisation].

    PubMed

    Saldarriaga, Wilmar; Molina-Barrera, Laura Camila; Ramírez-Cheyne, Julián

    2016-01-01

    Sotos Syndrome (SS) is a genetic disease with an autosomal dominant pattern caused by haploinsufficiency of the NSD1 gene secondary to point mutations or microdeletion of the 5q35 locus where the gene is located. It is a rare syndrome, occurring in 7 out of every 100,000 births. The objective of this report is to present the case of a 4-year-old patient with a global developmental delay, as well as specific physical findings suggesting a syndrome of genetic origin. The patient was female, 4 years of age, with thinning hair, triangular facies, long palpebral fissures, arched palate, prominent jaw, winged scapula and clinodactyly of the fifth finger of both hands. Comparative genomic hybridisation by microarray was subsequently performed, showing a microdeletion of 2.082 Mb in the 5q35.2-q35.3 region, including the NSD1 gene. Finally, this article also proposes performing comparative genomic hybridisation as the first diagnostic option in cases where clinical findings are suggestive of SS. Copyright © 2015 Sociedad Chilena de Pediatría. Publicado por Elsevier España, S.L.U. All rights reserved.

  11. Comparative Genomics of Ethanolamine Utilization

    PubMed Central

    Tsoy, Olga; Ravcheev, Dmitry; Mushegian, Arcady

    2009-01-01

    Ethanolamine can be used as a source of carbon and nitrogen by phylogenetically diverse bacteria. Ethanolamine-ammonia lyase, the enzyme that breaks ethanolamine into acetaldehyde and ammonia, is encoded by the gene tandem eutBC. Despite extensive studies of ethanolamine utilization in Salmonella enterica serovar Typhimurium, much remains to be learned about EutBC structure and catalytic mechanism, about the evolutionary origin of ethanolamine utilization, and about regulatory links between the metabolism of ethanolamine itself and the ethanolamine-ammonia lyase cofactor adenosylcobalamin. We used computational analysis of sequences, structures, genome contexts, and phylogenies of ethanolamine-ammonia lyases to address these questions and to evaluate recent data-mining studies that have suggested an association between bacterial food poisoning and the diol utilization pathways. We found that EutBC evolution included recruitment of a TIM barrel and a Rossmann fold domain and their fusion to N-terminal α-helical domains to give EutB and EutC, respectively. This fusion was followed by recruitment and occasional loss of auxiliary ethanolamine utilization genes in Firmicutes and by several horizontal transfers, most notably from the firmicute stem to the Enterobacteriaceae and from Alphaproteobacteria to Actinobacteria. We identified a conserved DNA motif that likely represents the EutR-binding site and is shared by the ethanolamine and cobalamin operons in several enterobacterial species, suggesting a mechanism for coupling the biosyntheses of apoenzyme and cofactor in these species. Finally, we found that the food poisoning phenotype is associated with the structural components of metabolosome more strongly than with ethanolamine utilization genes or with paralogous propanediol utilization genes per se. PMID:19783625

  12. High-Density Transcriptional Initiation Signals Underline Genomic Islands in Bacteria

    PubMed Central

    Huang, Qianli; Cheng, Xuanjin; Cheung, Man Kit; Kiselev, Sergey S.; Ozoline, Olga N.; Kwan, Hoi Shan

    2012-01-01

    Genomic islands (GIs), frequently associated with the pathogenicity of bacteria and having a substantial influence on bacterial evolution, are groups of “alien” elements which probably undergo special temporal–spatial regulation in the host genome. Are there particular hallmark transcriptional signals for these “exotic” regions? We here explore the potential transcriptional signals that underline the GIs beyond the conventional views on basic sequence composition, such as codon usage and GC property bias. We show that there is a significant enrichment of the transcription start positions (TSPs) in the GI regions compared to the whole genome of Salmonella enterica and Escherichia coli. There was up to a four-fold increase for 70% of the GIs, implying that a high-density TSP profile can potentially differentiate the GI regions. Based on this feature, we developed a new sliding window method GIST, Genomic-island Identification by Signals of Transcription, to identify these regions. Subsequently, we compared the known GI-associated features of the GIs detected by GIST and by the existing method IslandViewer to those of the whole genome. Our method demonstrates high sensitivity in detecting GIs harboring genes with biased GI-like function, preferred subcellular localization, skewed GC property, shorter gene length and biased “non-optimal” codon usage. The special transcriptional signals discovered here may contribute to the coordinate expression regulation of foreign genes. Finally, by using GIST, we detected many interesting GIs in the 2011 German E. coli O104:H4 outbreak strain TY-2482, including the microcin H47 system and gene cluster ycgXEFZ-ymgABC that activates the production of biofilm matrix. The aforesaid findings highlight the power of GIST to predict GIs with distinct intrinsic features to the genome. The heterogeneity of cumulative TSP profiles may not only be a better identity for “alien” regions, but also provide hints to the special
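
    The abstract describes GIST only as a sliding window method driven by TSP density, so the following is a generic sketch of sliding-window enrichment, not the published GIST algorithm. The window size, step, function names and toy coordinates are hypothetical; the four-fold threshold simply echoes the fold increase quoted in the abstract.

```python
from bisect import bisect_left


def window_enrichment(tsp_positions, genome_length, window, step):
    """Return (start, end, fold) per window; fold is TSP density over the genome-wide density."""
    positions = sorted(tsp_positions)
    genome_rate = len(positions) / genome_length
    results = []
    for start in range(0, genome_length, step):
        end = min(start + window, genome_length)
        # Count TSPs in the half-open interval [start, end).
        count = bisect_left(positions, end) - bisect_left(positions, start)
        fold = (count / (end - start)) / genome_rate
        results.append((start, end, fold))
    return results


def candidate_islands(tsp_positions, genome_length, window=10_000, step=5_000, min_fold=4.0):
    """Merge overlapping or adjacent windows whose enrichment reaches `min_fold` (assumed threshold)."""
    islands = []
    for start, end, fold in window_enrichment(tsp_positions, genome_length, window, step):
        if fold < min_fold:
            continue
        if islands and start <= islands[-1][1]:
            islands[-1] = (islands[-1][0], max(islands[-1][1], end))
        else:
            islands.append((start, end))
    return islands


# Toy genome (invented): sparse background TSPs plus one dense 10 kb stretch.
background = list(range(2_500, 100_000, 5_000))
dense_region = list(range(40_000, 50_000, 250))
islands = candidate_islands(background + dense_region, genome_length=100_000)
```

    In this toy genome only the window covering the dense stretch reaches the threshold; the half-overlapping neighbours fall just below it. Real TSP maps would need a window size and threshold tuned to the data.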

  13. Horizontal gene transfer from diverse bacteria to an insect genome enables a tripartite nested mealybug symbiosis.

    PubMed

    Husnik, Filip; Nikoh, Naruo; Koga, Ryuichi; Ross, Laura; Duncan, Rebecca P; Fujie, Manabu; Tanaka, Makiko; Satoh, Nori; Bachtrog, Doris; Wilson, Alex C C; von Dohlen, Carol D; Fukatsu, Takema; McCutcheon, John P

    2013-06-20

    The smallest reported bacterial genome belongs to Tremblaya princeps, a symbiont of Planococcus citri mealybugs (PCIT). Tremblaya PCIT not only has a 139 kb genome, but possesses its own bacterial endosymbiont, Moranella endobia. Genome and transcriptome sequencing, including genome sequencing from a Tremblaya lineage lacking intracellular bacteria, reveals that the extreme genomic degeneracy of Tremblaya PCIT likely resulted from acquiring Moranella as an endosymbiont. In addition, at least 22 expressed horizontally transferred genes from multiple diverse bacteria to the mealybug genome likely complement missing symbiont genes. However, none of these horizontally transferred genes are from Tremblaya, showing that genome reduction in this symbiont has not been enabled by gene transfer to the host nucleus. Our results thus indicate that the functioning of this three-way symbiosis is dependent on genes from at least six lineages of organisms and reveal a path to intimate endosymbiosis distinct from that followed by organelles.

  14. Comparative genomic characterization of citrus-associated Xylella fastidiosa strains

    PubMed Central

    da Silva, Vivian S; Shida, Cláudio S; Rodrigues, Fabiana B; Ribeiro, Diógenes CD; de Souza, Alessandra A; Coletta-Filho, Helvécio D; Machado, Marcos A; Nunes, Luiz R; de Oliveira, Regina Costa

    2007-01-01

    Background The xylem-inhabiting bacterium Xylella fastidiosa (Xf) is the causal agent of Pierce's disease (PD) in vineyards and citrus variegated chlorosis (CVC) in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, have been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding about the genomic composition of South American strains – which is particularly important for CVC-associated strains. Results This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH), identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria. Conclusion Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly identified ORFs, obtained by

  15. Comparative genomic analysis of prion genes

    PubMed Central

    Premzl, Marko; Gamulin, Vera

    2007-01-01

    Background The homologues of human disease genes are expected to contribute to better understanding of physiological and pathogenic processes. We made use of the present availability of vertebrate genomic sequences, and we have conducted the most comprehensive comparative genomic analysis of the prion protein gene PRNP and its homologues, shadow of prion protein gene SPRN and doppel gene PRND, and prion testis-specific gene PRNT so far. Results While the SPRN and PRNP homologues are present in all vertebrates, PRND is known in tetrapods, and PRNT is present in primates. PRNT could be viewed as a TE-associated gene. Using human as the base sequence for genomic sequence comparisons (VISTA), we annotated numerous potential cis-elements. The conserved regions in SPRNs harbour the potential Sp1 sites in promoters (mammals, birds), C-rich intron splicing enhancers and PTB intron splicing silencers in introns (mammals, birds), and hsa-miR-34a sites in 3'-UTRs (eutherians). We showed the conserved PRNP upstream regions, which may be potential enhancers or silencers (primates, dog). In the PRNP 3'-UTRs, there are conserved cytoplasmic polyadenylation element sites (mammals, birds). The PRND core promoters include highly conserved CCAAT, CArG and TATA boxes (mammals). We deduced 42 new protein primary structures, and performed the first phylogenetic analysis of all vertebrate prion genes. Using the protein alignment which included 122 sequences, we constructed the neighbour-joining tree which showed four major clusters, including shadoos, shadoo2s and prion protein-likes (cluster 1), fish prion proteins (cluster 2), tetrapode prion proteins (cluster 3) and doppels (cluster 4). We showed that the entire prion protein conformationally plastic region is well conserved between eutherian prion proteins and shadoos (18–25% identity and 28–34% similarity), and there could be a potential structural compatibility between shadoos and the left-handed parallel beta-helical fold

  16. ERCC1: a comparative genomic perspective.

    PubMed

    Wilson, M D; Ruttan, C C; Koop, B F; Glickman, B W

    2001-01-01

    ERCC1 plays an essential role in the nucleotide excision repair (NER) of DNA. We compare 37 kb of sequence from the ERCC1 region on human chromosome 19q13.3 to the orthologous region on mouse chromosome 7. In addition to showing the conserved gene structure between ERCC1, ASE-1, and their murine counterparts, this genomic comparison reveals a highly conserved 497 bp segment found 5 kb upstream of ERCC1 exon 1 that contains a CpG island and previously unidentified "classical" promoter elements. Additional putative regulatory elements are also found within a conserved LINE-1 (long interspersed nuclear element) sequence 800 bp upstream of exon 1 in both human and mouse. Expressed sequence tag (EST) assemblies for human ERCC1 identified numerous splice variants involving exons 1, 2, 3, 7, 8, and 9 that could affect DNA repair efficiencies of ERCC1. A previously undescribed transcript that reads through exon 9 and utilizes the polyadenylation signal of a neighboring Alu element accounts for nearly half of the total splice variants identified in the human EST database. This transcript would theoretically translate to a larger ERCC1 protein product containing a novel C-terminal end. Overall, approximately 18% of publicly available ERCC1 cDNA sequences were determined to be splice variants, while no variants were found in the mouse. The ability to assess novel transcripts and identify candidate regulatory regions demonstrates the potential utility for a catalogue archiving comparative analyses for all genes involved in DNA repair. Our comparative genomic analysis of ERCC1 can be viewed at http://web.uvic.ca/-bioweb/laj.html. Copyright 2001 Wiley-Liss, Inc.

  17. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  18. Strikingly Bacteria-Like and Gene-Rich Mitochondrial Genomes throughout Jakobid Protists

    PubMed Central

    Burger, Gertraud; Gray, Michael W.; Forget, Lise; Lang, B. Franz

    2013-01-01

    The most bacteria-like mitochondrial genome known is that of the jakobid flagellate Reclinomonas americana NZ. This genome also encodes the largest known gene set among mitochondrial DNAs (mtDNAs), including the RNA subunit of RNase P (transfer RNA processing), a reduced form of transfer–messenger RNA (translational control), and a four-subunit bacteria-like RNA polymerase, which in other eukaryotes is substituted by a nucleus-encoded, single-subunit, phage-like enzyme. Further, protein-coding genes are preceded by potential Shine–Dalgarno translation initiation motifs. Whether similarly ancestral mitochondrial characters also exist in relatives of R. americana NZ is unknown. Here, we report a comparative analysis of nine mtDNAs from five distant jakobid genera: Andalucia, Histiona, Jakoba, Reclinomonas, and Seculamonas. We find that Andalucia godoyi has an even larger mtDNA gene complement than R. americana NZ. The extra genes are rpl35 (a large subunit mitoribosomal protein) and cox15 (involved in cytochrome oxidase assembly), which are nucleus encoded throughout other eukaryotes. Andalucia cox15 is strikingly similar to its homolog in the free-living α-proteobacterium Tistrella mobilis. Similarly, a long, highly conserved gene cluster in jakobid mtDNAs, which is a clear vestige of prokaryotic operons, displays a gene order more closely resembling that in free-living α-proteobacteria than in Rickettsiales species. Although jakobid mtDNAs, overall, are characterized by bacteria-like features, they also display a few remarkably divergent characters, such as 3′-tRNA editing in Seculamonas ecuadoriensis and genome linearization in Jakoba libera. Phylogenetic analysis with mtDNA-encoded proteins strongly supports monophyly of jakobids with Andalucia as the deepest divergence. However, it remains unclear which α-proteobacterial group is the closest mitochondrial relative. PMID:23335123

  19. Comparative genomic analysis of sixty mycobacteriophage genomes: Genome clustering, gene acquisition and gene size

    PubMed Central

    Hatfull, Graham F.; Jacobs-Sera, Deborah; Lawrence, Jeffrey G.; Pope, Welkin H.; Russell, Daniel A.; Ko, Ching-Chung; Weber, Rebecca J.; Patel, Manisha C.; Germane, Katherine L.; Edgar, Robert H.; Hoyte, Natasha N.; Bowman, Charles A.; Tantoco, Anthony T.; Paladin, Elizabeth C.; Myers, Marlana S.; Smith, Alexis L.; Grace, Molly S.; Pham, Thuy T.; O'Brien, Matthew B.; Vogelsberger, Amy M.; Hryckowian, Andrew J.; Wynalek, Jessica L.; Donis-Keller, Helen; Bogel, Matt W.; Peebles, Craig L.; Cresawn, Steve G.; Hendrix, Roger W.

    2010-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of sixty – all infecting a common bacterial host – provides further insight into their diversity and evolution. Of the sixty phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, five of which can be further divided into subclusters; five genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the six genomes in cluster D share more than 97.5% average nucleotide similarity with each other. In contrast, similarity between the two genomes in Cluster I is barely detectable by diagonal plot analysis. The total of 6,858 predicted ORFs have been grouped into 1,523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit smaller average size than genes of their host (205 residues compared to 315), phage genes in higher flux average only ∼100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains. PMID:20064525

  20. Genome Sequence of Desulfurella amilsii Strain TR1 and Comparative Genomics of Desulfurellaceae Family

    PubMed Central

    Florentino, Anna P.; Stams, Alfons J. M.; Sánchez-Andrea, Irene

    2017-01-01

    genomes. Therefore, the regulation of those genes, or a mechanism not yet known, might be responsible for the unique ability of D. amilsii. This is the first report on comparative genomics of sulfur-reducing bacteria, which is valuable to give insight into this poorly understood metabolism, but of great potential for biotechnological purposes and of environmental significance. PMID:28265263

  1. Whole-Genome Sequence and Classification of 11 Endophytic Bacteria from Poison Ivy (Toxicodendron radicans)

    PubMed Central

    Tran, Phuong N.; Tan, Nicholas E. H.; Lee, Yin Peng; Gan, Han Ming; Polter, Steven J.; Dailey, Lucas K.; Hudson, André O.

    2015-01-01

    Here, we report the whole-genome sequences and annotation of 11 endophytic bacteria from poison ivy (Toxicodendron radicans) vine tissue. Five bacteria belong to the genus Pseudomonas, and six single members from other genera were found present in interior vine tissue of poison ivy. PMID:26586879

  2. Whole-Genome Sequence and Classification of 11 Endophytic Bacteria from Poison Ivy (Toxicodendron radicans).

    PubMed

    Tran, Phuong N; Tan, Nicholas E H; Lee, Yin Peng; Gan, Han Ming; Polter, Steven J; Dailey, Lucas K; Hudson, André O; Savka, Michael A

    2015-11-19

    Here, we report the whole-genome sequences and annotation of 11 endophytic bacteria from poison ivy (Toxicodendron radicans) vine tissue. Five bacteria belong to the genus Pseudomonas, and six single members from other genera were found present in interior vine tissue of poison ivy. Copyright © 2015 Tran et al.

  3. Genome Sequences of Three Spore-Forming Bacteria Isolated from the Feces of Organically Raised Chickens

    PubMed Central

    Kennedy, Victoria; Van Laar, Tricia A.; Aleru, Omoshola; Thomas, Michael; Ganci, Michelle

    2016-01-01

    Antibiotic feed supplements have been implicated in the rise of multidrug-resistant bacteria. An alternative to antibiotics is probiotics. Here, we report the genome sequences of two Bacillus and one Solibacillus species, all spore-forming, Gram-positive bacteria, isolated from the feces of organically raised chickens, with potential to serve as probiotics. PMID:27587809

  4. A universal genomic coordinate translator for comparative genomics.

    PubMed

    Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G

    2014-06-30

    Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. Unambiguously identifying the orthology relationship between copies across multiple genomes can be resolved by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, which increases at N² with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across
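
    The idea of translating a coordinate indirectly, by walking a graph of pairwise alignments instead of aligning every genome to every other, can be sketched roughly as follows. This is a minimal illustration only and not Kraken's implementation: the block format, the genome names, and the functions `build_graph` and `translate` are all hypothetical, and the blocks are assumed to be ungapped and on the same strand.

```python
from collections import deque

# Hypothetical pairwise alignment blocks (invented for illustration):
# (src_genome, src_start, src_end, dst_genome, dst_start), half-open intervals.
BLOCKS = [
    ("A", 100, 200, "B", 1100),
    ("B", 1000, 1300, "C", 50),
]

def build_graph(blocks):
    """Index blocks by source genome, also adding each block in reverse."""
    graph = {}
    for src, s_start, s_end, dst, d_start in blocks:
        graph.setdefault(src, []).append((s_start, s_end, dst, d_start))
        d_end = d_start + (s_end - s_start)
        graph.setdefault(dst, []).append((d_start, d_end, src, s_start))
    return graph

def translate(graph, genome, pos, target):
    """Breadth-first search for a chain of blocks carrying (genome, pos) to target.

    Returns the translated position, or None if no chain of blocks covers it.
    """
    queue = deque([(genome, pos)])
    seen = {genome}
    while queue:
        g, p = queue.popleft()
        if g == target:
            return p
        for start, end, nxt, nxt_start in graph.get(g, []):
            if start <= p < end and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, nxt_start + (p - start)))
    return None

graph = build_graph(BLOCKS)
print(translate(graph, "A", 150, "C"))
```

    Only N-1 pairwise maps are needed to connect N genomes in such a graph, which is the intuition behind the linear scaling described in the abstract; real alignments would also need gap and strand handling.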

  5. An evaluation of Comparative Genome Sequencing (CGS) by comparing two previously-sequenced bacterial genomes

    PubMed Central

    Herring, Christopher D; Palsson, Bernhard Ø

    2007-01-01

    Background: With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical though to know the accuracy of the technique used, and to establish confidence that all of the mutations were detected. Results: In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion: CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions. PMID:17697331
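
    The detection figures quoted in this abstract can be turned into rates with a few lines of arithmetic. The sketch below uses only the counts stated above; the 4.6 Mb genome size passed to `expected_false_positives` is an assumed, approximate figure for an E. coli K-12 genome and does not come from the abstract, and the function names are illustrative.

```python
# Counts taken from the abstract: (detected, total known) per mutation class.
known = {
    "small sequence differences": (7, 8),
    "IS element insertions": (9, 12),
}

def sensitivity(detected, total):
    """Fraction of known differences that were detected."""
    return detected / total

def expected_false_positives(genome_bp, bp_per_false_positive=244_000):
    """Expected number of false-positive SNP calls at one per 244 kb."""
    return genome_bp / bp_per_false_positive

for name, (detected, total) in known.items():
    print(f"{name}: {sensitivity(detected, total):.1%}")

# Assumed genome size of about 4.6 Mb (approximate, not from the abstract).
print(round(expected_false_positives(4_600_000), 1))
```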

  6. Draft Genome Sequences of Four Alkaliphilic Bacteria Belonging to the Anaerobacillus Genus

    PubMed Central

    2017-01-01

    The draft genomes of the alkaliphilic, anaerobic bacteria, Anaerobacillus arseniciselenatis, A. alkalidiazotrophicus, and A. alkalilacustris, and a novel closely related isolate of the Anaerobacillus genus are reported here. These assembled genomes will help identify, at the molecular level, the phenotypic differences between the species of this poorly characterized genus. PMID:28104661

  7. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.

  8. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  9. The bonobo genome compared with the chimpanzee and human genomes

    PubMed Central

    Prüfer, Kay; Munch, Kasper; Hellmann, Ines; Akagi, Keiko; Miller, Jason R.; Walenz, Brian; Koren, Sergey; Sutton, Granger; Kodira, Chinnappa; Winer, Roger; Knight, James R.; Mullikin, James C.; Meader, Stephen J.; Ponting, Chris P.; Lunter, Gerton; Higashino, Saneyuki; Hobolth, Asger; Dutheil, Julien; Karakoç, Emre; Alkan, Can; Sajjadian, Saba; Catacchio, Claudia Rita; Ventura, Mario; Marques-Bonet, Tomas; Eichler, Evan E.; André, Claudine; Atencia, Rebeca; Mugisha, Lawrence; Junhold, Jörg; Patterson, Nick; Siebauer, Michael; Good, Jeffrey M.; Fischer, Anne; Ptak, Susan E.; Lachmann, Michael; Symer, David E.; Mailund, Thomas; Schierup, Mikkel H.; Andrés, Aida M.; Kelso, Janet; Pääbo, Svante

    2012-01-01

    Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other. PMID:22722832

  10. The bonobo genome compared with the chimpanzee and human genomes.

    PubMed

    Prüfer, Kay; Munch, Kasper; Hellmann, Ines; Akagi, Keiko; Miller, Jason R; Walenz, Brian; Koren, Sergey; Sutton, Granger; Kodira, Chinnappa; Winer, Roger; Knight, James R; Mullikin, James C; Meader, Stephen J; Ponting, Chris P; Lunter, Gerton; Higashino, Saneyuki; Hobolth, Asger; Dutheil, Julien; Karakoç, Emre; Alkan, Can; Sajjadian, Saba; Catacchio, Claudia Rita; Ventura, Mario; Marques-Bonet, Tomas; Eichler, Evan E; André, Claudine; Atencia, Rebeca; Mugisha, Lawrence; Junhold, Jörg; Patterson, Nick; Siebauer, Michael; Good, Jeffrey M; Fischer, Anne; Ptak, Susan E; Lachmann, Michael; Symer, David E; Mailund, Thomas; Schierup, Mikkel H; Andrés, Aida M; Kelso, Janet; Pääbo, Svante

    2012-06-28

    Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.

  11. Comparative Genomics of Rickettsia prowazekii Madrid E and Breinl Strains

    DTIC Science & Technology

    2004-01-01

    gram-negative bacteria that belong to the alpha subdivision of Proteobacteria (48). Rickettsial diseases are widely distributed throughout the world...anhydride to minimize background staining. Hybridization of genomic DNA. Genomic DNAs from the Madrid E and Breinl strains were used as templates for direct...export system; ATP-binding protein RP007 lpxA UDP-GlcNAc acyltransferase RP771 pal Peptidoglycan-associated lipoprotein RP451 sca3 Cell surface

  12. SearchDOGS Bacteria, Software That Provides Automated Identification of Potentially Missed Genes in Annotated Bacterial Genomes

    PubMed Central

    ÓhÉigeartaigh, Seán S.; Armisén, David; Byrne, Kevin P.

    2014-01-01

    We report the development of SearchDOGS Bacteria, software to automatically detect missing genes in annotated bacterial genomes by combining BLAST searches with comparative genomics. Having successfully applied the approach to yeast genomes, we redeveloped SearchDOGS to function as a standalone, downloadable package, requiring only a set of GenBank annotation files as input. The software automatically generates a homology structure using reciprocal BLAST and a synteny-based method; this is followed by a scan of the entire genome of each species for unannotated genes. Results are provided in a HTML interface, providing coordinates, BLAST results, syntenic location, omega values (Ka/Ks, where Ks is the number of synonymous substitutions per synonymous site and Ka is the number of nonsynonymous substitutions per nonsynonymous site) for protein conservation estimates, and other information for each candidate gene. Using SearchDOGS Bacteria, we identified 155 gene candidates in the Shigella boydii sb227 genome, including 56 candidates of length < 60 codons. SearchDOGS Bacteria has two major advantages over currently available annotation software. First, it outperforms current methods in terms of sensitivity and is highly effective at identifying small or highly diverged genes. Second, as a freely downloadable package, it can be used with unpublished or confidential data. PMID:24659774
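
    The reciprocal BLAST step mentioned in this abstract is commonly implemented as a reciprocal-best-hit test: two genes are paired only if each is the other's top-scoring hit. The following is a minimal sketch of that general technique, not SearchDOGS code; the score tables, gene names, and function names are hypothetical, and the synteny-based step is omitted.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` has the shape {query: {subject: bit_score}}.
    """
    return {q: max(hits, key=hits.get) for q, hits in scores.items() if hits}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a_gene, b_gene) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Hypothetical bit scores, invented for illustration only.
a_vs_b = {"a1": {"b1": 250.0, "b2": 90.0}, "a2": {"b2": 180.0}, "a3": {"b1": 120.0}}
b_vs_a = {"b1": {"a1": 245.0, "a3": 118.0}, "b2": {"a2": 175.0, "a1": 88.0}}

print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

    In this toy data, a3 hits b1 but b1's best hit is a1, so a3 is left unpaired; genes left unpaired in this way are the sort of case where positional (syntenic) evidence can help.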

  13. Discovery of novel plant interaction determinants from the genomes of 163 root nodule bacteria

    SciTech Connect

    Seshadri, Rekha; Reeve, Wayne G.; Ardley, Julie K.; Tennessen, Kristin; Woyke, Tanja; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2015-11-20

    Root nodule bacteria (RNB) or “rhizobia” are a type of plant growth promoting bacteria, typified by their ability to fix nitrogen for their plant host, fixing nearly 65% of the nitrogen currently utilized in sustainable agricultural production of legume crops and pastures. In this study, we sequenced the genomes of 110 RNB from diverse hosts and biogeographical regions, and undertook a global exploration of all available RNB genera with the aim of identifying novel genetic determinants of symbiotic association and plant growth promotion. Specifically, we performed a subtractive comparative analysis with non-RNB genomes, employed relevant transcriptomic data, and leveraged phylogenetic distribution patterns and sequence signatures based on known precepts of symbiotic- and host-microbe interactions. A total of 184 protein families were delineated, including known factors for nodulation and nitrogen fixation, and candidates with previously unexplored functions, for which a role in host-interaction, -regulation, biocontrol, and more, could be posited. Lastly, these analyses expand our knowledge of the RNB purview and provide novel targets for strain improvement in the ultimate quest to enhance plant productivity and agricultural sustainability.

  14. Discovery of Novel Plant Interaction Determinants from the Genomes of 163 Root Nodule Bacteria

    PubMed Central

    Seshadri, Rekha; Reeve, Wayne G.; Ardley, Julie K.; Tennessen, Kristin; Woyke, Tanja; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2015-01-01

    Root nodule bacteria (RNB) or “rhizobia” are a type of plant growth promoting bacteria, typified by their ability to fix nitrogen for their plant host, fixing nearly 65% of the nitrogen currently utilized in sustainable agricultural production of legume crops and pastures. In this study, we sequenced the genomes of 110 RNB from diverse hosts and biogeographical regions, and undertook a global exploration of all available RNB genera with the aim of identifying novel genetic determinants of symbiotic association and plant growth promotion. Specifically, we performed a subtractive comparative analysis with non-RNB genomes, employed relevant transcriptomic data, and leveraged phylogenetic distribution patterns and sequence signatures based on known precepts of symbiotic- and host-microbe interactions. A total of 184 protein families were delineated, including known factors for nodulation and nitrogen fixation, and candidates with previously unexplored functions, for which a role in host-interaction, -regulation, biocontrol, and more, could be posited. These analyses expand our knowledge of the RNB purview and provide novel targets for strain improvement in the ultimate quest to enhance plant productivity and agricultural sustainability. PMID:26584898

  15. Discovery of novel plant interaction determinants from the genomes of 163 root nodule bacteria

    DOE PAGES

    Seshadri, Rekha; Reeve, Wayne G.; Ardley, Julie K.; ...

    2015-11-20

    Root nodule bacteria (RNB) or “rhizobia” are a type of plant growth promoting bacteria, typified by their ability to fix nitrogen for their plant host, fixing nearly 65% of the nitrogen currently utilized in sustainable agricultural production of legume crops and pastures. In this study, we sequenced the genomes of 110 RNB from diverse hosts and biogeographical regions, and undertook a global exploration of all available RNB genera with the aim of identifying novel genetic determinants of symbiotic association and plant growth promotion. Specifically, we performed a subtractive comparative analysis with non-RNB genomes, employed relevant transcriptomic data, and leveraged phylogenetic distribution patterns and sequence signatures based on known precepts of symbiotic- and host-microbe interactions. A total of 184 protein families were delineated, including known factors for nodulation and nitrogen fixation, and candidates with previously unexplored functions, for which a role in host-interaction, -regulation, biocontrol, and more, could be posited. Lastly, these analyses expand our knowledge of the RNB purview and provide novel targets for strain improvement in the ultimate quest to enhance plant productivity and agricultural sustainability.

  16. In vivo function and comparative genomic analyses of the Drosophila gut microbiota identify candidate symbiosis factors

    PubMed Central

    Newell, Peter D.; Chaston, John M.; Wang, Yiping; Winans, Nathan J.; Sannino, David R.; Wong, Adam C. N.; Dobson, Adam J.; Kagle, Jeanne; Douglas, Angela E.

    2014-01-01

    Symbiosis is often characterized by co-evolutionary changes in the genomes of the partners involved. An understanding of these changes can provide insight into the nature of the relationship, including the mechanisms that initiate and maintain an association between organisms. In this study we examined the genome sequences of bacteria isolated from the Drosophila melanogaster gut with the objective of identifying genes that are important for function in the host. We compared microbiota isolates with con-specific or closely related bacterial species isolated from non-fly environments. First the phenotype of germ-free Drosophila (axenic flies) was compared to that of flies colonized with specific bacteria (gnotobiotic flies) as a measure of symbiotic function. Non-fly isolates were functionally distinct from bacteria isolated from flies, conferring slower development and an altered nutrient profile in the host, traits known to be microbiota-dependent. Comparative genomic methods were next employed to identify putative symbiosis factors: genes found in bacteria that restore microbiota-dependent traits to gnotobiotic flies, but absent from those that do not. Factors identified include riboflavin synthesis and stress resistance. We also used a phylogenomic approach to identify protein coding genes for which fly-isolate sequences were more similar to each other than to other sequences, reasoning that these genes may have a shared function unique to the fly environment. This method identified genes in Acetobacter species that cluster in two distinct genomic loci: one predicted to be involved in oxidative stress detoxification and another encoding an efflux pump. In summary, we leveraged genomic and in vivo functional comparisons to identify candidate traits that distinguish symbiotic bacteria. These candidates can serve as the basis for further work investigating the genetic requirements of bacteria for function and persistence in the Drosophila gut. PMID:25408687

  17. Genome Sequence of Brevibacillus formosus F12T for a Genome-Sequencing Project for Genomic Taxonomy and Phylogenomics of Bacillus-Like Bacteria

    PubMed Central

    Wang, Jie-Ping; Liu, Guo-Hong; Chen, Qian-qian; Zhu, Yu-jing; Chen, Zheng; Che, Jian-mei

    2015-01-01

    Brevibacillus formosus F12T is a Gram-positive, spore-forming, and strictly aerobic bacterium. Here, we report the draft 6.215-Mb genome sequence of B. formosus F12T, which will provide useful information for genomic taxonomy and phylogenomics of Bacillus-like bacteria, as well as for the functional gene mining and application of B. formosus. PMID:26205874

  18. An Integrated Metabolomic and Genomic Mining Workflow To Uncover the Biosynthetic Potential of Bacteria.

    PubMed

    Maansson, Maria; Vynne, Nikolaj G; Klitgaard, Andreas; Nybo, Jane L; Melchiorsen, Jette; Nguyen, Don D; Sanchez, Laura M; Ziemert, Nadine; Dorrestein, Pieter C; Andersen, Mikael R; Gram, Lone

    2016-01-01

    Microorganisms are a rich source of bioactives; however, chemical identification is a major bottleneck. Strategies that can prioritize the most prolific microbial strains and novel compounds are of great interest. Here, we present an integrated approach to evaluate the biosynthetic richness in bacteria and mine the associated chemical diversity. Thirteen strains closely related to Pseudoalteromonas luteoviolacea isolated from all over the Earth were analyzed using an untargeted metabolomics strategy, and metabolomic profiles were correlated with whole-genome sequences of the strains. We found considerable diversity: only 2% of the chemical features and 7% of the biosynthetic genes were common to all strains, while 30% of all features and 24% of the genes were unique to single strains. The list of chemical features was reduced to 50 discriminating features using a genetic algorithm and support vector machines. Features were dereplicated by tandem mass spectrometry (MS/MS) networking to identify molecular families of the same biosynthetic origin, and the associated pathways were probed using comparative genomics. Most of the discriminating features were related to antibacterial compounds, including the thiomarinols that were reported from P. luteoviolacea here for the first time. By comparative genomics, we identified the biosynthetic cluster responsible for the production of the antibiotic indolmycin, which could not be predicted with standard methods. In conclusion, we present an efficient, integrative strategy for elucidating the chemical richness of a given set of bacteria and link the chemistry to biosynthetic genes. IMPORTANCE We here combine chemical analysis and genomics to probe for new bioactive secondary metabolites based on their pattern of distribution within bacterial species. We demonstrate the usefulness of this combined approach in a group of marine Gram-negative bacteria closely related to Pseudoalteromonas luteoviolacea, which is a species known
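
    The "common to all strains" and "unique to single strains" percentages reported in this abstract are the kind of summary one gets from a strain-by-feature presence table. A minimal sketch of that calculation follows; it is not the authors' pipeline, and the strain names, feature names, and the function `core_and_unique` are hypothetical.

```python
from collections import Counter

def core_and_unique(presence):
    """Split features into those found in every strain and those found in one.

    `presence` has the shape {strain: set of feature names}.
    Returns (core, unique) as two sets.
    """
    counts = Counter(f for feats in presence.values() for f in feats)
    n_strains = len(presence)
    core = {f for f, c in counts.items() if c == n_strains}
    unique = {f for f, c in counts.items() if c == 1}
    return core, unique

# Hypothetical presence data, invented for illustration only.
presence = {
    "S1": {"f1", "f2", "f3"},
    "S2": {"f1", "f3", "f4"},
    "S3": {"f1", "f5"},
}

core, unique = core_and_unique(presence)
total = len(set().union(*presence.values()))
print(f"core: {len(core) / total:.0%}, unique: {len(unique) / total:.0%}")
```

    The same function applies unchanged whether the features are chemical features from metabolomics or biosynthetic genes from the genome sequences.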

  19. An Integrated Metabolomic and Genomic Mining Workflow To Uncover the Biosynthetic Potential of Bacteria

    PubMed Central

    Maansson, Maria; Vynne, Nikolaj G.; Klitgaard, Andreas; Nybo, Jane L.; Melchiorsen, Jette; Nguyen, Don D.; Sanchez, Laura M.; Ziemert, Nadine; Dorrestein, Pieter C.

    2016-01-01

    Microorganisms are a rich source of bioactives; however, chemical identification is a major bottleneck. Strategies that can prioritize the most prolific microbial strains and novel compounds are of great interest. Here, we present an integrated approach to evaluate the biosynthetic richness in bacteria and mine the associated chemical diversity. Thirteen strains closely related to Pseudoalteromonas luteoviolacea isolated from all over the Earth were analyzed using an untargeted metabolomics strategy, and metabolomic profiles were correlated with whole-genome sequences of the strains. We found considerable diversity: only 2% of the chemical features and 7% of the biosynthetic genes were common to all strains, while 30% of all features and 24% of the genes were unique to single strains. The list of chemical features was reduced to 50 discriminating features using a genetic algorithm and support vector machines. Features were dereplicated by tandem mass spectrometry (MS/MS) networking to identify molecular families of the same biosynthetic origin, and the associated pathways were probed using comparative genomics. Most of the discriminating features were related to antibacterial compounds, including the thiomarinols that were reported from P. luteoviolacea here for the first time. By comparative genomics, we identified the biosynthetic cluster responsible for the production of the antibiotic indolmycin, which could not be predicted with standard methods. In conclusion, we present an efficient, integrative strategy for elucidating the chemical richness of a given set of bacteria and link the chemistry to biosynthetic genes. IMPORTANCE We here combine chemical analysis and genomics to probe for new bioactive secondary metabolites based on their pattern of distribution within bacterial species. We demonstrate the usefulness of this combined approach in a group of marine Gram-negative bacteria closely related to Pseudoalteromonas luteoviolacea, which is a

  20. Patterns and architecture of genomic islands in marine bacteria

    PubMed Central

    2012-01-01

    Background: Genomic Islands (GIs) have key roles since they modulate the structure and size of bacterial genomes displaying a diverse set of laterally transferred genes. Despite their importance, GIs in marine bacterial genomes have not been explored systematically to uncover possible trends and to analyze their putative ecological significance. Results: We carried out a comprehensive analysis of GIs in 70 selected marine bacterial genomes detected with IslandViewer to explore the distribution, patterns and functional gene content in these genomic regions. We detected 438 GIs containing a total of 8152 genes. GI number per genome was strongly and positively correlated with the total GI size. In 50% of the genomes analyzed the GIs accounted for approximately 3% of the genome length, with a maximum of 12%. Interestingly, we found transposases particularly enriched within Alphaproteobacteria GIs, and site-specific recombinases in Gammaproteobacteria GIs. We described specific Homologous Recombination GIs (HR-GIs) in several genera of marine Bacteroidetes and in Shewanella strains among others. In these HR-GIs, we recurrently found conserved genes such as the β-subunit of DNA-directed RNA polymerase, regulatory sigma factors, the elongation factor Tu and ribosomal protein genes typically associated with the core genome. Conclusions: Our results indicate that horizontal gene transfer mediated by phages, plasmids and other mobile genetic elements, and HR by site-specific recombinases play important roles in the mobility of clusters of genes between taxa and within closely related genomes, modulating the flexible pool of the genome. Our findings suggest that GIs may increase bacterial fitness under changing environmental conditions by acquiring novel foreign genes and/or modifying gene transcription and/or transduction. PMID:22839777

  1. Update on Comparative Genomics of Legumes

    USDA-ARS's Scientific Manuscript database

    This year marks the essential completion of the genome sequences of Glycine max, Medicago truncatula, and Lotus japonicus (soybean, barrel medic, and birdsfoot trefoil, respectively). The impact of these assembled, annotated genomes will be enormous. L. japonicus and M. truncatula, both forage crop...

  2. Comparative Genomics of the Ubiquitous, Hydrocarbon-degrading Genus Marinobacter

    NASA Astrophysics Data System (ADS)

    Singer, E.; Webb, E.; Edwards, K. J.

    2012-12-01

    The genus Marinobacter is amongst the most ubiquitous in the global oceans and strains have been isolated from a wide variety of marine environments, including offshore oil-well heads, coastal thermal springs, Antarctic sea water, saline soils and associations with diatoms and dinoflagellates. Many strains have been recognized to be important hydrocarbon degraders in various marine habitats presenting sometimes extreme pH or salinity conditions. Analysis of the genome of M. aquaeolei revealed enormous adaptation versatility with an assortment of strategies for carbon and energy acquisition, sensation, and defense. In an effort to elucidate the ecological and biogeochemical significance of the Marinobacters, seven Marinobacter strains from diverse environments were included in a comparative genomics study. Genomes were screened for metabolic and adaptation potential to elucidate the strategies responsible for the omnipresence of the Marinobacter genus and their remedial action potential in hydrocarbon-polluted waters. The core genome predominantly encodes key genes involved in hydrocarbon degradation, biofilm-relevant processes, including utilization of external DNA, halotolerance, as well as defense mechanisms against heavy metals, antibiotics, and toxins. All Marinobacter strains were observed to degrade a wide spectrum of hydrocarbon species, including aliphatic, polycyclic aromatic as well as acyclic isoprenoid compounds. Various genes predicted to facilitate hydrocarbon degradation, e.g. alkane 1-monooxygenase, appear to have originated from lateral gene transfer as they are located on gene clusters of 10-20% lower GC-content compared to genome averages and are flanked by transposases. Top ortholog hits are found in other hydrocarbon degrading organisms, e.g. Alcanivorax borkumensis. Strategies for hydrocarbon uptake encoded by various Marinobacter strains include cell surface hydrophobicity adaptation via capsular polysaccharide biosynthesis and attachment

  3. Comparative genomics of the marine bacterial genus Glaciecola reveals the high degree of genomic diversity and genomic characteristic for cold adaptation.

    PubMed

    Qin, Qi-Long; Xie, Bin-Bin; Yu, Yong; Shu, Yan-Li; Rong, Jin-Cheng; Zhang, Yan-Jiao; Zhao, Dian-Li; Chen, Xiu-Lan; Zhang, Xi-Ying; Chen, Bo; Zhou, Bai-Cheng; Zhang, Yu-Zhong

    2014-06-01

    How diverse the genomes of different species within one genus can be, and how genomic differentiation relates to environmental factors, remain unclear for oceanic bacteria. With many new bacterial genera and species being isolated from marine environments, this question warrants attention. In this study, we sequenced all the type strains of the published species of Glaciecola, a recently defined cold-adapted genus with species from diverse marine locations, to study the genomic diversity and cold-adaptation strategy in this genus. The genome size diverged widely from 3.08 to 5.96 Mb, which can be explained by massive gene gain and loss events. Horizontal gene transfer and new gene emergence contributed substantially to the genome size expansion. The genus Glaciecola had an open pan-genome. Comparative genomic research indicated that species of the genus Glaciecola had high diversity in genome size, gene content and genetic relatedness. This may be prevalent in marine bacterial genera considering the dynamic and complex environments of the ocean. Species of Glaciecola had some common genomic features related to cold adaptation, which enable them to thrive and play a role in biogeochemical cycle in the cold marine environments.
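
    The abstract above refers to a pan-genome and to diversity in gene content. As a minimal sketch of those terms, and not the authors' pipeline, the Python function below partitions gene families into pan-genome, core, accessory, and strain-unique sets, assuming each genome has already been reduced to a set of gene-family identifiers. The function name, strain labels, and family names are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def summarize_pan_genome(gene_sets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene families into pan, core, accessory, and unique sets.

    gene_sets maps a genome name to the set of gene-family identifiers
    found in that genome (ortholog clustering is assumed to be done already).
    """
    if not gene_sets:
        raise ValueError("at least one genome is required")

    genomes = list(gene_sets.values())
    pan = set().union(*genomes)            # families seen in any genome
    core = set.intersection(*genomes)      # families present in every genome

    # Count in how many genomes each family occurs.
    occurrences = Counter(family for genome in genomes for family in genome)
    unique = {family for family, n in occurrences.items() if n == 1}

    return {
        "pan": pan,
        "core": core,
        "accessory": pan - core,           # includes the strain-unique families
        "unique": unique,
    }


# Toy example with three hypothetical strains.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
summary = summarize_pan_genome(toy)
```

    A pan-genome is usually called open when the "pan" set keeps growing as further genomes are added; tracking its size over successive genomes is one simple way to look at that.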

  4. Comparative genomic hybridization in clinical cytogenetics

    SciTech Connect

    Bryndorf, T.; Kirchhoff, M.; Rose, H.

    1995-11-01

    We report the results of applying comparative genomic hybridization (CGH) in a cytogenetic service laboratory for (1) determination of the origin of extra and missing chromosomal material in intricate cases of unbalanced aberrations and (2) detection of common prenatal numerical chromosome aberrations. A total of 11 fetal samples were analyzed. Seven cases of complex unbalanced aberrations that could not be identified reliably by conventional cytogenetics were successfully resolved by CGH analysis. CGH results were validated by using FISH with chromosome-specific probes. Four cases representing common prenatal numerical aberrations (trisomy 21, 18, and 13 and monosomy X) were also successfully diagnosed by CGH. We conclude that CGH is a powerful adjunct to traditional cytogenetic techniques that makes it possible to solve clinical cases of intricate unbalanced aberrations in a single hybridization. CGH may also be a useful adjunct to screen for euchromatic involvement in marker chromosomes. Further technical development may render CGH applicable for routine aberration screening. 16 refs., 4 figs., 2 tabs.

  5. Identification of thermoacidophilic bacteria and a new Alicyclobacillus genomic species isolated from acidic environments in Japan.

    PubMed

    Goto, Keiichi; Tanimoto, Yasuhide; Tamura, Takashi; Mochida, Kaoru; Arai, Daisuke; Asahara, Mika; Suzuki, Masayuki; Tanaka, Hidehiko; Inagaki, Kenji

    2002-08-01

    Sixty strains of thermoacidophilic bacteria have been isolated from soil and water samples obtained from various acidic environments in Japan. An initial comparative sequence analysis of the hypervariable regions of the 16S rDNA revealed that all strains could be assigned to the Alicyclobacillus acidocaldarius-Alicyclobacillus genomic species 1 group, which could be further subdivided into three clusters (Clusters I-III). On the basis of phenotypic characteristics, chemotaxonomic profiles, and phylogenetic data of six selected strains, five strains were identified as either A. acidocaldarius or Alicyclobacillus genomic species 1; however, one strain (MIH 332) could not be determined to belong to either of these species. 16S rDNA sequence homology values between strain MIH 332 and the reference strains of A. acidocaldarius (ATCC 27009(T)) and Alicyclobacillus genomic species 1 (DSM 11984) were 98.8% and 99.1%, respectively, which were higher than the corresponding similarity between the reference strains (98.4%). On the other hand, DNA-DNA hybridization levels between strain MIH 332 and the reference strains were 39% and 44%, respectively, which were lower than the value between the reference strains (59% or 65%). However, the phenotype of strain MIH 332 was also similar to those of the reference strains, and a typical phenotype could not be found for the strain, thus indicating that the strain may be a new genomic species of A. acidocaldarius, for which the name Alicyclobacillus genomic species 2 is tentatively proposed. The results of this study suggest that A. acidocaldarius and its related species are widely distributed in acidic environments in Japan, with slight regional variations in morphological and genotypic characteristics.

  6. Genome Sequence and Comparative Analysis of the Solvent-Producing Bacterium Clostridium acetobutylicum

    PubMed Central

    Nölling, Jörk; Breton, Gary; Omelchenko, Marina V.; Makarova, Kira S.; Zeng, Qiandong; Gibson, Rene; Lee, Hong Mei; Dubois, JoAnn; Qiu, Dayong; Hitti, Joseph; Wolf, Yuri I.; Tatusov, Roman L.; Sabathe, Fabrice; Doucette-Stamm, Lynn; Soucaille, Philippe; Daly, Michael J.; Bennett, George N.; Koonin, Eugene V.; Smith, Douglas R.

    2001-01-01

    The genome sequence of the solvent-producing bacterium Clostridium acetobutylicum ATCC 824 has been determined by the shotgun approach. The genome consists of a 3.94-Mb chromosome and a 192-kb megaplasmid that contains the majority of genes responsible for solvent production. Comparison of C. acetobutylicum to Bacillus subtilis reveals significant local conservation of gene order, which has not been seen in comparisons of other genomes with similar, or, in some cases closer, phylogenetic proximity. This conservation allows the prediction of many previously undetected operons in both bacteria. However, the C. acetobutylicum genome also contains a significant number of predicted operons that are shared with distantly related bacteria and archaea but not with B. subtilis. Phylogenetic analysis is compatible with the dissemination of such operons by horizontal transfer. The enzymes of the solventogenesis pathway and of the cellulosome of C. acetobutylicum comprise a new set of metabolic capacities not previously represented in the collection of complete genomes. These enzymes show a complex pattern of evolutionary affinities, emphasizing the role of lateral gene exchange in the evolution of the unique metabolic profile of the bacterium. Many of the sporulation genes identified in B. subtilis are missing in C. acetobutylicum, which suggests major differences in the sporulation process. Thus, comparative analysis reveals both significant conservation of the genome organization and pronounced differences in many systems that reflect unique adaptive strategies of the two gram-positive bacteria. PMID:11466286

  7. Comparative genomics of bacterial and plant folate synthesis and salvage: predictions and validations

    PubMed Central

    de Crécy-Lagard, Valérie; El Yacoubi, Basma; de la Garza, Rocío Díaz; Noiriel, Alexandre; Hanson, Andrew D

    2007-01-01

    Background Folate synthesis and salvage pathways are relatively well known from classical biochemistry and genetics but they have not been subjected to comparative genomic analysis. The availability of genome sequences from hundreds of diverse bacteria, and from Arabidopsis thaliana, enabled such an analysis using the SEED database and its tools. This study reports the results of the analysis and integrates them with new and existing experimental data. Results Based on sequence similarity and the clustering, fusion, and phylogenetic distribution of genes, several functional predictions emerged from this analysis. For bacteria, these included the existence of novel GTP cyclohydrolase I and folylpolyglutamate synthase gene families, and of a trifunctional p-aminobenzoate synthesis gene. For plants and bacteria, the predictions comprised the identities of a 'missing' folate synthesis gene (folQ) and of a folate transporter, and the absence from plants of a folate salvage enzyme. Genetic and biochemical tests bore out these predictions. Conclusion For bacteria, these results demonstrate that much can be learnt from comparative genomics, even for well-explored primary metabolic pathways. For plants, the findings particularly illustrate the potential for rapid functional assignment of unknown genes that have prokaryotic homologs, by analyzing which genes are associated with the latter. More generally, our data indicate how combined genomic analysis of both plants and prokaryotes can be more powerful than isolated examination of either group alone. PMID:17645794

  8. Comparative Genomics of the Campylobacter lari Group

    PubMed Central

    Miller, William G.; Yee, Emma; Chapman, Mary H.; Smith, Timothy P.L.; Bono, James L.; Huynh, Steven; Parker, Craig T.; Vandamme, Peter; Luong, Khai; Korlach, Jonas

    2014-01-01

    The Campylobacter lari group is a phylogenetic clade within the epsilon subdivision of the Proteobacteria and is part of the thermotolerant Campylobacter spp., a division within the genus that includes the human pathogen Campylobacter jejuni. The C. lari group is currently composed of five species (C. lari, Campylobacter insulaenigrae, Campylobacter volucris, Campylobacter subantarcticus, and Campylobacter peloridis), as well as a group of strains termed the urease-positive thermophilic Campylobacter (UPTC) and other C. lari-like strains. Here we present the complete genome sequences of 11 C. lari group strains, including the five C. lari group species, four UPTC strains, and a lari-like strain isolated in this study. The genome of C. lari subsp. lari strain RM2100 was described previously. Analysis of the C. lari group genomes indicates that this group is highly related at the genome level. Furthermore, these genomes are strongly syntenic with minor rearrangements occurring only in 4 of the 12 genomes studied. The C. lari group can be bifurcated, based on the flagella and flagellar modification genes. Genomic analysis of the UPTC strains indicated that these organisms are variable but highly similar, closely related to but distinct from C. lari. Additionally, the C. lari group contains multiple genes encoding hemagglutination domain proteins, which are either contingency genes or linked to conserved contingency genes. Many of the features identified in strain RM2100, such as major deficiencies in amino acid biosynthesis and energy metabolism, are conserved across all 12 genomes, suggesting that these common features may play a role in the association of the C. lari group with coastal environments and watersheds. PMID:25381664

  9. Comparative Genomics of Large Mitochondria in Placozoans

    PubMed Central

    Signorovitch, Ana Y; Buss, Leo W; Dellaporta, Stephen L

    2007-01-01

    The first sequenced mitochondrial genome of a placozoan, Trichoplax adhaerens, challenged the conventional wisdom that a compact mitochondrial genome is a common feature among all animals. Three additional placozoan mitochondrial genomes representing highly divergent clades have been sequenced to determine whether the large Trichoplax mtDNA is a shared feature among members of the phylum Placozoa or a uniquely derived condition. All three mitochondrial genomes were found to be very large, 32- to 37-kb, circular molecules, having the typical 12 respiratory chain genes, 24 tRNAs, rnS, and rnL. They share with the Trichoplax mitochondrial genome the absence of atp8, atp9, and all ribosomal protein genes, the presence of several cox1 introns, and a large open reading frame containing an intron group I LAGLIDADG endonuclease domain. The differences in mtDNA size within Placozoa are due to variation in intergenic spacer regions and the presence or absence of long open reading frames of unknown function. Phylogenetic analyses of the 12 respiratory chain genes support the monophyly of Placozoa. The similarities in composition and structure between the three mitochondrial genomes reported here and that of Trichoplax's mtDNA suggest that their uncompacted state is a shared ancestral feature to other nonmetazoans while their gene content is a derived feature shared only among the Metazoa. PMID:17222063

  10. RegPrecise 3.0--a resource for genome-scale exploration of transcriptional regulation in bacteria.

    PubMed

    Novichkov, Pavel S; Kazakov, Alexey E; Ravcheev, Dmitry A; Leyn, Semen A; Kovaleva, Galina Y; Sutormin, Roman A; Kazanov, Marat D; Riehl, William; Arkin, Adam P; Dubchak, Inna; Rodionov, Dmitry A

    2013-11-01

    Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons is available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes. RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in

  11. Comparative genomic hybridization: Detection of segmental aneusomies

    SciTech Connect

    Cronin, J.E.; Magrane, G.G.; Gray, J.W.

    1994-09-01

    Comparative genomic hybridization (CGH) has been used successfully to detect whole chromosome and segmental aneusomies. However, its sensitivity for detection of segmental aneusomies is still not well known. We present here an analysis of CGH sensitivity with emphasis on detection of abnormalities commonly found during pre- and neonatal diagnosis. CGH is performed by hybridizing green and red fluorescing test and normal DNA samples, respectively, to normal metaphase spreads and measuring green:red fluorescence ratios along all chromosomes. The ratios are normalized such that 2 copies of a normal chromosome region in the test sample give a ratio of 1.0. Alterations in test vs. control gene copy number range from 1.5 [trisomy] to 0.5 [monosomy]. Clinical samples analyzed included Wolf Hirschhorn (4p-), Cri du Chat (5p-) and DiGeorge (22q-). In addition, 7 cell lines with chromosome 21 segmental aneusomies were analyzed. These included 3 with terminal duplications, 1 with a terminal deletion, 1 with an interstitial deletion and 2 with interstitial amplifications. The DiGeorge deletion was the only deletion not detected by CGH. This is not surprising as standard G banding does not routinely detect this 1-2 megabase deletion. The 4p- and 5p- monosomies were detected and breakpoints correctly assigned prospectively. Proximal alterations involving 21q22.11 are unambiguously defined. Specifically, two interstitial aneusomies involving this region are detected. Studies involving late prophase chromosome normal spreads gave identical breakpoints. Thus, analysis of extended chromosomes did not improve the sensitivity of the technique. Taken together, these data suggest that CGH can detect segmental aneusomies greater than 8 megabases in extent. Smaller aneusomies can, at times, be detected. Work is now underway to modify the analysis software to increase sensitivity and to decrease the amount of material needed for analysis.
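
    The normalization described in the abstract above (two copies in the test sample give a green:red ratio of 1.0, trisomy 1.5, monosomy 0.5) amounts to dividing the test copy number by the two copies in the normal reference. The Python sketch below illustrates only that arithmetic for an idealized, noise-free measurement; the function names are hypothetical, and the rounding rule in nearest_copy_number is an assumption made for illustration, not the authors' analysis software.

```python
def expected_cgh_ratio(test_copies: int, reference_copies: int = 2) -> float:
    """Idealized normalized test:reference ratio for one chromosome region."""
    if reference_copies <= 0:
        raise ValueError("reference_copies must be positive")
    if test_copies < 0:
        raise ValueError("test_copies cannot be negative")
    return test_copies / reference_copies


def nearest_copy_number(ratio: float, reference_copies: int = 2) -> int:
    """Copy number whose idealized ratio is closest to the observed ratio.

    Assumption for illustration: simple rounding, with no noise model.
    """
    return round(ratio * reference_copies)


# Idealized ratios for monosomy, the normal state, and trisomy.
ratios = {copies: expected_cgh_ratio(copies) for copies in (1, 2, 3)}
```

    Real profiles are noisy and averaged over chromosome segments, which is why the abstract frames sensitivity in terms of segment size rather than ratio alone.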

  12. Comparative analysis and visualization of multiple collinear genomes

    PubMed Central

    2012-01-01

    Background Genome browsers are a common tool used by biologists to visualize genomic features including genes, polymorphisms, and many others. However, existing genome browsers and visualization tools are not well-suited to perform meaningful comparative analysis among a large number of genomes. With the increasing quantity and availability of genomic data, there is an increased burden to provide useful visualization and analysis tools for comparison of multiple collinear genomes such as the large panels of model organisms which are the basis for much of the current genetic research. Results We have developed a novel web-based tool for visualizing and analyzing multiple collinear genomes. Our tool illustrates genome-sequence similarity through a mosaic of intervals representing local phylogeny, subspecific origin, and haplotype identity. Comparative analysis is facilitated through reordering and clustering of tracks, which can vary throughout the genome. In addition, we provide local phylogenetic trees as an alternate visualization to assess local variations. Conclusions Unlike previous genome browsers and viewers, ours allows for simultaneous and comparative analysis. Our browser provides intuitive selection and interactive navigation about features of interest. Dynamic visualizations adjust to scale and data content making analysis at variable resolutions and of multiple data sets more informative. We demonstrate our genome browser for an extensive set of genomic data sets composed of almost 200 distinct mouse laboratory strains. PMID:22536897

  13. Comparative genomics and genome biology of invasive Campylobacter jejuni.

    PubMed

    Skarp, C P A; Akinrinade, O; Nilsson, A J E; Ellström, P; Myllykangas, S; Rautelin, H

    2015-11-25

    Campylobacter jejuni is a major pathogen in bacterial gastroenteritis worldwide and can cause bacteremia in severe cases. C. jejuni is highly structured into clonal lineages of which the ST677CC lineage has been overrepresented among C. jejuni isolates derived from blood. In this study, we characterized the genomes of 31 C. jejuni blood isolates and 24 faecal isolates belonging to ST677CC in order to study the genome biology related to C. jejuni invasiveness. We combined the genome analyses with phenotypical evidence on serum resistance which was associated with phase variation of wcbK, a GDP-mannose 4,6-dehydratase involved in capsular biosynthesis. We also describe the finding of a Type III restriction-modification system unique to the ST-794 sublineage. However, features previously considered to be related to pathogenesis of C. jejuni were either absent or disrupted among our strains. Our results refine the role of capsule features associated with invasive disease and accentuate the possibility of methylation and restriction enzymes in the potential of C. jejuni to establish invasive infections. Our findings underline the importance of studying clinically relevant well-characterized bacterial strains in order to understand pathogenesis mechanisms important in human infections.

  14. Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis

    PubMed Central

    Facey, Paul D.; Méric, Guillaume; Hitchings, Matthew D.; Pachebat, Justin A.; Hegarty, Matt J.; Chen, Xiaorui; Morgan, Laura V.A.; Hoeppner, James E.; Whitten, Miranda M.A.; Kirk, William D.J.; Dyson, Paul J.; Sheppard, Sam K.; Sol, Ricardo Del

    2015-01-01

    Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis, and to have genetic traits similar to those of Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. PMID:26185096

  15. Faustoviruses: Comparative Genomics of New Megavirales Family Members

    PubMed Central

    Benamar, Samia; Reteno, Dorine G. I.; Bandaly, Victor; Labas, Noémie; Raoult, Didier; La Scola, Bernard

    2016-01-01

    Growing interest in giant virus discovery, genome sequencing, and analysis has expanded the number of known Megavirales members. Using the protist Vermamoeba sp. as cell support, a new giant virus named Faustovirus has been isolated. In this study, we describe the genome sequences of nine Faustoviruses and build a genomic comparison in order to have a comprehensive overview of genomic composition and diversity among this new virus family. The average sequence length of these viruses is 467,592.44 bp (ranging from 455,803 to 491,024 bp), making them the fourth largest Megavirales genome after Mimiviruses, Pandoraviruses, and Pithovirus sibericum. Faustovirus genomes displayed an average G+C content of 37.14% (ranging from 36.22 to 39.59%), which is close to the G+C content of the Asfarviridae genomes (38%). The proportion of best matches and the phylogenetic analysis suggest a shared origin with Asfarviridae without belonging to the same family. The core-gene-based phylogeny of Faustoviruses identified four lineages. These results were confirmed by the analysis of amino acids and COGs category distribution. The diversity of the gene composition of these lineages is mainly explained by gene deletion or acquisition, with some exceptions attributable to gene duplications. The high proportion of best matches from Bacteria and Phycodnaviridae on the pan-genome and unique genes may be explained by an interaction occurring after the separation of the lineages. The Faustovirus core genome appears to stabilize at around 207 genes, whereas the pan-genome is described as open; its enrichment through the discovery of new Faustoviruses is required to better capture the genomic diversity of this family. PMID:26903952
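
    The abstract above compares genomes by G+C content. As a minimal sketch of how that percentage is conventionally computed from a nucleotide sequence, the Python function below counts G and C among unambiguous bases only; skipping ambiguity codes such as N is an assumption made here, and the function name and toy sequences are hypothetical rather than data from the study.

```python
def gc_content(sequence: str) -> float:
    """G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequences (hypothetical, not Faustovirus data).
toy_values = [gc_content(s) for s in ("ATGC", "GGCC", "AANNTT")]
```

    An average across genomes, like the one quoted in the abstract, would then be the mean of the per-genome values.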

  16. Ecological genomics of mutualism decline in nitrogen-fixing bacteria

    PubMed Central

    Klinger, Christie R.; Lau, Jennifer A.

    2016-01-01

    Anthropogenic changes can influence mutualism evolution; however, the genomic regions underpinning mutualism that are most affected by environmental change are generally unknown, even in well-studied model mutualisms like the interaction between legumes and their nitrogen (N)-fixing rhizobia. Such genomic information can shed light on the agents and targets of selection maintaining cooperation in nature. We recently demonstrated that N-fertilization has caused an evolutionary decline in mutualistic partner quality in the rhizobia that form symbiosis with clover. Here, population genomic analyses of N-fertilized versus control rhizobium populations indicate that evolutionary differentiation at a key symbiosis gene region on the symbiotic plasmid (pSym) contributes to partner quality decline. Moreover, patterns of genetic variation at selected loci were consistent with recent positive selection within N-fertilized environments, suggesting that N-rich environments might select for less beneficial rhizobia. By studying the molecular population genomics of a natural bacterial population within a long-term ecological field experiment, we find that: (i) the N environment is indeed a potent selective force mediating mutualism evolution in this symbiosis, (ii) natural variation in rhizobium partner quality is mediated in part by key symbiosis genes on the symbiotic plasmid, and (iii) differentiation at selected genes occurred in the context of otherwise recombining genomes, resembling eukaryotic models of adaptation. PMID:26962142

  17. Ecological genomics of mutualism decline in nitrogen-fixing bacteria.

    PubMed

    Klinger, Christie R; Lau, Jennifer A; Heath, Katy D

    2016-03-16

    Anthropogenic changes can influence mutualism evolution; however, the genomic regions underpinning mutualism that are most affected by environmental change are generally unknown, even in well-studied model mutualisms like the interaction between legumes and their nitrogen (N)-fixing rhizobia. Such genomic information can shed light on the agents and targets of selection maintaining cooperation in nature. We recently demonstrated that N-fertilization has caused an evolutionary decline in mutualistic partner quality in the rhizobia that form symbiosis with clover. Here, population genomic analyses of N-fertilized versus control rhizobium populations indicate that evolutionary differentiation at a key symbiosis gene region on the symbiotic plasmid (pSym) contributes to partner quality decline. Moreover, patterns of genetic variation at selected loci were consistent with recent positive selection within N-fertilized environments, suggesting that N-rich environments might select for less beneficial rhizobia. By studying the molecular population genomics of a natural bacterial population within a long-term ecological field experiment, we find that: (i) the N environment is indeed a potent selective force mediating mutualism evolution in this symbiosis, (ii) natural variation in rhizobium partner quality is mediated in part by key symbiosis genes on the symbiotic plasmid, and (iii) differentiation at selected genes occurred in the context of otherwise recombining genomes, resembling eukaryotic models of adaptation. © 2016 The Author(s).

  18. Comparative Genomics of Cluster O Mycobacteriophages

    PubMed Central

    Cresawn, Steven G.; Pope, Welkin H.; Jacobs-Sera, Deborah; Bowman, Charles A.; Russell, Daniel A.; Dedrick, Rebekah M.; Adair, Tamarah; Anders, Kirk R.; Ball, Sarah; Bollivar, David; Breitenberger, Caroline; Burnett, Sandra H.; Butela, Kristen; Byrnes, Deanna; Carzo, Sarah; Cornely, Kathleen A.; Cross, Trevor; Daniels, Richard L.; Dunbar, David; Findley, Ann M.; Gissendanner, Chris R.; Golebiewska, Urszula P.; Hartzog, Grant A.; Hatherill, J. Robert; Hughes, Lee E.; Jalloh, Chernoh S.; De Los Santos, Carla; Ekanem, Kevin; Khambule, Sphindile L.; King, Rodney A.; King-Smith, Christina; Klyczek, Karen; Krukonis, Greg P.; Laing, Christian; Lapin, Jonathan S.; Lopez, A. Javier; Mkhwanazi, Sipho M.; Molloy, Sally D.; Moran, Deborah; Munsamy, Vanisha; Pacey, Eddie; Plymale, Ruth; Poxleitner, Marianne; Reyna, Nathan; Schildbach, Joel F.; Stukey, Joseph; Taylor, Sarah E.; Ware, Vassie C.; Wellmann, Amanda L.; Westholm, Daniel; Wodarski, Donna; Zajko, Michelle; Zikalala, Thabiso S.; Hendrix, Roger W.; Hatfull, Graham F.

    2015-01-01

    Mycobacteriophages – viruses of mycobacterial hosts – are genetically diverse but morphologically are all classified in the Caudovirales with double-stranded DNA and tails. We describe here a group of five closely related mycobacteriophages – Corndog, Catdawg, Dylan, Firecracker, and YungJamal – designated as Cluster O with long flexible tails but with unusual prolate capsids. Proteomic analysis of phage Corndog particles, Catdawg particles, and Corndog-infected cells confirms expression of half of the predicted gene products and indicates a non-canonical mechanism for translation of the Corndog tape measure protein. Bioinformatic analysis identifies 8–9 strongly predicted SigA promoters and all five Cluster O genomes contain more than 30 copies of a 17 bp repeat sequence with dyad symmetry located throughout the genomes. Comparison of the Cluster O phages provides insights into phage genome evolution including the processes of gene flux by horizontal genetic exchange. PMID:25742016

  19. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  20. Initial sequencing and comparative analysis of the mouse genome.

    PubMed

    Waterston, Robert H; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R; Brown, Daniel G; Brown, Stephen D; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T; Church, Deanna M; Clamp, Michele; Clee, Christopher; Collins, Francis S; Cook, Lisa L; Copley, Richard R; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D; Deri, Justin; Dermitzakis, Emmanouil T; Dewey, Colin; Dickens, Nicholas J; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M; Eddy, Sean R; Elnitski, Laura; Emes, Richard D; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A; Flicek, Paul; Foley, Karen; Frankel, Wayne N; Fulton, Lucinda A; Fulton, Robert S; Furey, Terrence S; Gage, Diane; Gibbs, Richard A; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A; Green, Eric D; Gregory, Simon; Guigó, Roderic; Guyer, Mark; Hardison, Ross C; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B; Johnson, L Steven; Jones, Matthew; Jones, Thomas A; Joy, Ann; Kamal, Michael; Karlsson, Elinor K; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W James; Kirby, Andrew; Kolbe, Diana L; Korf, Ian; Kucherlapati, Raju S; Kulbokas, Edward J; Kulp, David; Landers, Tom; Leger, J P; Leonard, Steven; Letunic, Ivica; Levine, Rosie; Li, Jia; Li, Ming; Lloyd, Christine; Lucas, Susan; Ma, Bin; Maglott, 
Donna R; Mardis, Elaine R; Matthews, Lucy; Mauceli, Evan; Mayer, John H; McCarthy, Megan; McCombie, W Richard; McLaren, Stuart; McLay, Kirsten; McPherson, John D; Meldrim, Jim; Meredith, Beverley; Mesirov, Jill P; Miller, Webb; Miner, Tracie L; Mongin, Emmanuel; Montgomery, Kate T; Morgan, Michael; Mott, Richard; Mullikin, James C; Muzny, Donna M; Nash, William E; Nelson, Joanne O; Nhan, Michael N; Nicol, Robert; Ning, Zemin; Nusbaum, Chad; O'Connor, Michael J; Okazaki, Yasushi; Oliver, Karen; Overton-Larty, Emma; Pachter, Lior; Parra, Genís; Pepin, Kymberlie H; Peterson, Jane; Pevzner, Pavel; Plumb, Robert; Pohl, Craig S; Poliakov, Alex; Ponce, Tracy C; Ponting, Chris P; Potter, Simon; Quail, Michael; Reymond, Alexandre; Roe, Bruce A; Roskin, Krishna M; Rubin, Edward M; Rust, Alistair G; Santos, Ralph; Sapojnikov, Victor; Schultz, Brian; Schultz, Jörg; Schwartz, Matthias S; Schwartz, Scott; Scott, Carol; Seaman, Steven; Searle, Steve; Sharpe, Ted; Sheridan, Andrew; Shownkeen, Ratna; Sims, Sarah; Singer, Jonathan B; Slater, Guy; Smit, Arian; Smith, Douglas R; Spencer, Brian; Stabenau, Arne; Stange-Thomann, Nicole; Sugnet, Charles; Suyama, Mikita; Tesler, Glenn; Thompson, Johanna; Torrents, David; Trevaskis, Evanne; Tromp, John; Ucla, Catherine; Ureta-Vidal, Abel; Vinson, Jade P; Von Niederhausern, Andrew C; Wade, Claire M; Wall, Melanie; Weber, Ryan J; Weiss, Robert B; Wendl, Michael C; West, Anthony P; Wetterstrand, Kris; Wheeler, Raymond; Whelan, Simon; Wierzbowski, Jamey; Willey, David; Williams, Sophie; Wilson, Richard K; Winter, Eitan; Worley, Kim C; Wyman, Dudley; Yang, Shan; Yang, Shiaw-Pyng; Zdobnov, Evgeny M; Zody, Michael C; Lander, Eric S

    2002-12-05

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  1. Single-cell genomics reveal low recombination frequencies in freshwater bacteria of the SAR11 clade

    PubMed Central

    2013-01-01

    Background: The SAR11 group of Alphaproteobacteria is highly abundant in the oceans. It contains a recently diverged freshwater clade, which offers the opportunity to compare adaptations to salt- and freshwaters in a monophyletic bacterial group. However, there are no cultivated members of the freshwater SAR11 group and no genomes have been sequenced yet. Results: We isolated ten single SAR11 cells from three freshwater lakes and sequenced and assembled their genomes. A phylogeny based on 57 proteins indicates that the cells are organized into distinct microclusters. We show that the freshwater genomes have evolved primarily by the accumulation of nucleotide substitutions and that they have among the lowest ratio of recombination to mutation estimated for bacteria. In contrast, members of the marine SAR11 clade have one of the highest ratios. Additional metagenome reads from six lakes confirm low recombination frequencies for the genome overall and reveal lake-specific variations in microcluster abundances. We identify hypervariable regions with gene contents broadly similar to those in the hypervariable regions of the marine isolates, containing genes putatively coding for cell surface molecules. Conclusions: We conclude that recombination rates differ dramatically in phylogenetic sister groups of the SAR11 clade adapted to freshwater and marine ecosystems. The results suggest that the transition from marine to freshwater systems has purged diversity and resulted in reduced opportunities for recombination with divergent members of the clade. The low recombination frequencies of the LD12 clade resemble the low genetic divergence of host-restricted pathogens that have recently shifted to a new host. PMID:24286338

  2. A bioinformatic approach to understanding antibiotic resistance in intracellular bacteria through whole genome analysis.

    PubMed

    Biswas, Silpak; Raoult, Didier; Rolain, Jean-Marc

    2008-09-01

    Intracellular bacteria survive within eukaryotic host cells and are difficult to kill with certain antibiotics. As a result, antibiotic resistance in intracellular bacteria is becoming commonplace in healthcare institutions. Owing to the lack of methods available for transforming these bacteria, we evaluated the mechanisms of resistance using molecular methods and in silico genome analysis. The objective of this review was to understand the molecular mechanisms of antibiotic resistance through in silico comparisons of the genomes of obligate and facultative intracellular bacteria. The available data on in vitro mutants reported for intracellular bacteria were also reviewed. These genomic data were analysed to find natural mutations in known target genes involved in antibiotic resistance and to look for the presence or absence of different resistance determinants. Our analysis revealed the presence of tetracycline resistance protein (Tet) in Bartonella quintana, Francisella tularensis and Brucella ovis; moreover, most of the Francisella strains possessed the blaA gene, AmpG protein and metallo-beta-lactamase family protein. The presence or absence of folP (dihydropteroate synthase) and folA (dihydrofolate reductase) genes in the genome could explain natural resistance to co-trimoxazole. Finally, multiple genes encoding different efflux pumps were studied. This in silico approach was an effective method for understanding the mechanisms of antibiotic resistance in intracellular bacteria. The whole genome sequence analysis will help to predict several important phenotypic characteristics, in particular resistance to different antibiotics. In the future, stable mutants should be obtained through transformation methods in order to demonstrate experimentally the determinants of resistance in intracellular bacteria.

  3. Transferring whole genomes from bacteria to yeast spheroplasts using entire bacterial cells to reduce DNA shearing.

    PubMed

    Karas, Bogumil J; Jablanovic, Jelena; Irvine, Edward; Sun, Lijie; Ma, Li; Weyman, Philip D; Gibson, Daniel G; Glass, John I; Venter, J Craig; Hutchison, Clyde A; Smith, Hamilton O; Suzuki, Yo

    2014-04-01

    Direct cell-to-cell transfer of genomes from bacteria to yeast facilitates genome engineering for bacteria that are not amenable to genetic manipulation by allowing instead for the utilization of the powerful yeast genetic tools. Here we describe a protocol for transferring whole genomes from bacterial cells to yeast spheroplasts without any DNA purification process. The method is dependent on the treatment of the bacterial and yeast cellular mixture with PEG, which induces cell fusion, engulfment, aggregation or lysis. Over 80% of the bacterial genomes transferred in this way are complete, on the basis of structural and functional tests. Excluding the time required for preparing starting cultures and for incubating cells to form final colonies, the protocol can be completed in 3 h.

  4. Comparative genomics of actinomycetes with a focus on natural product biosynthetic genes

    PubMed Central

    2013-01-01

    Background: Actinomycetes are a diverse group of medically, industrially and ecologically important bacteria, studied as much for the diseases they cause as for the cures they hold. The genomes of actinomycetes revealed that these bacteria have a large number of natural product gene clusters, although many of these are difficult to tie to products in the laboratory. Large scale comparisons of these clusters are difficult to perform due to the presence of highly similar repeated domains in the most common biosynthetic machinery: polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs). Results: We have used comparative genomics to provide an overview of the genomic features of a set of 102 closed genomes from this important group of bacteria with a focus on natural product biosynthetic genes. We have focused on well-represented genera and determined the occurrence of gene cluster families therein. Conservation of natural product gene clusters within Mycobacterium, Streptomyces and Frankia suggests crucial roles for natural products in the biology of each genus. The abundance of natural product classes is also found to vary greatly between genera, revealing underlying patterns that are not yet understood. Conclusions: A large-scale analysis of natural product gene clusters presents a useful foundation for hypothesis formulation that is currently underutilized in the field. Such studies will be increasingly necessary to study the diversity and ecology of natural products as the number of genome sequences available continues to grow. PMID:24020438

  5. Comparative genomics of actinomycetes with a focus on natural product biosynthetic genes.

    PubMed

    Doroghazi, James R; Metcalf, William W

    2013-09-11

    Actinomycetes are a diverse group of medically, industrially and ecologically important bacteria, studied as much for the diseases they cause as for the cures they hold. The genomes of actinomycetes revealed that these bacteria have a large number of natural product gene clusters, although many of these are difficult to tie to products in the laboratory. Large scale comparisons of these clusters are difficult to perform due to the presence of highly similar repeated domains in the most common biosynthetic machinery: polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs). We have used comparative genomics to provide an overview of the genomic features of a set of 102 closed genomes from this important group of bacteria with a focus on natural product biosynthetic genes. We have focused on well-represented genera and determined the occurrence of gene cluster families therein. Conservation of natural product gene clusters within Mycobacterium, Streptomyces and Frankia suggests crucial roles for natural products in the biology of each genus. The abundance of natural product classes is also found to vary greatly between genera, revealing underlying patterns that are not yet understood. A large-scale analysis of natural product gene clusters presents a useful foundation for hypothesis formulation that is currently underutilized in the field. Such studies will be increasingly necessary to study the diversity and ecology of natural products as the number of genome sequences available continues to grow.

  6. Detection of chromosomal abnormalities by comparative genomic hybridization.

    PubMed

    Lapierre, Jean-Michel; Tachdjian, Gérard

    2005-04-01

    Comparative genomic hybridization (CGH) is a modified in-situ hybridization technique. In this type of analysis, two differentially labeled genomic DNAs (study and reference) are cohybridized to normal metaphase spreads or to microarray. Chromosomal locations of copy number changes in the DNA segments of the study genome are revealed by a variable fluorescence intensity ratio along each target chromosome. Thus, CGH allows detection and mapping of DNA sequence copy differences between two genomes in a single experiment. Since its development, comparative genomic hybridization has been applied mostly as a research tool in the field of cancer cytogenetics to identify genetic changes in many previously unknown regions. It is also a powerful tool for detection and identification of unbalanced chromosomal abnormalities in prenatal, postnatal and preimplantation diagnostics. The development of comparative genomic hybridization and increase in resolution analysis by using the microarray-based technique offer new information on chromosomal pathologies and thus better management of patients.
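
    The ratio-based readout described in this abstract can be illustrated with a minimal Python sketch. The intensity values, the log2 transformation, and the 0.3 cutoff are assumptions chosen for illustration only; none of them comes from the abstract, and real CGH analysis involves normalization steps that are omitted here.

```python
import math

def call_copy_number(study_intensity, reference_intensity, threshold=0.3):
    """Classify one locus from study/reference fluorescence intensities.

    threshold is a hypothetical cutoff on the log2 ratio (an assumption,
    not a value reported in the abstract).
    """
    log_ratio = math.log2(study_intensity / reference_intensity)
    if log_ratio > threshold:
        return "gain"
    if log_ratio < -threshold:
        return "loss"
    return "balanced"

# Hypothetical (study, reference) intensities at three loci along one chromosome
loci = {
    "locus_a": (200.0, 100.0),
    "locus_b": (100.0, 100.0),
    "locus_c": (50.0, 100.0),
}
calls = {name: call_copy_number(s, r) for name, (s, r) in loci.items()}
```

    A study-to-reference ratio above 1 points to a copy number gain in the study genome, and a ratio below 1 points to a loss.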

  7. Analysis of the allohexaploid bread wheat genome (Triticum aestivum) using comparative whole genome shotgun sequencing

    USDA-ARS's Scientific Manuscript database

    The large 17 Gb allopolyploid genome of bread wheat is a major challenge for genome analysis because it is composed of three closely related and independently maintained genomes, with genes dispersed as small “islands” separated by vast tracts of repetitive DNA. We used a novel comparative genomi...

  8. The case of horizontal gene transfer from bacteria to the peculiar dinoflagellate plastid genome

    PubMed Central

    Mackiewicz, Paweł; Bodył, Andrzej; Moszczyński, Krzysztof

    2013-01-01

    Organelle genomes lose their genes by transfer to host nuclear genomes, but only occasionally are enriched by foreign genes from other sources. In contrast to mitochondria, plastid genomes are especially resistant to such horizontal gene transfer (HGT), and thus every gene acquired in this way is notable. An exceptional case of HGT was recently recognized in the peculiar peridinin plastid genome of dinoflagellates, which is organized in plasmid-like minicircles. Genomic and phylogenetic analyses of Ceratium horridum and Pyrocystis lunula minicircles revealed four genes and one unannotated open reading frame that probably were gained from bacteria belonging to the Bacteroidetes. Such bacteria seem to be a good source of genes because close endosymbiotic associations between them and dinoflagellates have been observed. The HGT-acquired genes are involved in plastid functions characteristic of other photosynthetic eukaryotes, and their arrangement resembles bacterial operons. These studies indicate that the peridinin plastid genome, usually regarded as having resulted from reduction and fragmentation of a typical plastid genome derived from red algae, may have a chimeric origin that includes bacterial contributions. Potential contamination of the Ceratium and Pyrocystis plastid genomes by bacterial sequences and the controversial localization of their minicircles in the nucleus are also discussed. PMID:24195014

  9. The case of horizontal gene transfer from bacteria to the peculiar dinoflagellate plastid genome.

    PubMed

    Mackiewicz, Paweł; Bodył, Andrzej; Moszczyński, Krzysztof

    2013-07-01

    Organelle genomes lose their genes by transfer to host nuclear genomes, but only occasionally are enriched by foreign genes from other sources. In contrast to mitochondria, plastid genomes are especially resistant to such horizontal gene transfer (HGT), and thus every gene acquired in this way is notable. An exceptional case of HGT was recently recognized in the peculiar peridinin plastid genome of dinoflagellates, which is organized in plasmid-like minicircles. Genomic and phylogenetic analyses of Ceratium horridum and Pyrocystis lunula minicircles revealed four genes and one unannotated open reading frame that probably were gained from bacteria belonging to the Bacteroidetes. Such bacteria seem to be a good source of genes because close endosymbiotic associations between them and dinoflagellates have been observed. The HGT-acquired genes are involved in plastid functions characteristic of other photosynthetic eukaryotes, and their arrangement resembles bacterial operons. These studies indicate that the peridinin plastid genome, usually regarded as having resulted from reduction and fragmentation of a typical plastid genome derived from red algae, may have a chimeric origin that includes bacterial contributions. Potential contamination of the Ceratium and Pyrocystis plastid genomes by bacterial sequences and the controversial localization of their minicircles in the nucleus are also discussed.

  10. Human-mouse comparative genomics: successes and failures to reveal functional regions of the human genome

    SciTech Connect

    Pennacchio, Len A.; Baroukh, Nadine; Rubin, Edward M.

    2003-05-15

    Deciphering the genetic code embedded within the human genome remains a significant challenge despite the human genome consortium's recent success at defining its linear sequence (Lander et al. 2001; Venter et al. 2001). While useful strategies exist to identify a large percentage of protein-encoding regions, efforts to accurately define functional sequences in the remaining approximately 97 percent of the genome lag. Our primary interest has been to utilize the evolutionary relationship and the universal nature of genomic sequence information in vertebrates to reveal functional elements in the human genome. This has been achieved through the combined use of vertebrate comparative genomics to pinpoint highly conserved sequences as candidates for biological activity and transgenic mouse studies to address the functionality of defined human DNA fragments. Accordingly, we describe strategies and insights into functional sequences in the human genome through the use of comparative genomics coupled with functional studies in the mouse.

  11. Homologous recombination in Agrobacterium: potential implications for the genomic species concept in bacteria.

    PubMed

    Costechareyre, Denis; Bertolla, Franck; Nesme, Xavier

    2009-01-01

    According to current taxonomical rules, a bona fide bacterial species is a genomic species characterized by the genomic similarity of its members. It has been proposed that the genomic cohesion of such clusters may be related to sexual isolation, which limits gene flow between too divergent bacteria. Homologous recombination is one of the most studied mechanisms responsible for this genetic isolation. Previous studies on several bacterial models showed that recombination frequencies decreased exponentially with increasing DNA sequence divergence. In the present study, we investigated this relationship in the Agrobacterium tumefaciens species complex, which allowed us to focus on sequence divergence in the vicinity of the genetic boundaries of genomic species. We observed that the sensitivity of the recombination frequency to DNA divergence fitted a log-linear function until approximately 10% sequence divergence. The results clearly revealed that there was no sharp drop in recombination frequencies at the point where the sequence divergence distribution showed a "gap" delineating genomic species. The ratio of the recombination frequency in homogamic conditions relative to this frequency in heterogamic conditions, that is, sexual isolation, was found to increase from 8 between the most distant strains within a species to 9 between the most closely related species, for respective increases from 4.3% to 6.4% mismatches in the marker gene chvA. This means that there was only a 1.13-fold decrease in recombination frequencies for recombination events at both edges of the species border. Hence, from the findings of this investigation, we conclude that, at least in this taxon, sexual isolation based on homologous recombination is likely not high enough to strongly hamper gene flow between species as compared with gene flow between distantly related members of the same species. The 70% relative binding ratio cutoff used to define bacterial species is likely correlated to
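
    The arithmetic behind the 1.13-fold figure in this abstract can be checked with a short Python sketch. The sexual isolation values 8 and 9 are the ones reported in the abstract; the function name and the example recombination frequencies are hypothetical and serve only to show how the ratio is defined.

```python
def sexual_isolation(homogamic_freq, heterogamic_freq):
    """Ratio of recombination frequency in homogamic vs heterogamic crosses."""
    return homogamic_freq / heterogamic_freq

# Hypothetical frequencies giving a sexual isolation of about 8
example = sexual_isolation(1.0e-3, 1.25e-4)

# Sexual isolation values reported in the abstract
within_species = 8.0   # most distant strains within a species
between_species = 9.0  # most closely related species

# Fold decrease in recombination frequency across the species border
fold_decrease = between_species / within_species  # 9 / 8 = 1.125
```

    Rounded to two decimals, 1.125 gives the 1.13-fold decrease quoted in the abstract.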

  12. AcCNET (Accessory Genome Constellation Network): comparative genomics software for accessory genome analysis using bipartite networks.

    PubMed

    Lanza, Val F; Baquero, Fernando; de la Cruz, Fernando; Coque, Teresa M

    2017-01-15

    AcCNET (Accessory genome Constellation Network) is a Perl application that aims to compare accessory genomes of a large number of genomic units, both at qualitative and quantitative levels. Using the proteomes extracted from the analysed genomes, AcCNET creates a bipartite network compatible with standard network analysis platforms. AcCNET allows merging phylogenetic and functional information about the concerned genomes, thus improving the capability of current methods of network analysis. The AcCNET bipartite network opens a new perspective to explore the pangenome of bacterial species, focusing on the accessory genome behind the idiosyncrasy of a particular strain and/or population.
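
    The bipartite-network idea in this abstract can be sketched in a few lines of Python. This is not AcCNET's implementation (AcCNET is a Perl application, and the way it clusters proteins is not described in the abstract). The sketch assumes that each genomic unit has already been reduced to a set of protein cluster identifiers; all strain and cluster names are hypothetical.

```python
from collections import defaultdict

# Hypothetical input: genomic unit -> set of protein cluster identifiers
proteomes = {
    "strain_A": {"core1", "core2", "acc1"},
    "strain_B": {"core1", "core2", "acc1", "acc2"},
    "strain_C": {"core1", "core2", "acc3"},
}

def accessory_bipartite_edges(proteomes):
    """Return (genome, cluster) edges for clusters missing from at least one genome."""
    # Clusters present in every genome form the core and are excluded
    core = set.intersection(*proteomes.values())
    edges = []
    for genome, clusters in sorted(proteomes.items()):
        for cluster in sorted(clusters - core):
            edges.append((genome, cluster))
    return edges

edges = accessory_bipartite_edges(proteomes)

# Degree of each cluster node: how many genomic units carry that cluster
cluster_degree = defaultdict(int)
for _, cluster in edges:
    cluster_degree[cluster] += 1
```

    The edge list has two node types, genomic units and protein clusters, and could be written out as a two-column table for import into a network analysis tool.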

  13. Comparative genomics of trypanosomatid parasitic protozoa.

    PubMed

    El-Sayed, Najib M; Myler, Peter J; Blandin, Gaëlle; Berriman, Matthew; Crabtree, Jonathan; Aggarwal, Gautam; Caler, Elisabet; Renauld, Hubert; Worthey, Elizabeth A; Hertz-Fowler, Christiane; Ghedin, Elodie; Peacock, Christopher; Bartholomeu, Daniella C; Haas, Brian J; Tran, Anh-Nhi; Wortman, Jennifer R; Alsmark, U Cecilia M; Angiuoli, Samuel; Anupama, Atashi; Badger, Jonathan; Bringaud, Frederic; Cadag, Eithon; Carlton, Jane M; Cerqueira, Gustavo C; Creasy, Todd; Delcher, Arthur L; Djikeng, Appolinaire; Embley, T Martin; Hauser, Christopher; Ivens, Alasdair C; Kummerfeld, Sarah K; Pereira-Leal, Jose B; Nilsson, Daniel; Peterson, Jeremy; Salzberg, Steven L; Shallom, Joshua; Silva, Joana C; Sundaram, Jaideep; Westenberger, Scott; White, Owen; Melville, Sara E; Donelson, John E; Andersson, Björn; Stuart, Kenneth D; Hall, Neil

    2005-07-15

    A comparison of gene content and genome architecture of Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major, three related pathogens with different life cycles and disease pathology, revealed a conserved core proteome of about 6200 genes in large syntenic polycistronic gene clusters. Many species-specific genes, especially large surface antigen families, occur at nonsyntenic chromosome-internal and subtelomeric regions. Retroelements, structural RNAs, and gene family expansion are often associated with syntenic discontinuities that, along with gene divergence, acquisition and loss, and rearrangement within the syntenic regions, have shaped the genomes of each parasite. Contrary to recent reports, our analyses reveal no evidence that these species are descended from an ancestor that contained a photosynthetic endosymbiont.

  14. Comparative genomics of Enterococcus spp. isolated from bovine feces.

    PubMed

    Beukers, Alicia G; Zaheer, Rahat; Goji, Noriko; Amoako, Kingsley K; Chaves, Alexandre V; Ward, Michael P; McAllister, Tim A

    2017-03-08

    Enterococcus is ubiquitous in nature and is a commensal of both the bovine and human gastrointestinal (GI) tract. It is also associated with clinical infections in humans. Subtherapeutic administration of antibiotics to cattle selects for antibiotic-resistant enterococci in the bovine GI tract. Antibiotic resistance genes (ARGs) may be present in enterococci following antibiotic use in cattle. If located on mobile genetic elements (MGEs), their dissemination between Enterococcus species and to pathogenic bacteria may be promoted, reducing the efficacy of antibiotics. We present a comparative genomic analysis of twenty-one Enterococcus spp. isolated from bovine feces including Enterococcus hirae (n = 10), Enterococcus faecium (n = 3), Enterococcus villorum (n = 2), Enterococcus casseliflavus (n = 2), Enterococcus faecalis (n = 1), Enterococcus durans (n = 1), Enterococcus gallinarum (n = 1) and Enterococcus thailandicus (n = 1). The analysis revealed E. faecium and E. faecalis from bovine feces share features with human clinical isolates, including virulence factors. The Tn917 transposon conferring macrolide-lincosamide-streptogramin B resistance was identified in both E. faecium and E. hirae, suggesting dissemination of ARGs on MGEs may occur in the bovine GI tract. An E. faecium isolate was also identified with two integrative conjugative elements (ICEs) belonging to the Tn916 family of ICE, Tn916 and Tn5801, both conferring tetracycline resistance. This study confirms the presence of enterococci in the bovine GI tract possessing ARGs on MGEs, but the predominant species in cattle, E. hirae, is not commonly associated with infections in humans. Analysis using additional complete genomes of E. faecium from the NCBI database demonstrated differential clustering of commensal and clinical isolates, suggesting that these strains may be specifically adapted to their respective environments.

  15. GenColors-based comparative genome databases for small eukaryotic genomes.

    PubMed

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.

  16. GenColors-based comparative genome databases for small eukaryotic genomes

    PubMed Central

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources. PMID:23193285

  17. The chimeric nature of the genomes of marine magnetotactic coccoid-ovoid bacteria defines a novel group of Proteobacteria.

    PubMed

    Ji, Boyang; Zhang, Sheng-Da; Zhang, Wei-Jia; Rouy, Zoe; Alberto, François; Santini, Claire-Lise; Mangenot, Sophie; Gagnot, Séverine; Philippe, Nadège; Pradel, Nathalie; Zhang, Lichen; Tempel, Sébastien; Li, Ying; Médigue, Claudine; Henrissat, Bernard; Coutinho, Pedro M; Barbe, Valérie; Talla, Emmanuel; Wu, Long-Fei

    2017-03-01

    Magnetotactic bacteria (MTB) are a group of phylogenetically and physiologically diverse Gram-negative bacteria that synthesize intracellular magnetic crystals named magnetosomes. MTB are affiliated with three classes of Proteobacteria phylum, Nitrospirae phylum, Omnitrophica phylum and probably with the candidate phylum Latescibacteria. The evolutionary origin and physiological diversity of MTB compared with other bacterial taxonomic groups remain to be illustrated. Here, we analysed the genome of the marine magneto-ovoid strain MO-1 and found that it is closely related to Magnetococcus marinus MC-1. Detailed analyses of the ribosomal proteins and whole proteomes of 390 genomes reveal that, among the Proteobacteria analysed, only MO-1 and MC-1 have coding sequences (CDSs) with a similarly high proportion of origins from Alphaproteobacteria, Betaproteobacteria, Deltaproteobacteria and Gammaproteobacteria. Interestingly, a comparative metabolic network analysis with anoxic network enzymes from sequenced MTB and non-MTB successfully allows the eventual prediction of an organism with a metabolic profile compatible for magnetosome production. Altogether, our genomic analysis reveals multiple origins of MO-1 and M. marinus MC-1 genomes and suggests a metabolism-restriction model for explaining whether a bacterium could become an MTB upon acquisition of magnetosome encoding genes. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.

  18. Comparative Genomics of 12 Strains of Erwinia amylovora Identifies a Pan-Genome with a Large Conserved Core

    PubMed Central

    Mann, Rachel A.; Smits, Theo H. M.; Bühlmann, Andreas; Blom, Jochen; Goesmann, Alexander; Frey, Jürg E.; Plummer, Kim M.; Beer, Steven V.; Luck, Joanne; Duffy, Brion; Rodoni, Brendan

    2013-01-01

    The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus) and strains infecting Rubus (raspberries and blackberries). Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin) of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains), the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1Ea and a putative secondary metabolite pathway only present in Rubus-infecting strains. PMID:23409014

  19. Comparative genomics of 12 strains of Erwinia amylovora identifies a pan-genome with a large conserved core.

    PubMed

    Mann, Rachel A; Smits, Theo H M; Bühlmann, Andreas; Blom, Jochen; Goesmann, Alexander; Frey, Jürg E; Plummer, Kim M; Beer, Steven V; Luck, Joanne; Duffy, Brion; Rodoni, Brendan

    2013-01-01

    The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus) and strains infecting Rubus (raspberries and blackberries). Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin) of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains), the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1(Ea) and a putative secondary metabolite pathway only present in Rubus-infecting strains.

  20. Comparative genetics and genomics of nematodes: genome structure, development, and lifestyle.

    PubMed

    Sommer, Ralf J; Streit, Adrian

    2011-01-01

    Nematodes are found in virtually all habitats on earth. Many of them are parasites of plants and animals, including humans. The free-living nematode, Caenorhabditis elegans, is one of the genetically best-studied model organisms and was the first metazoan whose genome was fully sequenced. In recent years, the draft genome sequences of another six nematodes representing four of the five major clades of nematodes were published. Compared to mammalian genomes, all these genomes are very small. Nevertheless, they contain almost the same number of genes as the human genome. Nematodes are therefore a very attractive system for comparative genetic and genomic studies, with C. elegans as an excellent baseline. Here, we review the efforts that were made to extend genetic analysis to nematodes other than C. elegans, and we compare the seven available nematode genomes. One of the most striking findings is the unexpectedly high incidence of gene acquisition through horizontal gene transfer (HGT).

  1. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries.

    PubMed

    Binnewies, Tim T; Motro, Yair; Hallin, Peter F; Lund, Ole; Dunn, David; La, Tom; Hampson, David J; Bellgard, Matthew; Wassenaar, Trudy M; Ussery, David W

    2006-07-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: "What have we learned from this vast amount of new genomic data?" Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity, even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism is important for interpretation of the genomic data. The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this

  2. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

    DOE PAGES

    Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; ...

    2017-08-08

    Here, we present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

  3. Computational Methods for the Analysis of Array Comparative Genomic Hybridization

    PubMed Central

    Chari, Raj; Lockwood, William W.; Lam, Wan L.

    2006-01-01

    Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has led to a rapid accumulation of high-throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development. PMID:17992253

  4. Inference of homologous recombination in bacteria using whole-genome sequences.

    PubMed

    Didelot, Xavier; Lawson, Daniel; Darling, Aaron; Falush, Daniel

    2010-12-01

    Bacteria and archaea reproduce clonally, but sporadically import DNA into their chromosomes from other organisms. In many of these events, the imported DNA replaces an homologous segment in the recipient genome. Here we present a new method to reconstruct the history of recombination events that affected a given sample of bacterial genomes. We introduce a mathematical model that represents both the donor and the recipient of each DNA import as an ancestor of the genomes in the sample. The model represents a simplification of the previously described coalescent with gene conversion. We implement a Monte Carlo Markov chain algorithm to perform inference under this model from sequence data alignments and show that inference is feasible for whole-genome alignments through parallelization. Using simulated data, we demonstrate accurate and reliable identification of individual recombination events and global recombination rate parameters. We applied our approach to an alignment of 13 whole genomes from the Bacillus cereus group. We find, as expected from laboratory experiments, that the recombination rate is higher between closely related organisms and also that the genome contains several broad regions of elevated levels of recombination. Application of the method to the genomic data sets that are becoming available should reveal the evolutionary history and private lives of populations of bacteria and archaea. The methods described in this article have been implemented in a computer software package, ClonalOrigin, which is freely available from http://code.google.com/p/clonalorigin/.

  5. Chance and necessity in the genome evolution of endosymbiotic bacteria of insects.

    PubMed

    Sabater-Muñoz, Beatriz; Toft, Christina; Alvarez-Ponce, David; Fares, Mario A

    2017-06-01

    An open question in evolutionary biology is how does the selection-drift balance determine the fates of biological interactions. We searched for signatures of selection and drift in genomes of five endosymbiotic bacterial groups known to evolve under strong genetic drift. Although most genes in endosymbiotic bacteria showed evidence of relaxed purifying selection, many genes in these bacteria exhibited stronger selective constraints than their orthologs in free-living bacterial relatives. Remarkably, most of these highly constrained genes had no role in the host-symbiont interactions but were involved in either buffering the deleterious consequences of drift or other host-unrelated functions, suggesting that they have either acquired new roles or their role became more central in endosymbiotic bacteria. Experimental evolution of Escherichia coli under strong genetic drift revealed remarkable similarities in the mutational spectrum, genome reduction patterns and gene losses to endosymbiotic bacteria of insects. Interestingly, the transcriptome of the experimentally evolved lines showed a generalized deregulation of the genome that affected genes encoding proteins involved in mutational buffering, regulation and amino acid biosynthesis, patterns identical to those found in endosymbiotic bacteria. Our results indicate that drift has shaped endosymbiotic associations through a change in the functional landscape of bacterial genes and that the host had only a small role in such a shift.

  6. Comparative Genomics of an Emerging Amphibian Virus

    PubMed Central

    Epstein, Brendan; Storfer, Andrew

    2015-01-01

    Ranaviruses, a genus of the Iridoviridae, are large double-stranded DNA viruses that infect cold-blooded vertebrates worldwide. Ranaviruses have caused severe epizootics in commercial frog and fish populations, and are currently classified as notifiable pathogens in international trade. Previous work shows that a ranavirus that infects tiger salamanders throughout Western North America (Ambystoma tigrinum virus, or ATV) is in high prevalence among salamanders in the fishing bait trade. Bait ATV strains have elevated virulence and are transported long distances by humans, providing widespread opportunities for pathogen pollution. We sequenced the genomes of 15 strains of ATV collected from tiger salamanders across western North America and performed phylogenetic and population genomic analyses and tests for recombination. We find that ATV forms a monophyletic clade within the rest of the Ranaviruses and that it likely emerged within the last several thousand years, before human activities influenced its spread. We also identify several genes under strong positive selection, some of which appear to be involved in viral virulence and/or host immune evasion. In addition, we provide support for the pathogen pollution hypothesis with evidence of recombination among ATV strains, and potential bait-endemic strain recombination. PMID:26530419

  7. Comparative Genomics of an Emerging Amphibian Virus.

    PubMed

    Epstein, Brendan; Storfer, Andrew

    2015-11-03

    Ranaviruses, a genus of the Iridoviridae, are large double-stranded DNA viruses that infect cold-blooded vertebrates worldwide. Ranaviruses have caused severe epizootics in commercial frog and fish populations, and are currently classified as notifiable pathogens in international trade. Previous work shows that a ranavirus that infects tiger salamanders throughout Western North America (Ambystoma tigrinum virus, or ATV) is in high prevalence among salamanders in the fishing bait trade. Bait ATV strains have elevated virulence and are transported long distances by humans, providing widespread opportunities for pathogen pollution. We sequenced the genomes of 15 strains of ATV collected from tiger salamanders across western North America and performed phylogenetic and population genomic analyses and tests for recombination. We find that ATV forms a monophyletic clade within the rest of the Ranaviruses and that it likely emerged within the last several thousand years, before human activities influenced its spread. We also identify several genes under strong positive selection, some of which appear to be involved in viral virulence and/or host immune evasion. In addition, we provide support for the pathogen pollution hypothesis with evidence of recombination among ATV strains, and potential bait-endemic strain recombination.

  8. Comparative genome analysis of Spiroplasma melliferum IPMB4A, a honeybee-associated bacterium.

    PubMed

    Lo, Wen-Sui; Chen, Ling-Ling; Chung, Wan-Chia; Gasparich, Gail E; Kuo, Chih-Horng

    2013-01-16

    The genus Spiroplasma contains a group of helical, motile, and wall-less bacteria in the class Mollicutes. Similar to other members of this class, such as the animal-pathogenic Mycoplasma and the plant-pathogenic 'Candidatus Phytoplasma', all characterized Spiroplasma species were found to be associated with eukaryotic hosts. While most of the Spiroplasma species appeared to be harmless commensals of insects, a small number of species have evolved pathogenicity toward various arthropods and plants. In this study, we isolated a novel strain of honeybee-associated S. melliferum and investigated its genetic composition and evolutionary history by whole-genome shotgun sequencing and comparative analysis with other Mollicutes genomes. The whole-genome shotgun sequencing of S. melliferum IPMB4A produced a draft assembly that was ~1.1 Mb in size and covered ~80% of the chromosome. Similar to other Spiroplasma genomes that have been studied to date, we found that this genome contains abundant repetitive sequences that originated from plectrovirus insertions. These phage fragments represented a major obstacle in obtaining a complete genome sequence of Spiroplasma with the current sequencing technology. Comparative analysis of S. melliferum IPMB4A with other Spiroplasma genomes revealed that these phages may have facilitated extensive genome rearrangements in these bacteria and contributed to horizontal gene transfers that led to species-specific adaptation to different eukaryotic hosts. In addition, comparison of gene content with other Mollicutes suggested that the common ancestor of the SEM (Spiroplasma, Entomoplasma, and Mycoplasma) clade may have had a relatively large genome and flexible metabolic capacity; the extremely reduced genomes of present day Mycoplasma and 'Candidatus Phytoplasma' species are likely to be the result of independent gene losses in these lineages. The findings in this study highlighted the significance of phage insertions and horizontal gene

  9. Comparative genome analysis of Spiroplasma melliferum IPMB4A, a honeybee-associated bacterium

    PubMed Central

    2013-01-01

    Background: The genus Spiroplasma contains a group of helical, motile, and wall-less bacteria in the class Mollicutes. Similar to other members of this class, such as the animal-pathogenic Mycoplasma and the plant-pathogenic ‘Candidatus Phytoplasma’, all characterized Spiroplasma species were found to be associated with eukaryotic hosts. While most of the Spiroplasma species appeared to be harmless commensals of insects, a small number of species have evolved pathogenicity toward various arthropods and plants. In this study, we isolated a novel strain of honeybee-associated S. melliferum and investigated its genetic composition and evolutionary history by whole-genome shotgun sequencing and comparative analysis with other Mollicutes genomes. Results: The whole-genome shotgun sequencing of S. melliferum IPMB4A produced a draft assembly that was ~1.1 Mb in size and covered ~80% of the chromosome. Similar to other Spiroplasma genomes that have been studied to date, we found that this genome contains abundant repetitive sequences that originated from plectrovirus insertions. These phage fragments represented a major obstacle in obtaining a complete genome sequence of Spiroplasma with the current sequencing technology. Comparative analysis of S. melliferum IPMB4A with other Spiroplasma genomes revealed that these phages may have facilitated extensive genome rearrangements in these bacteria and contributed to horizontal gene transfers that led to species-specific adaptation to different eukaryotic hosts. In addition, comparison of gene content with other Mollicutes suggested that the common ancestor of the SEM (Spiroplasma, Entomoplasma, and Mycoplasma) clade may have had a relatively large genome and flexible metabolic capacity; the extremely reduced genomes of present day Mycoplasma and ‘Candidatus Phytoplasma’ species are likely to be the result of independent gene losses in these lineages. Conclusions: The findings in this study highlighted the significance of

  10. Comparative Genomics of Bifidobacterium animalis subsp. lactis Reveals a Strict Monophyletic Bifidobacterial Taxon

    PubMed Central

    Milani, Christian; Duranti, Sabrina; Lugli, Gabriele Andrea; Bottacini, Francesca; Strati, Francesco; Arioli, Stefania; Foroni, Elena; Turroni, Francesca; van Sinderen, Douwe

    2013-01-01

    Strains of Bifidobacterium animalis subsp. lactis are extensively exploited by the food industry as health-promoting bacteria, although the genetic variability of members belonging to this taxon has so far not received much scientific attention. In this article, we describe the complete genetic makeup of the B. animalis subsp. lactis Bl12 genome and discuss the genetic relatedness of this strain with other sequenced strains belonging to this taxon. Moreover, a detailed comparative genomic analysis of B. animalis subsp. lactis genomes was performed, which revealed a closely related and isogenic nature of all currently available B. animalis subsp. lactis strains, thus strongly suggesting a closed pan-genome structure of this bacterial group. PMID:23645200

  11. Population genomics of early events in the ecological differentiation of bacteria

    SciTech Connect

    Shapiro, Jesse B.; Friedman, Jonatan; Cordero, Otto X.; Preheim, Sarah P.; Timberlake, Sonia C.; Szabo, Gitta; Polz, Martin F.; Alm, Eric J.

    2012-04-06

    Genetic exchange is common among bacteria, but its effect on population diversity during ecological differentiation remains controversial. A fundamental question is whether advantageous mutations lead to selection of clonal genomes or, as in sexual eukaryotes, sweep through populations on their own. Here, we show that in two recently diverged populations of ocean bacteria, ecological differentiation has occurred akin to a sexual mechanism: A few genome regions have swept through subpopulations in a habitat-specific manner, accompanied by gradual separation of gene pools as evidenced by increased habitat specificity of the most recent recombinations. These findings reconcile previous, seemingly contradictory empirical observations of the genetic structure of bacterial populations and point to a more unified process of differentiation in bacteria and sexual eukaryotes than previously thought.

  12. Complete Genome Sequence and Comparative Genomics of a Novel Myxobacterium Myxococcus hansupus.

    PubMed

    Sharma, Gaurav; Narwani, Tarun; Subramanian, Srikrishna

    2016-01-01

    Myxobacteria, a group of Gram-negative aerobes, belong to the class δ-proteobacteria and order Myxococcales. Unlike anaerobic δ-proteobacteria, they exhibit several unusual physiogenomic properties like gliding motility, desiccation-resistant myxospores and large genomes with high coding density. Here we report a 9.5 Mbp complete genome of Myxococcus hansupus that encodes 7,753 proteins. Phylogenomic and genome-genome distance based analysis suggest that Myxococcus hansupus is a novel member of the genus Myxococcus. Comparative genome analysis with other members of the genus Myxococcus was performed to explore their genome diversity. The variation in number of unique proteins observed across different species is suggestive of diversity at the genus level while the overrepresentation of several Pfam families indicates the extent and mode of genome expansion as compared to non-Myxococcales δ-proteobacteria.

  13. Complete Genome Sequence and Comparative Genomics of a Novel Myxobacterium Myxococcus hansupus

    PubMed Central

    Sharma, Gaurav; Narwani, Tarun; Subramanian, Srikrishna

    2016-01-01

    Myxobacteria, a group of Gram-negative aerobes, belong to the class δ-proteobacteria and order Myxococcales. Unlike anaerobic δ-proteobacteria, they exhibit several unusual physiogenomic properties like gliding motility, desiccation-resistant myxospores and large genomes with high coding density. Here we report a 9.5 Mbp complete genome of Myxococcus hansupus that encodes 7,753 proteins. Phylogenomic and genome-genome distance based analysis suggest that Myxococcus hansupus is a novel member of the genus Myxococcus. Comparative genome analysis with other members of the genus Myxococcus was performed to explore their genome diversity. The variation in number of unique proteins observed across different species is suggestive of diversity at the genus level while the overrepresentation of several Pfam families indicates the extent and mode of genome expansion as compared to non-Myxococcales δ-proteobacteria. PMID:26900859

  14. Natural Product Biosynthetic Diversity and Comparative Genomics of the Cyanobacteria.

    PubMed

    Dittmann, Elke; Gugger, Muriel; Sivonen, Kaarina; Fewer, David P

    2015-10-01

    Cyanobacteria are an ancient lineage of slow-growing photosynthetic bacteria and a prolific source of natural products with intricate chemical structures and potent biological activities. The bulk of these natural products are known from just a handful of genera. Recent efforts have elucidated the mechanisms underpinning the biosynthesis of a diverse array of natural products from cyanobacteria. Many of the biosynthetic mechanisms are unique to cyanobacteria or rarely described from other organisms. Advances in genome sequence technology have precipitated a deluge of genome sequences for cyanobacteria. This makes it possible to link known natural products to biosynthetic gene clusters but also accelerates the discovery of new natural products through genome mining. These studies demonstrate that cyanobacteria encode a huge variety of cryptic gene clusters for the production of natural products, and the known chemical diversity is likely to be just a fraction of the true biosynthetic capabilities of this fascinating and ancient group of organisms. Copyright © 2015. Published by Elsevier Ltd.

  15. A phylogenetic foundation for comparative mammalian genomics.

    PubMed

    Waddell, P J; Kishino, H; Ota, R

    2001-01-01

    A major effort is being undertaken to sequence an array of mammalian genomes. Coincidentally, the evolutionary relationships of the 18 presently recognized orders of placental mammals are only just being resolved. In this work we construct and analyse the largest alignments of amino acid sequence data to date. Our findings allow us to set up a series of superordinal groups (clades) to act as prior hypotheses for further testing. Important findings include strong evidence for a clade of Euarchonta+Glires (=Supraprimates) comprised of primates, flying lemurs, tree shrews, lagomorphs and rodents. In addition, there is good evidence for a clade of all placental mammals except Xenarthra and Afrotheria (=Boreotheria) and for the previously recognised clades Laurasiatheria, Scrotifera, Fereuungulata, Ferae, Afrotheria, Euarchonta, Glires, and Eulipotyphla. Accordingly, a revised classification of the placental mammals is put forward. Using this and molecular divergence-time methods, the ages of the superordinal splits are estimated. While results are strongly consistent with the earliest superordinal divergences all being >65 mybp (Cretaceous period), they suffer from greater uncertainty than presently appreciated. The early primate split of tarsiers from the anthropoid lineage at ~55 mybp is seen to be an especially informative fossil calibration point. A statistical framework for testing clades using SINE data is presented and reveals significant support for the tarsier/anthropoid clade, as well as the clades Cetruminantia and Whippomorpha. Results also underline our thesis that while sequence analysis can help set up hypothesised clades, SINEs obtainable from sequencing 1-2 MB regions of placental genomes are essential to testing them. In contrast, derivations suggest that empirical Bayesian methods for sequence data may not be robust estimators of clades. Our findings, including the study of genes such as TP53, make a good case for the tree shrew as a closer relative

  16. Comparative genomics of autism and schizophrenia

    PubMed Central

    Crespi, Bernard; Stead, Philip; Elliot, Michael

    2010-01-01

    We used data from studies of copy-number variants (CNVs), single-gene associations, growth-signaling pathways, and intermediate phenotypes associated with brain growth to evaluate four alternative hypotheses for the genomic and developmental relationships between autism and schizophrenia: (i) autism subsumed in schizophrenia, (ii) independence, (iii) diametric, and (iv) partial overlap. Data from CNVs provides statistical support for the hypothesis that autism and schizophrenia are associated with reciprocal variants, such that at four loci, deletions predispose to one disorder, whereas duplications predispose to the other. Data from single-gene studies are inconsistent with a hypothesis based on independence, in that autism and schizophrenia share associated genes more often than expected by chance. However, differentiation between the partial overlap and diametric hypotheses using these data is precluded by limited overlap in the specific genetic markers analyzed in both autism and schizophrenia. Evidence from the effects of risk variants on growth-signaling pathways shows that autism-spectrum conditions tend to be associated with up-regulation of pathways due to loss of function mutations in negative regulators, whereas schizophrenia is associated with reduced pathway activation. Finally, data from studies of head and brain size phenotypes indicate that autism is commonly associated with developmentally-enhanced brain growth, whereas schizophrenia is characterized, on average, by reduced brain growth. These convergent lines of evidence appear most compatible with the hypothesis that autism and schizophrenia represent diametric conditions with regard to their genomic underpinnings, neurodevelopmental bases, and phenotypic manifestations as reflecting under-development versus dysregulated over-development of the human social brain. PMID:19955444

  17. Gramene 2016: comparative plant genomics and pathway resources

    USDA-ARS's Scientific Manuscript database

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the data...

  18. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus

    PubMed Central

    2011-01-01

    Background Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. Results To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty-five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp. Multiple variations in regulatory strategies between the Shewanella spp. and E. coli include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp). Conclusions We tentatively defined the first reference collection of ~100 transcriptional regulons in 16 Shewanella genomes. The resulting regulatory network contains ~600 regulated genes per genome that are mostly involved in metabolism of carbohydrates, amino acids, fatty acids, vitamins, metals, and stress responses. Several reconstructed regulons including NagR for N-acetylglucosamine catabolism were experimentally validated in S. oneidensis MR-1.
Analysis of

  19. Phytozome: a comparative platform for green plant genomics.

    PubMed

    Goodstein, David M; Shu, Shengqiang; Howson, Russell; Neupane, Rochak; Hayes, Richard D; Fazo, Joni; Mitros, Therese; Dirks, William; Hellsten, Uffe; Putnam, Nicholas; Rokhsar, Daniel S

    2012-01-01

    The number of sequenced plant genomes and associated genomic resources is growing rapidly with the advent of both an increased focus on plant genomics from funding agencies, and the application of inexpensive next generation sequencing. To interact with this increasing body of data, we have developed Phytozome (http://www.phytozome.net), a comparative hub for plant genome and gene family data and analysis. Phytozome provides a view of the evolutionary history of every plant gene at the level of sequence, gene structure, gene family and genome organization, while at the same time providing access to the sequences and functional annotations of a growing number (currently 25) of complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute, as well as selected species sequenced elsewhere. Through a comprehensive plant genome database and web portal, these data and analyses are available to the broader plant science research community, providing powerful comparative genomics tools that help to link model systems with other plants of economic and ecological importance.

  20. Comparative genomics of closely related Salmonella enterica serovar Typhi strains reveals genome dynamics and the acquisition of novel pathogenic elements.

    PubMed

    Yap, Kien-Pong; Gan, Han Ming; Teh, Cindy Shuan Ju; Chai, Lay Ching; Thong, Kwai Lin

    2014-11-20

    Typhoid fever is an infectious disease of global importance that is caused by Salmonella enterica subsp. enterica serovar Typhi (S. Typhi). This disease causes an estimated 200,000 deaths per year and remains a serious global health threat. S. Typhi is strictly a human pathogen, and some recovered individuals become long-term carriers who continue to shed the bacteria in their faeces, thus becoming main reservoirs of infection. A comparative genomics analysis combined with a phylogenomic analysis revealed that the strains from the outbreak and carrier were closely related with microvariations and possibly derived from a common ancestor. Additionally, the comparative genomics analysis with all of the other completely sequenced S. Typhi genomes revealed that strains BL196 and CR0044 exhibit unusual genomic variations despite S. Typhi being generally regarded as highly clonal. The two genomes shared distinct chromosomal architectures and uncommon genome features; notably, the presence of a ~10 kb novel genomic island containing uncharacterised virulence-related genes, and zot in particular. Variations were also detected in the T6SS system and genes that were related to SPI-10, insertion sequences, CRISPRs and nsSNPs among the studied genomes. Interestingly, the carrier strain CR0044 harboured far more genetic polymorphisms (83% mutant nsSNPs) compared with the closely related BL196 outbreak strain. Notably, the two highly related virulence-determinant genes, rpoS and tviE, were mutated in strains BL196 and CR0044, respectively, which revealed that the mutation in rpoS is stabilising, while that in tviE is destabilising. These microvariations provide novel insight into the optimisation of genes by the pathogens. However, the sporadic strain was found to be far more conserved compared with the others. The uncommon genomic variations in the two closely related BL196 and CR0044 strains suggest that S. Typhi is more diverse than previously thought. Our study has

  1. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    PubMed

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/.

  2. Comparative genomics of free-living Gammaproteobacteria: pathogenesis-related genes or interaction-related genes?

    PubMed

    Vázquez-Rosas-Landa, Mirna; Ponce-Soto, Gabriel Yaxal; Eguiarte, Luis E; Souza, V

    2017-07-31

    Bacteria have numerous strategies to interact with themselves and with their environment, but genes associated with these interactions are usually cataloged as pathogenic. To understand the role that these genes have not only in pathogenesis but also in bacterial interactions, we compared the genomes of eight bacteria from human-impacted environments with those of free-living bacteria from the Cuatro Ciénegas Basin (CCB), a relatively pristine oligotrophic site. Fifty-one genomes from CCB bacteria, including Pseudomonas, Vibrio, Photobacterium and Aeromonas, were analyzed. We found that the CCB strains had several virulence-related genes, 15 of which were common to all strains and were related to flagella and chemotaxis. We also identified the presence of Type III and VI secretion systems, which leads us to propose that these systems play an important role in interactions among bacterial communities beyond pathogenesis. None of the CCB strains had pathogenicity islands, despite having genes associated with antibiotics. Integrons were rare, while CRISPR elements were common. The idea that pathogenicity-related genes in many cases form part of a wider strategy used by bacteria to interact with other organisms could help us to understand the role of pathogenicity-related elements in an ecological and evolutionary framework leading toward a more inclusive One Health concept. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  3. Reference-Free Comparative Genomics of 174 Chloroplasts

    PubMed Central

    Kua, Chai-Shian; Ruan, Jue; Harting, John; Ye, Cheng-Xi; Helmus, Matthew R.; Yu, Jun; Cannon, Charles H.

    2012-01-01

    Direct analysis of unassembled genomic data could greatly increase the power of short read DNA sequencing technologies and allow comparative genomics of organisms without a completed reference available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs can be separated into two major classes: tip = unique to a single genome and group = shared by a subset of genomes. Prior to assembly, we found that ∼18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single copy and duplicated sequence was basal among green plants, independent of photosynthesis and mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants demonstrated an expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained a great deal more novel sequence than holo-parasitic plants, suggesting different mechanisms at different stages of genomic contraction. Additionally, the legumes are diverging more quickly and in different ways than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication. Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to a small partition of data and genomes relevant to the specific question, allowing direct analysis of next-gen sequence data from previously unstudied
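
    The tip/group distinction in this record can be illustrated on a toy k-mer presence table. The following is only a minimal sketch, not the authors' pipeline: the function names, the toy sequences, the choice of k = 3, and the extra "shared" label for k-mers found in every genome are all assumptions made for illustration.

```python
from collections import defaultdict

def kmer_set(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def classify_kmers(genomes, k):
    """Label each k-mer by how many genomes contain it.

    'tip'    = present in exactly one genome
    'group'  = present in more than one genome but not all
    'shared' = present in every genome (hypothetical extra label)
    """
    owners = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmer_set(seq, k):
            owners[kmer].add(name)
    n = len(genomes)
    labels = {}
    for kmer, names in owners.items():
        if len(names) == 1:
            labels[kmer] = "tip"
        elif len(names) < n:
            labels[kmer] = "group"
        else:
            labels[kmer] = "shared"
    return labels

# Toy input: three short made-up sequences, not real chloroplast data.
genomes = {"A": "ACGTAC", "B": "ACGTTT", "C": "ACGGGG"}
labels = classify_kmers(genomes, 3)
```

    With real data one would typically count k-mers directly from reads with a dedicated k-mer counter rather than building Python sets; the sketch only shows the classification step.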

  4. Comparative Genomics and Extensive Recombinations in Phage Communities

    NASA Astrophysics Data System (ADS)

    Poisson, Guylaine; Belcaid, Mahdi; Bergeron, Anne

    Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breadth and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications on phage metagenomics assembly, on the horizontal gene transfer paradigm, and more generally on the understanding of the composition and evolutionary dynamics of virus communities.

  5. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    PubMed

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-04

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches.

  6. Comparative Genomics Evidence That Only Protein Toxins are Tagging Bad Bugs

    PubMed Central

    Georgiades, Kalliopi; Raoult, Didier

    2011-01-01

    The term toxin was introduced by Roux and Yersin and describes macromolecular substances that, when produced during infection or when introduced parenterally or orally, cause an impairment of physiological functions that lead to disease or to the death of the infected organism. Long after the discovery of toxins, early genetic studies on bacterial virulence demonstrated that removing a certain number of genes from pathogenic bacteria decreases their capacity to infect hosts. Each of the removed factors was therefore referred to as a “virulence factor,” and it was speculated that non-pathogenic bacteria lack such supplementary factors. However, many recent comparative studies demonstrate that the specialization of bacteria to eukaryotic hosts is associated with massive gene loss. We recently demonstrated that the only features that seem to characterize 12 epidemic bacteria are toxin–antitoxin (TA) modules, which are addiction molecules in host bacteria. In this study, we investigated if protein toxins are indeed the only molecules specific to pathogenic bacteria by comparing 14 epidemic bacterial killers (“bad bugs”) with their 14 closest non-epidemic relatives (“controls”). We found protein toxins in significantly more elevated numbers in all of the “bad bugs.” For the first time, statistical principal components analysis, including genome size, GC%, TA modules, restriction enzymes, and toxins, revealed that toxins are the only proteins other than TA modules that are correlated with the pathogenic character of bacteria. Moreover, intracellular toxins appear to be more correlated with the pathogenic character of bacteria than secreted toxins. In conclusion, we hypothesize that the only truly identifiable phenomena, witnessing the convergent evolution of the most pathogenic bacteria for humans are the loss of metabolic activities, i.e., the outcome of the loss of regulatory and transcription factors and the presence of protein toxins, alone, or

  7. Dyneins Across Eukaryotes: A Comparative Genomic Analysis

    PubMed Central

    Wickstead, Bill; Gull, Keith

    2007-01-01

    Dyneins are large minus-end-directed microtubule motors. Each dynein contains at least one dynein heavy chain (DHC) and a variable number of intermediate chains (IC), light intermediate chains (LIC) and light chains (LC). Here, we used genome sequence data from 24 diverse eukaryotes to assess the distribution of DHCs, ICs, LICs and LCs across Eukaryota. Phylogenetic inference identified nine DHC families (two cytoplasmic and seven axonemal) and six IC families (one cytoplasmic). We confirm that dyneins have been lost from higher plants and show that this is most likely because of a single loss of cytoplasmic dynein 1 from the ancestor of Rhodophyta and Viridiplantae, followed by lineage-specific losses of other families. Independent losses in Entamoeba mean that at least three extant eukaryotic lineages are entirely devoid of dyneins. Cytoplasmic dynein 2 is associated with intraflagellar transport (IFT), but in two chromalveolate organisms, we find an IFT footprint without the retrograde motor. The distribution of one family of outer-arm dyneins accounts for 2-headed or 3-headed outer-arm ultrastructures observed in different organisms. One diatom species builds motile axonemes without any inner-arm dyneins (IAD), and the unexpected conservation of IAD I1 in non-flagellate algae and LC8 (DYNLL1/2) in all lineages reveals a surprising fluidity to dynein function. PMID:17897317

  8. Genome analysis of food grade lactic Acid-producing bacteria: from basics to applications.

    PubMed

    Mayo, B; van Sinderen, D; Ventura, M

    2008-05-01

    Whole-genome sequencing has revolutionized and accelerated scientific research that aims to study the genetics, biochemistry and molecular biology of bacteria. Lactic acid-producing bacteria, which include lactic acid bacteria (LAB) and bifidobacteria, are typically Gram-positive, catalase-negative organisms, which occupy a wide range of natural plant- and animal-associated environments. LAB species are frequently involved in the transformation of perishable raw materials into more stable, pleasant, palatable and safe fermented food products. LAB and bifidobacteria are also found among the resident microbiota of the gastrointestinal and/or genitourinary tracts of vertebrates, where they are believed to exert health-promoting effects. At present, the genomes of more than 20 LAB and bifidobacterial species have been completely sequenced. Their genome content reflects their specific metabolism, physiology, biosynthetic capabilities, and adaptability to varying conditions and environments. The typical LAB/bifidobacterial genome is relatively small (from 1.7 to 3.3 Mb) and thus harbors a limited assortment of genes (from around 1,600 to over 3,000). These small genomes code for a broad array of transporters for efficient carbon and nitrogen assimilation from the nutritionally-rich niches they usually inhabit, and specify a rather limited range of biosynthetic and degrading capabilities. The variation in the number of genes suggests that the genome evolution of each of these bacterial groups involved the processes of extensive gene loss from their particular ancestor, diversification of certain common biological activities through gene duplication, and acquisition of key functions via horizontal gene transfer. The availability of genome sequences is expected to revolutionize the exploitation of the metabolic potential of LAB and bifidobacteria, improving their use in bioprocessing and their utilization in biotechnological and health-related applications.

  9. Comparative genomics meets topology: a novel view on genome median and halving problems.

    PubMed

    Alexeev, Nikita; Avdeyev, Pavel; Alekseyev, Max A

    2016-11-11

    Genome median and genome halving are combinatorial optimization problems that aim at reconstruction of ancestral genomes by minimizing the number of evolutionary events between them and genomes of the extant species. While these problems have been widely studied in past decades, their solutions are often either not efficient or not biologically adequate. These shortcomings have been recently addressed by restricting the problems solution space. We show that the restricted variants of genome median and halving problems are, in fact, closely related. We demonstrate that these problems have a neat topological interpretation in terms of embedded graphs and polygon gluings. We illustrate how such interpretation can lead to solutions to these problems in particular cases. This study provides an unexpected link between comparative genomics and topology, and demonstrates advantages of solving genome median and halving problems within the topological framework.

  10. Whole Genome Amplification of Labeled Viable Single Cells Suited for Array-Comparative Genomic Hybridization.

    PubMed

    Kroneis, Thomas; El-Heliebi, Amin

    2015-01-01

    Understanding details of a complex biological system makes it necessary to dismantle it down to its components. Immunostaining techniques allow identification of several distinct cell types thereby giving an inside view of intercellular heterogeneity. Often staining reveals that the most remarkable cells are the rarest. To further characterize the target cells on a molecular level, single cell techniques are necessary. Here, we describe the immunostaining, micromanipulation, and whole genome amplification of single cells for the purpose of genomic characterization. First, we exemplify the preparation of cell suspensions from cultured cells as well as the isolation of peripheral mononucleated cells from blood. The target cell population is then subjected to immunostaining. After cytocentrifugation target cells are isolated by micromanipulation and forwarded to whole genome amplification. For whole genome amplification, we use GenomePlex® technology allowing downstream genomic analysis such as array-comparative genomic hybridization.

  11. The evolution of genomic base composition in bacteria.

    PubMed

    Haywood-Farmer, Eric; Otto, Sarah P

    2003-08-01

    Guanine plus cytosine (GC) content ranges broadly among bacterial genomes. In this study, we explore the use of a Brownian-motion model for the evolution of GC content over time. This model assumes that GC content varies over time in a continuous and homogeneous manner. Using this model and a maximum-likelihood approach, we analyzed the evolution of GC content across several bacterial phylogenies. Using three independent tests, we found that the observed divergence in GC content was consistent with a homogeneous Brownian-motion model. For example, similar rates of GC content evolution were inferred in several different bacterial subclades, indicating that there is relatively little rate heterogeneity in GC content evolution over broad evolutionary time scales. We thus argue that the homogeneous Brownian-motion model provides a good working model for GC content evolution. We then use this model to determine the overall rate of GC content evolution among eubacteria. We also determine the time frame over which GC content remains similar in related taxa, using a flexible definition for "similarity" in GC content so that, depending on the context, more or less stringent criteria may be applied. Our results have implications for models of sequence evolution, including those used for phylogenetic reconstruction and for inferring unusual changes in GC content.
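
    The homogeneous Brownian-motion model in this record can be made concrete with a simplified calculation. The sketch below is an assumption-laden illustration, not the authors' method: it treats pairs of sister taxa as independent (the study used full phylogenies), ignores the fact that GC content is bounded between 0 and 1, and uses made-up input values. Under these assumptions, the GC difference between two lineages that diverged t time units ago is normally distributed with mean 0 and variance 2σ²t, which gives a closed-form maximum-likelihood estimate of the rate σ².

```python
import math

def sigma2_mle(pairs):
    """Maximum-likelihood estimate of the Brownian rate sigma^2.

    pairs: list of (gc_a, gc_b, t), where t is the time since the two
    taxa diverged. Each difference has variance 2 * sigma^2 * t, so
    d^2 / (2 t) has expectation sigma^2; the MLE is the mean of these.
    """
    terms = [((a - b) ** 2) / (2.0 * t) for a, b, t in pairs]
    return sum(terms) / len(terms)

def prob_similar(sigma2, t, delta):
    """Probability that two taxa diverged t ago differ in GC by less than delta."""
    sd = math.sqrt(2.0 * sigma2 * t)
    return math.erf(delta / (sd * math.sqrt(2.0)))

# Hypothetical sister pairs: (GC fraction A, GC fraction B, divergence time).
pairs = [(0.50, 0.52, 1.0), (0.40, 0.44, 2.0)]
rate = sigma2_mle(pairs)
p_near = prob_similar(rate, t=1.0, delta=0.05)
p_far = prob_similar(rate, t=100.0, delta=0.05)
```

    The `delta` argument plays the role of the flexible similarity threshold mentioned in the abstract: a stricter (smaller) threshold shortens the time frame over which related taxa are expected to remain similar.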

  12. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    PubMed

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  13. Implementing sponge physiological and genomic information to enhance the diversity of its culturable associated bacteria.

    PubMed

    Lavy, Adi; Keren, Ray; Haber, Markus; Schwartz, Inbar; Ilan, Micha

    2014-02-01

    In recent years new approaches have emerged for culturing marine environmental bacteria. They include the use of novel culture media, sometimes with very low-nutrient content, and a variety of growth conditions such as temperature, oxygen levels, and different atmospheric pressures. These approaches have largely been neglected when it came to the cultivation of sponge-associated bacteria. Here, we used physiological and environmental conditions to reflect the environment of sponge-associated bacteria along with genomic data of the prominent sponge symbiont Candidatus Poribacteria sp. WGA-4E, to cultivate bacteria from the Red Sea sponge Theonella swinhoei. Designing culturing conditions to fit the metabolic needs of major bacterial taxa present in the sponge, through a combined use of diverse culture media compositions with aerobic and microaerophilic states, and addition of antibiotics, yielded higher diversity of the cultured bacteria and led to the isolation of novel sponge-associated and sponge-specific bacteria. In this work, 59 OTUs of six phyla were isolated. Of these, 22 have no close type strains at the species level (< 97% similarity of 16S rRNA gene sequence), representing novel bacteria species, and some are probably new genera and even families.

  14. The dog genome: survey sequencing and comparative analysis.

    PubMed

    Kirkness, Ewen F; Bafna, Vineet; Halpern, Aaron L; Levy, Samuel; Remington, Karin; Rusch, Douglas B; Delcher, Arthur L; Pop, Mihai; Wang, Wei; Fraser, Claire M; Venter, J Craig

    2003-09-26

    A survey of the dog genome sequence (6.22 million sequence reads; 1.5x coverage) demonstrates the power of sample sequencing for comparative analysis of mammalian genomes and the generation of species-specific resources. More than 650 million base pairs (>25%) of dog sequence align uniquely to the human genome, including fragments of putative orthologs for 18,473 of 24,567 annotated human genes. Mutation rates, conserved synteny, repeat content, and phylogeny can be compared among human, mouse, and dog. A variety of polymorphic elements are identified that will be valuable for mapping the genetic basis of diseases and traits in the dog.

  15. What constitutes an Arabian Helicobacter pylori? Lessons from comparative genomics.

    PubMed

    Kumar, Narender; Albert, M John; Al Abkal, Hanan; Siddique, Iqbal; Ahmed, Niyaz

    2017-02-01

    Helicobacter pylori, the human gastric pathogen, causes a variety of gastric diseases ranging from mild gastritis to gastric cancer. While the studies on H. pylori are dominated by those based on either East Asian or Western strains, information regarding H. pylori strains prevalent in the Middle East remains scarce. Therefore, we carried out whole-genome sequencing and comparative analysis of three H. pylori strains isolated from three native Arab, Kuwaiti patients. H. pylori strains were sequenced using Illumina platform. The sequence reads were filtered and draft genomes were assembled and annotated. Various pathogenicity-associated regions and phages present within the genomes were identified. Phylogenetic analysis was carried out to determine the genetic relatedness of Kuwaiti strains to various lineages of H. pylori. The core genome content and virulence-related genes were analyzed to assess the pathogenic potential. The three genomes clustered along with HpEurope strains in the phylogenetic tree comprising various H. pylori lineages. A total of 1187 genes spread among various functional classes were identified in the core genome analysis. The three genomes possessed a complete cagPAI and also retained most of the known outer membrane proteins as well as virulence-related genes. The cagA gene in all three strains consisted of an AB-C type EPIYA motif. The comparative genomic analysis of Kuwaiti H. pylori strains revealed a European ancestry and a high pathogenic potential. © 2016 John Wiley & Sons Ltd.

  16. Comparative Genomics Reveals the Core and Accessory Genomes of Streptomyces Species.

    PubMed

    Kim, Ji-Nu; Kim, Yeonbum; Jeong, Yujin; Roe, Jung-Hye; Kim, Byung-Gee; Cho, Byung-Kwan

    2015-10-01

    The development of rapid and efficient genome sequencing methods has enabled us to study the evolutionary background of bacterial genetic information. Here, we present comparative genomic analysis of 17 Streptomyces species, for which the genome has been completely sequenced, using the pan-genome approach. The analysis revealed that 34,592 ortholog clusters constituted the pan-genome of these Streptomyces species, including 2,018 in the core genome, 11,743 in the dispensable genome, and 20,831 in the unique genome. The core genome was converged to a smaller number of genes than reported previously, with 3,096 gene families. Functional enrichment analysis showed that genes involved in transcription were most abundant in the Streptomyces pan-genome. Finally, we investigated core genes for the sigma factors, mycothiol biosynthesis pathway, and secondary metabolism pathways; our data showed that many genes involved in stress response and morphological differentiation were commonly expressed in Streptomyces species. Elucidation of the core genome offers a basis for understanding the functional evolution of Streptomyces species and provides insights into target selection for the construction of industrial strains.
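
    The core/dispensable/unique split reported in this record follows from counting how many genomes contribute to each ortholog cluster. The sketch below assumes the usual convention (core = present in all genomes, unique = present in exactly one, dispensable = anything in between); the function name and the toy clusters are hypothetical, and only the three reported totals come from the abstract.

```python
def partition_pangenome(clusters, n_genomes):
    """Count ortholog clusters by category.

    clusters: dict mapping cluster id -> collection of genome names
    that have at least one member gene in that cluster.
    """
    counts = {"core": 0, "dispensable": 0, "unique": 0}
    for members in clusters.values():
        n = len(set(members))
        if n == n_genomes:
            counts["core"] += 1
        elif n == 1:
            counts["unique"] += 1
        else:
            counts["dispensable"] += 1
    return counts

# Toy example with three genomes (made-up cluster membership).
clusters = {
    "c1": {"g1", "g2", "g3"},
    "c2": {"g1", "g2"},
    "c3": {"g3"},
    "c4": {"g2"},
}
counts = partition_pangenome(clusters, n_genomes=3)

# Consistency check on the figures reported in the abstract:
# core + dispensable + unique should equal the pan-genome size.
reported_total = 2018 + 11743 + 20831
```

    The three reported categories sum to the stated pan-genome size of 34,592 clusters, so the partition in the abstract is internally consistent.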

  17. Comparative genomic paleontology across plant kingdom reveals the dynamics of TE-driven genome evolution.

    PubMed

    El Baidouri, Moaine; Panaud, Olivier

    2013-01-01

    Long terminal repeat-retrotransposons (LTR-RTs) are the most abundant class of transposable elements (TEs) in plants. They strongly impact the structure, function, and evolution of their host genome, and, in particular, their role in genome size variation has been clearly established. However, the dynamics of the process through which LTR-RTs have differentially shaped plant genomes is still poorly understood because of a lack of comparative studies. Using a new robust and automated family classification procedure, we exhaustively characterized the LTR-RTs in eight plant genomes for which a high-quality sequence is available (i.e., Arabidopsis thaliana, A. lyrata, grapevine, soybean, rice, Brachypodium distachyon, sorghum, and maize). This allowed us to perform a comparative genome-wide study of the retrotranspositional landscape in these eight plant lineages from both monocots and dicots. We show that retrotransposition has recurrently occurred in all plant genomes investigated, regardless of their size, and through bursts rather than a continuous process. Moreover, in each genome, only one or a few LTR-RT families have been active in the recent past, and the difference in genome size among the species studied could thus mostly be accounted for by the extent of the latest transpositional burst(s). Following these bursts, LTR-RTs are efficiently eliminated from their host genomes through recombination and deletion, but we show that the removal rate is not lineage specific. These new findings lead us to propose a new model of TE-driven genome evolution in plants.

  18. IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

    PubMed Central

    Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan

    2009-01-01

    Background: Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results: The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated into IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion: The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site . PMID:19351385

  19. Whole-Genome Relationships among Francisella Bacteria of Diverse Origins Define New Species and Provide Specific Regions for Detection.

    PubMed

    Challacombe, Jean F; Petersen, Jeannine M; Gallegos-Graves, La Verne; Hodge, David; Pillai, Segaran; Kuske, Cheryl R

    2017-02-01

    Francisella tularensis is a highly virulent zoonotic pathogen that causes tularemia and, because of weaponization efforts in past world wars, is considered a tier 1 biothreat agent. Detection and surveillance of F. tularensis may be confounded by the presence of uncharacterized, closely related organisms. Through DNA-based diagnostics and environmental surveys, novel clinical and environmental Francisella isolates have been obtained in recent years. Here we present 7 new Francisella genomes and a comparison of their characteristics to each other and to 24 publicly available genomes as well as a comparative analysis of 16S rRNA and sdhA genes from over 90 Francisella strains. Delineation of new species in bacteria is challenging, especially when isolates having very close genomic characteristics exhibit different physiological features, for example, when some are virulent pathogens in humans and animals while others are nonpathogenic or are opportunistic pathogens. Species resolution within Francisella varies with analyses of single genes, multiple gene or protein sets, or whole-genome comparisons of nucleic acid and amino acid sequences. Analyses focusing on single genes (16S rRNA, sdhA), multiple gene sets (virulence genes, lipopolysaccharide [LPS] biosynthesis genes, pathogenicity island), and whole-genome comparisons (nucleotide and protein) gave congruent results, but with different levels of discrimination confidence. We designate four new species within the genus: Francisella opportunistica sp. nov. (MA06-7296), Francisella salina sp. nov. (TX07-7308), Francisella uliginis sp. nov. (TX07-7310), and Francisella frigiditurris sp. nov. (CA97-1460). This study provides a robust comparative framework to discern species and virulence features of newly detected Francisella bacteria.

  20. Whole-genome relationships among Francisella bacteria of diverse origins define new species and provide specific regions for detection

    DOE PAGES

    Challacombe, Jean Faust; Petersen, Jeannine M.; Gallegos-Graves, La Verne A.; ...

    2016-11-23

    Francisella tularensis is a highly virulent zoonotic pathogen that causes tularemia and, because of weaponization efforts in past world wars, is considered a tier 1 biothreat agent. Detection and surveillance of F. tularensis may be confounded by the presence of uncharacterized, closely related organisms. Through DNA-based diagnostics and environmental surveys, novel clinical and environmental Francisella isolates have been obtained in recent years. Here we present 7 new Francisella genomes and a comparison of their characteristics to each other and to 24 publicly available genomes as well as a comparative analysis of 16S rRNA and sdhA genes from over 90 Francisella strains. Delineation of new species in bacteria is challenging, especially when isolates having very close genomic characteristics exhibit different physiological features, for example, when some are virulent pathogens in humans and animals while others are nonpathogenic or are opportunistic pathogens. Species resolution within Francisella varies with analyses of single genes, multiple gene or protein sets, or whole-genome comparisons of nucleic acid and amino acid sequences. Analyses focusing on single genes (16S rRNA, sdhA), multiple gene sets (virulence genes, lipopolysaccharide [LPS] biosynthesis genes, pathogenicity island), and whole-genome comparisons (nucleotide and protein) gave congruent results, but with different levels of discrimination confidence. We designate four new species within the genus: Francisella opportunistica sp. nov. (MA06-7296), Francisella salina sp. nov. (TX07-7308), Francisella uliginis sp. nov. (TX07-7310), and Francisella frigiditurris sp. nov. (CA97-1460). Lastly, this study provides a robust comparative framework to discern species and virulence features of newly detected Francisella bacteria.

  1. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus

    SciTech Connect

    Rodionov, Dmitry A.; Novichkov, Pavel; Stavrovskaya, Elena D.; Rodionova, Irina A.; Li, Xiaoqing; Kazanov, Marat D.; Ravcheev, Dmitry A.; Gerasimova, Anna V.; Kazakov, Alexey E.; Kovaleva, Galina Y.; Permina, Elizabeth A.; Laikova, Olga N.; Overbeek, Ross; Romine, Margaret F.; Fredrickson, Jim K.; Arkin, Adam P.; Dubchak, Inna; Osterman, Andrei L.; Gelfand, Mikhail S.

    2011-06-15

    Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. Despite the growing number of genome-scale gene expression studies, our abilities to convert the results of these studies into accurate regulatory annotations and to project them from model to other organisms are extremely limited. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. The Shewanella genus comprises metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from those of Escherichia coli and other model bacterial species. To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty-five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp. However, even orthologous regulators with conserved DNA-binding motifs may control substantially different gene sets, revealing striking differences in regulatory strategies between the Shewanella spp. and E. coli. Multiple examples of regulatory network rewiring include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), and numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. NagR for N-acetylglucosamine catabolism and PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp).

  2. Comparative genomics of Mortierella elongata and its bacterial endosymbiont Mycoavidus cysteinexigens

    SciTech Connect

    Uehling, J.; Gryganskyi, A.; Hameed, K.; Tschaplinski, T.; Misztal, P. K.; Wu, S.; Desirò, A.; Vande Pol, N.; Du, Z.; Zienkiewicz, A.; Zienkiewicz, K.; Morin, E.; Tisserant, E.; Splivallo, R.; Hainaut, M.; Kuo, A.; Yan, J.; Lipzen, A.; Nolan, M.; LaButti, K.; Barry, K.; Goldstein, A. H.; Labbé, J.; Schadt, C.; Tuskan, G.; Grigoriev, I.; Martin, F.; Vilgalys, R.; Bonito, G.

    2017-01-01

    Endosymbiosis of bacteria by eukaryotes is a defining feature of cellular evolution. In addition to well-known bacterial origins for mitochondria and chloroplasts, multiple origins of bacterial endosymbiosis are known within the cells of diverse animals, plants and fungi. Early-diverging lineages of terrestrial fungi harbor endosymbiotic bacteria belonging to the Burkholderiaceae. Furthermore, we sequenced the metagenome of the soil-inhabiting fungus Mortierella elongata and assembled the complete circular chromosome of its endosymbiont, Mycoavidus cysteinexigens, which we place within a lineage of endofungal symbionts that are sister clade to Burkholderia. The genome of M. elongata strain AG77 features a core set of primary metabolic pathways for degradation of simple carbohydrates and lipid biosynthesis, while the M. cysteinexigens (AG77) genome is reduced in size and function. Experiments using antibiotics to cure the endobacterium from the host demonstrate that the fungal host metabolism is highly modulated by presence/absence of M. cysteinexigens. In independent comparative phylogenomic analyses of fungal and bacterial genomes we find that they are consistent with an ancient origin for the M. elongata-M. cysteinexigens symbiosis, most likely over 350 million years ago and concomitant with the terrestrialization of Earth and diversification of land fungi and plants.

  3. Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss

    PubMed Central

    2010-01-01

    Background: The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with regard to definition of pathogenic strains. Results: To better understand genome evolution and evolution of virulence characteristics in Listeria, we used a next generation sequencing approach to generate draft genomes for seven strains representing Listeria species or clades for which genome sequences were not available. Comparative analyses of these draft genomes and six publicly available genomes, which together represent the main Listeria species, showed evidence for (i) a pangenome with 2,032 core and 2,918 accessory genes identified to date, (ii) a critical role of gene loss events in transition of Listeria species from facultative pathogen to saprotroph, even though a consistent pattern of gene loss seemed to be absent, and a number of isolates representing non-pathogenic species still carried some virulence associated genes, and (iii) divergence of modern pathogenic and non-pathogenic Listeria species and strains, most likely circa 47 million years ago, from a pathogenic common ancestor that contained key virulence genes. Conclusions: Genome evolution in Listeria involved limited gene loss and acquisition as supported by (i) a relatively high coverage of the predicted pan-genome by the observed pan-genome, (ii) conserved genome size (between 2.8 and 3.2 Mb), and (iii) a highly syntenic genome. Limited gene loss in Listeria did include loss of virulence associated genes, likely associated with multiple transitions to a saprotrophic lifestyle. The genus Listeria thus provides

  4. Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss.

    PubMed

    den Bakker, Henk C; Cummings, Craig A; Ferreira, Vania; Vatta, Paolo; Orsi, Renato H; Degoricija, Lovorka; Barker, Melissa; Petrauskene, Olga; Furtado, Manohar R; Wiedmann, Martin

    2010-12-02

    The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with regard to definition of pathogenic strains. To better understand genome evolution and evolution of virulence characteristics in Listeria, we used a next generation sequencing approach to generate draft genomes for seven strains representing Listeria species or clades for which genome sequences were not available. Comparative analyses of these draft genomes and six publicly available genomes, which together represent the main Listeria species, showed evidence for (i) a pangenome with 2,032 core and 2,918 accessory genes identified to date, (ii) a critical role of gene loss events in transition of Listeria species from facultative pathogen to saprotroph, even though a consistent pattern of gene loss seemed to be absent, and a number of isolates representing non-pathogenic species still carried some virulence associated genes, and (iii) divergence of modern pathogenic and non-pathogenic Listeria species and strains, most likely circa 47 million years ago, from a pathogenic common ancestor that contained key virulence genes. Genome evolution in Listeria involved limited gene loss and acquisition as supported by (i) a relatively high coverage of the predicted pan-genome by the observed pan-genome, (ii) conserved genome size (between 2.8 and 3.2 Mb), and (iii) a highly syntenic genome. Limited gene loss in Listeria did include loss of virulence associated genes, likely associated with multiple transitions to a saprotrophic lifestyle. The genus Listeria thus provides an example of a group of

  5. Biofilm bacteria: formation and comparative susceptibility to antibiotics

    PubMed Central

    Olson, Merle E.; Ceri, Howard; Morck, Douglas W.; Buret, Andre G.; Read, Ronald R.

    2002-01-01

    The Calgary Biofilm Device (CBD) was used to form bacterial biofilms of selected veterinary gram-negative and gram-positive pathogenic bacteria from cattle, sheep, pigs, chickens, and turkeys. The minimum inhibitory concentration (MIC) and minimum biofilm eradication concentration (MBEC) of ampicillin, ceftiofur, cloxacillin, oxytetracycline, penicillin G, streptomycin, tetracycline, enrofloxacin, erythromycin, gentamicin, tilmicosin, and trimethoprim-sulfadoxine for gram-positive and -negative bacteria were determined. Bacterial biofilms were readily formed on the CBD under selected conditions. The biofilms consisted of microcolonies encased in extracellular polysaccharide material. Biofilms composed of Arcanobacterium (Actinomyces) pyogenes, Staphylococcus aureus, Staphylococcus hyicus, Streptococcus agalactiae, Corynebacterium renale, or Corynebacterium pseudotuberculosis were not killed by the antibiotics tested but as planktonic bacteria they were sensitive at low concentrations. Biofilm and planktonic Streptococcus dysgalactiae and Streptococcus suis were sensitive to penicillin, ceftiofur, cloxacillin, ampicillin, and oxytetracycline. Planktonic Escherichia coli were sensitive to enrofloxacin, gentamicin, oxytetracycline and trimethoprim/sulfadoxine. Enrofloxacin and gentamicin were the most effective antibiotics against E. coli growing as a biofilm. Salmonella spp. and Pseudomonas aeruginosa isolates growing as planktonic populations were sensitive to enrofloxacin, gentamicin, ampicillin, oxytetracycline, and trimethoprim/sulfadoxine, but as a biofilm, these bacteria were only sensitive to enrofloxacin. Planktonic and biofilm Pasteurella multocida and Mannheimia haemolytica had similar antibiotic sensitivity profiles and were sensitive to most of the antibiotics tested. The CBD provides a valuable new technology that can be used to select antibiotics that are able to kill bacteria growing as biofilms. PMID:11989739

  6. Comparative genomics boosts target prediction for bacterial small RNAs.

    PubMed

    Wright, Patrick R; Richter, Andreas S; Papenfort, Kai; Mann, Martin; Vogel, Jörg; Hess, Wolfgang R; Backofen, Rolf; Georg, Jens

    2013-09-10

    Small RNAs (sRNAs) constitute a large and heterogeneous class of bacterial gene expression regulators. Much like eukaryotic microRNAs, these sRNAs typically target multiple mRNAs through short seed pairing, thereby acting as global posttranscriptional regulators. In some bacteria, evidence for hundreds to possibly more than 1,000 different sRNAs has been obtained by transcriptome sequencing. However, the experimental identification of possible targets and, therefore, their confirmation as functional regulators of gene expression has remained laborious. Here, we present a strategy that integrates phylogenetic information to predict sRNA targets at the genomic scale and reconstructs regulatory networks upon functional enrichment and network analysis (CopraRNA, for Comparative Prediction Algorithm for sRNA Targets). Furthermore, CopraRNA precisely predicts the sRNA domains for target recognition and interaction. When applied to several model sRNAs, CopraRNA revealed additional targets and functions for the sRNAs CyaR, FnrS, RybB, RyhB, SgrS, and Spot42. Moreover, the mRNAs gdhA, lrp, marA, nagZ, ptsI, sdhA, and yobF-cspC were suggested as regulatory hubs targeted by up to seven different sRNAs. The verification of many previously undetected targets by CopraRNA, even for extensively investigated sRNAs, demonstrates its advantages and shows that CopraRNA-based analyses can compete with experimental target prediction approaches. A Web interface allows high-confidence target prediction and efficient classification of bacterial sRNAs.

  7. Comparative genomics boosts target prediction for bacterial small RNAs

    PubMed Central

    Wright, Patrick R.; Richter, Andreas S.; Papenfort, Kai; Mann, Martin; Vogel, Jörg; Hess, Wolfgang R.; Backofen, Rolf; Georg, Jens

    2013-01-01

    Small RNAs (sRNAs) constitute a large and heterogeneous class of bacterial gene expression regulators. Much like eukaryotic microRNAs, these sRNAs typically target multiple mRNAs through short seed pairing, thereby acting as global posttranscriptional regulators. In some bacteria, evidence for hundreds to possibly more than 1,000 different sRNAs has been obtained by transcriptome sequencing. However, the experimental identification of possible targets and, therefore, their confirmation as functional regulators of gene expression has remained laborious. Here, we present a strategy that integrates phylogenetic information to predict sRNA targets at the genomic scale and reconstructs regulatory networks upon functional enrichment and network analysis (CopraRNA, for Comparative Prediction Algorithm for sRNA Targets). Furthermore, CopraRNA precisely predicts the sRNA domains for target recognition and interaction. When applied to several model sRNAs, CopraRNA revealed additional targets and functions for the sRNAs CyaR, FnrS, RybB, RyhB, SgrS, and Spot42. Moreover, the mRNAs gdhA, lrp, marA, nagZ, ptsI, sdhA, and yobF-cspC were suggested as regulatory hubs targeted by up to seven different sRNAs. The verification of many previously undetected targets by CopraRNA, even for extensively investigated sRNAs, demonstrates its advantages and shows that CopraRNA-based analyses can compete with experimental target prediction approaches. A Web interface allows high-confidence target prediction and efficient classification of bacterial sRNAs. PMID:23980183

  8. Hidden Markov models for evolution and comparative genomics analysis.

    PubMed

    Bykova, Nadezda A; Favorov, Alexander V; Mironov, Andrey A

    2013-01-01

    The problem of reconstruction of ancestral states given a phylogeny and data from extant species arises in a wide range of biological studies. The continuous-time Markov model for discrete-state evolution is generally used for the reconstruction of ancestral states. We modify this model to account for a case when the states of the extant species are uncertain. This situation appears, for example, if the states for extant species are predicted by some program and thus are known only with some level of reliability; this is common in bioinformatics. The main idea is to formulate the problem as a hidden Markov model on a tree (tree HMM, tHMM), where the basic continuous-time Markov model is expanded with the introduction of emission probabilities of observed data (e.g. prediction scores) for each underlying discrete state. Our tHMM decoding algorithm allows us to predict states at the ancestral nodes as well as to refine states at the leaves on the basis of quantitative comparative genomics. The test on the simulated data shows that the tHMM approach applied to the continuous variable reflecting the probabilities of the states (i.e. prediction score) appears to be more accurate than the reconstruction from the discrete states assignment defined by the best score threshold. We provide examples of applying our model to the evolutionary analysis of N-terminal signal peptides and transcription factor binding sites in bacteria. The program is freely available at http://bioinf.fbb.msu.ru/~nadya/tHMM and via web-service at http://bioinf.fbb.msu.ru/treehmmweb.
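
    The idea of attaching emission probabilities to uncertain leaf states can be sketched as follows. This is not the authors' tHMM program: it is a minimal upward (pruning) pass under an assumed symmetric two-state continuous-time Markov model, in which each leaf contributes a pair of likelihoods P(observed score | state) rather than a hard 0/1 call. The tree layout, the species names, the rate, and the emission values are all hypothetical.

```python
import math


def transition_matrix(rate, t):
    """Symmetric two-state continuous-time Markov model over branch length t."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    diff = 1.0 - same
    return [[same, diff], [diff, same]]


def upward(node, emissions, rate):
    """Return [L0, L1]: likelihood of the data below `node` given its state."""
    if "name" in node:
        # Leaf: soft evidence P(observed score | state) replaces a hard state call.
        return list(emissions[node["name"]])
    partial = [1.0, 1.0]
    for child in node["children"]:
        child_like = upward(child, emissions, rate)
        p = transition_matrix(rate, child["length"])
        for s in (0, 1):
            partial[s] *= sum(p[s][c] * child_like[c] for c in (0, 1))
    return partial


def root_posterior(tree, emissions, rate, prior=(0.5, 0.5)):
    """Posterior probability of each state at the root."""
    like = upward(tree, emissions, rate)
    joint = [prior[s] * like[s] for s in (0, 1)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical three-species tree: (sp1, (sp2, sp3)).
tree = {"children": [
    {"name": "sp1", "length": 0.1},
    {"length": 0.2, "children": [
        {"name": "sp2", "length": 0.1},
        {"name": "sp3", "length": 0.1},
    ]},
]}
# Hypothetical emission likelihoods (state 0, state 1) derived from prediction scores.
emissions = {"sp1": (0.1, 0.9), "sp2": (0.2, 0.8), "sp3": (0.6, 0.4)}
posterior = root_posterior(tree, emissions, rate=1.0)
print(posterior)
```

    A full decoding step would add a downward pass to obtain posteriors at internal nodes and refined states at the leaves; only the root is handled here to keep the sketch short.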

  9. Roundup 2.0: enabling comparative genomics for over 1800 genomes

    PubMed Central

    DeLuca, Todd F.; Cui, Jike; Jung, Jae-Yoon; St. Gabriel, Kristian Che; Wall, Dennis P.

    2012-01-01

    Summary: Roundup is an online database of gene orthologs for over 1800 genomes, including 226 Eukaryota, 1447 Bacteria, 113 Archaea and 21 Viruses. Orthologs are inferred using the Reciprocal Smallest Distance algorithm. Users may query Roundup for single-linkage clusters of orthologous genes based on any group of genomes. Annotated query results may be viewed in a variety of ways including as clusters of orthologs and as phylogenetic profiles. Genomic results may be downloaded in formats suitable for functional as well as phylogenetic analysis, including the recent OrthoXML standard. In addition, gene IDs can be retrieved using FASTA sequence search. All source code and orthologs are freely available. Availability: http://roundup.hms.harvard.edu Contact: dpwall@hms.harvard.edu; todd_deluca@hms.harvard.edu PMID:22247275
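
    Single-linkage clustering of pairwise ortholog calls, as mentioned in this summary, can be sketched with a union-find structure: any two genes connected by a chain of ortholog pairs end up in the same cluster. This is an illustrative sketch only and does not reproduce Roundup's code or the Reciprocal Smallest Distance algorithm; the function name and the gene identifiers are hypothetical.

```python
def single_linkage_clusters(ortholog_pairs):
    """Group genes into clusters connected by any chain of ortholog pairs."""
    parent = {}

    def find(x):
        # Register unseen genes, then walk to the root with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in ortholog_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for gene in list(parent):
        clusters.setdefault(find(gene), set()).add(gene)
    # Sorted output makes the result deterministic.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise ortholog calls between three genomes (hs, mm, dr).
pairs = [("hs:G1", "mm:G1"), ("mm:G1", "dr:G1"), ("hs:G2", "mm:G2")]
clusters = single_linkage_clusters(pairs)
print(clusters)
```

    Note that hs:G1 and dr:G1 are clustered together even though no direct pair links them, which is the defining property of single linkage.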

  10. Identification of a Bacteria-Specific Binding Protein from the Sequenced Bacterial Genome.

    PubMed

    Kong, Minsuk; Ryu, Sangryeol

    2016-01-01

    Novel and specific recognition elements are of central importance in the development of a pathogen detection method. Here, we describe a simple method for identifying the cell-wall binding domain (CBD) from a sequenced bacterial genome employing homology search for phage lysin genes. A putative CBD (CPF369_CBD) was identified from a genome of Clostridium perfringens type strain ATCC 13124, and its function was studied with the CBD-GFP fusion protein recombinantly expressed in Escherichia coli. Fluorescence microscopy showed the specific binding of the fusion protein to C. perfringens cells, which demonstrates the potential of this method for the identification of novel bioprobes for specific detection of pathogenic bacteria.

  11. Towards long-read metagenomics: complete assembly of three novel genomes from bacteria dependent on a diazotrophic cyanobacterium in a freshwater lake co-culture.

    PubMed

    Driscoll, Connor B; Otten, Timothy G; Brown, Nathan M; Dreher, Theo W

    2017-01-01

    Here we report three complete bacterial genome assemblies from a PacBio shotgun metagenome of a co-culture from Upper Klamath Lake, OR. Genome annotations and culture conditions indicate these bacteria are dependent on carbon and nitrogen fixation from the cyanobacterium Aphanizomenon flos-aquae, whose genome was assembled to draft-quality. Due to their taxonomic novelty relative to previously sequenced bacteria, we have temporarily designated these bacteria as incertae sedis Hyphomonadaceae strain UKL13-1 (3,501,508 bp and 56.12% GC), incertae sedis Betaproteobacterium strain UKL13-2 (3,387,087 bp and 54.98% GC), and incertae sedis Bacteroidetes strain UKL13-3 (3,236,529 bp and 37.33% GC). Each genome consists of a single circular chromosome with no identified plasmids. When compared with binned Illumina assemblies of the same three genomes, there was ~7% discrepancy in total genome length. Gaps where Illumina assemblies broke were often due to repetitive elements. Within these missing sequences were essential genes and genes associated with a variety of functional categories. Annotated gene content reveals that both Proteobacteria are aerobic anoxygenic phototrophs, with Betaproteobacterium UKL13-2 potentially capable of phototrophic oxidation of sulfur compounds. Both proteobacterial genomes contain transporters suggesting they are scavenging fixed nitrogen from A. flos-aquae in the form of ammonium. Bacteroidetes UKL13-3 has few completely annotated biosynthetic pathways, and has a comparatively higher proportion of unannotated genes. The genomes were detected in only a few other freshwater metagenomes, suggesting that these bacteria are not ubiquitous in freshwater systems. Our results indicate that long-read sequencing is a viable method for sequencing dominant members from low-diversity microbial communities, and should be considered for environmental metagenomics when conditions meet these requirements.

  12. Evolutionary and comparative analyses of the soybean genome

    PubMed Central

    Cannon, Steven B.; Shoemaker, Randy C.

    2012-01-01

    The soybean genome assembly has been available since the end of 2008. Significant features of the genome include large, gene-poor, repeat-dense pericentromeric regions, spanning roughly 57% of the genome sequence; a relatively large genome size of ~1.15 billion bases; remnants of a genome duplication that occurred ~13 million years ago (Mya); and fainter remnants of older polyploidies that occurred ~58 Mya and >130 Mya. The genome sequence has been used to identify the genetic basis for numerous traits, including disease resistance, nutritional characteristics, and developmental features. The genome sequence has provided a scaffold for placement of many genomic feature elements, both from within soybean and from related species. These may be accessed at several websites, including http://www.phytozome.net, http://soybase.org, http://comparative-legumes.org, and http://www.legumebase.brc.miyazaki-u.ac.jp. The taxonomic position of soybean in the Phaseoleae tribe of the legumes means that there are approximately two dozen other beans and relatives that have undergone independent domestication, and which may have traits that will be useful for transfer to soybean. Methods of translating information between species in the Phaseoleae range from design of markers for marker assisted selection, to transformation with Agrobacterium or with other experimental transformation methods. PMID:23136483

  13. Understanding the direction of evolution in Burkholderia glumae through comparative genomics.

    PubMed

    Lee, Hyun-Hee; Park, Jungwook; Kim, Jinnyun; Park, Inmyoung; Seo, Young-Su

    2016-02-01

    Members of the genus Burkholderia occupy remarkably diverse niches, with genome sizes ranging from ~3.75 to 11.29 Mbp. The genome of Burkholderia glumae ranges in size from ~5.81 to 7.89 Mbp. Unlike other plant pathogenic bacteria, B. glumae can infect a wide range of monocot and dicot plants. Comparative genome analysis of B. glumae strains can provide insight into genome variation as well as differential features of whole metabolism or pathways between multiple strains of B. glumae infecting the same host. Comparative analysis of complete genomes among B. glumae BGR1, B. glumae LMG 2196, and B. glumae PG1 revealed the largest departmentalization of genes onto separate replicons in B. glumae BGR1 and considerable downsizing of the genome in B. glumae LMG 2196. In addition, the presence of large-scale evolutionary events such as rearrangement and inversion and the development of highly specialized systems were found to be related to virulence-associated features in the three B. glumae strains. This connection may explain why this bacterium broadens its host range and reinforces its interaction with hosts.

  14. Comparative genomics of rhizobia nodulating soybean suggests extensive recruitment of lineage-specific genes in adaptations.

    PubMed

    Tian, Chang Fu; Zhou, Yuan Jie; Zhang, Yan Ming; Li, Qin Qin; Zhang, Yun Zeng; Li, Dong Fang; Wang, Shuang; Wang, Jun; Gilbert, Luz B; Li, Ying Rui; Chen, Wen Xin

    2012-05-29

    The rhizobium-legume symbiosis has been widely studied as the model of mutualistic evolution and the essential component of sustainable agriculture. Extensive genetic and recent genomic studies have led to the hypothesis that many distinct strategies, regardless of rhizobial phylogeny, contributed to the varied rhizobium-legume symbiosis. We sequenced 26 genomes of Sinorhizobium and Bradyrhizobium nodulating soybean to test this hypothesis. The Bradyrhizobium core genome is disproportionally enriched in lipid and secondary metabolism, whereas several gene clusters known to be involved in osmoprotection and adaptation to alkaline pH are specific to the Sinorhizobium core genome. These features are consistent with biogeographic patterns of these bacteria. Surprisingly, no genes are specifically shared by these soybean microsymbionts compared with other legume microsymbionts. On the other hand, phyletic patterns of 561 known symbiosis genes of rhizobia reflected the species phylogeny of these soybean microsymbionts and other rhizobia. Similar analyses with 887 known functional genes or the whole pan genome of rhizobia revealed that only the phyletic distribution of functional genes was consistent with the species tree of rhizobia. Further evolutionary genetics revealed that recombination dominated the evolution of the core genome. Taken together, our results suggested that faithfully vertical genes were rare compared with those with a history of recombination including lateral gene transfer, although rhizobial adaptations to symbiotic interactions and other environmental conditions extensively recruited lineage-specific shell genes under direct or indirect control through the speciation process.

  15. Comparative genome analysis of the closely related Synechocystis strains PCC 6714 and PCC 6803.

    PubMed

    Kopf, Matthias; Klähn, Stephan; Pade, Nadin; Weingärtner, Christian; Hagemann, Martin; Voß, Björn; Hess, Wolfgang R

    2014-06-01

    Synechocystis sp. PCC 6803 is the most popular cyanobacterial model for prokaryotic photosynthesis and for metabolic engineering to produce biofuels. Genomic and transcriptomic comparisons between closely related bacteria are powerful approaches to infer insights into their metabolic potentials and regulatory networks. To enable a comparative approach, we generated the draft genome sequence of Synechocystis sp. PCC 6714, a closely related strain of 6803 (16S rDNA identity 99.4%) that also is amenable to genetic manipulation. Both strains share 2838 protein-coding genes, leaving 845 unique genes in Synechocystis sp. PCC 6803 and 895 genes in Synechocystis sp. PCC 6714. The genetic differences include a prophage in the genome of strain 6714, a different composition of the pool of transposable elements, and a ∼40 kb genomic island encoding several glycosyltransferases and transport proteins. We verified several physiological differences that were predicted on the basis of the respective genome sequence. Strain 6714 exhibited a lower tolerance to Zn(2+) ions, associated with the lack of a corresponding export system and a lowered potential of salt acclimation due to the absence of a transport system for the re-uptake of the compatible solute glucosylglycerol. These new data will support more detailed comparative analyses of this important cyanobacterial group than has been possible thus far. Genome information for Synechocystis sp. PCC 6714 has been deposited in GenBank (accession no. AMZV01000000). © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
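
    As a rough worked check on the gene counts quoted in this abstract, the sketch below derives the per-strain totals and a Jaccard-style overlap that those counts would imply. It assumes, purely for illustration, that the 2838 shared genes pair one-to-one between the two strains; the function name is hypothetical and is not taken from the study.

```python
def gene_overlap_summary(shared, unique_a, unique_b):
    """Summarise a two-genome comparison from shared and unique gene counts.

    Assumes shared genes pair one-to-one between the genomes, which is an
    illustrative simplification and not a claim made in the abstract.
    """
    total_a = shared + unique_a           # genes in genome A
    total_b = shared + unique_b           # genes in genome B
    union = shared + unique_a + unique_b  # distinct genes across both
    return {
        "total_a": total_a,
        "total_b": total_b,
        "shared_fraction_a": shared / total_a,
        "shared_fraction_b": shared / total_b,
        "jaccard": shared / union,
    }


# Counts as quoted in the abstract: A = PCC 6803, B = PCC 6714.
summary = gene_overlap_summary(shared=2838, unique_a=845, unique_b=895)
print(summary["total_a"], summary["total_b"])
```

    Under that one-to-one assumption, the quoted counts imply 3683 protein-coding genes in PCC 6803 and 3733 in PCC 6714.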

  16. Comparative Genome Analysis of the Closely Related Synechocystis Strains PCC 6714 and PCC 6803

    PubMed Central

    Kopf, Matthias; Klähn, Stephan; Pade, Nadin; Weingärtner, Christian; Hagemann, Martin; Voß, Björn; Hess, Wolfgang R.

    2014-01-01

    Synechocystis sp. PCC 6803 is the most popular cyanobacterial model for prokaryotic photosynthesis and for metabolic engineering to produce biofuels. Genomic and transcriptomic comparisons between closely related bacteria are powerful approaches to infer insights into their metabolic potentials and regulatory networks. To enable a comparative approach, we generated the draft genome sequence of Synechocystis sp. PCC 6714, a closely related strain of 6803 (16S rDNA identity 99.4%) that also is amenable to genetic manipulation. Both strains share 2838 protein-coding genes, leaving 845 unique genes in Synechocystis sp. PCC 6803 and 895 genes in Synechocystis sp. PCC 6714. The genetic differences include a prophage in the genome of strain 6714, a different composition of the pool of transposable elements, and a ∼40 kb genomic island encoding several glycosyltransferases and transport proteins. We verified several physiological differences that were predicted on the basis of the respective genome sequence. Strain 6714 exhibited a lower tolerance to Zn2+ ions, associated with the lack of a corresponding export system and a lowered potential of salt acclimation due to the absence of a transport system for the re-uptake of the compatible solute glucosylglycerol. These new data will support more detailed comparative analyses of this important cyanobacterial group than has been possible thus far. Genome information for Synechocystis sp. PCC 6714 has been deposited in GenBank (accession no. AMZV01000000). PMID:24408876

  17. Comparative genomics of insect juvenile hormone biosynthesis⋆

    PubMed Central

    Noriega, F.G.; Ribeiro, J.M.C.; Koener, J.F.; Valenzuela, J.G.; Hernandez-Martinez, S.; Pham, V.M.; Feyereisen, R.

    2009-01-01

    The biosynthesis of insect juvenile hormone (JH) and its neuroendocrine control are attractive targets for chemical control of insect pests and vectors of disease. To facilitate the molecular study of JH biosynthesis, we analyzed ESTs from the glands producing JH, the corpora allata (CA) in the cockroach Diploptera punctata, an insect long used as a physiological model species and compared them with ESTs from the CA of the mosquitoes Aedes aegypti and Anopheles albimanus. The predicted genes were analyzed according to their probable functions with the Gene Ontology classification, and compared to Drosophila and Anopheles gambiae genes. A large number of reciprocal matches in the cDNA libraries of cockroach and mosquito CA were found. These matches defined known and suspected enzymes of the JH biosynthetic pathway, but also several proteins associated with signal transduction that might play a role in the modulation of JH synthesis by neuropeptides. The identification in both cockroach and mosquito CA of homologs of the small ligand binding proteins from insects, Takeout/JH binding protein and retinol-binding protein highlights a hitherto unsuspected complexity of metabolite trafficking, perhaps JH precursor trafficking, in these endocrine glands. Furthermore, many reciprocal matches for genes of unknown function may provide a fertile ground for an in-depth study of allatal-specific cell physiology. PMID:16551550

  18. Comparative genomics of Eucalyptus and Corymbia reveals low rates of genome structural rearrangement.

    PubMed

    Butler, J B; Vaillancourt, R E; Potts, B M; Lee, D J; King, G J; Baten, A; Shepherd, M; Freeman, J S

    2017-05-22

    Previous studies suggest genome structure is largely conserved between Eucalyptus species. However, it is unknown if this conservation extends to more divergent eucalypt taxa. We performed comparative genomics between the eucalypt genera Eucalyptus and Corymbia. Our results will facilitate transfer of genomic information between these important taxa and provide further insights into the rate of structural change in tree genomes. We constructed three high density linkage maps for two Corymbia species (Corymbia citriodora subsp. variegata and Corymbia torelliana) which were used to compare genome structure between both species and Eucalyptus grandis. Genome structure was highly conserved between the Corymbia species. However, the comparison of Corymbia and E. grandis suggests large (from 1-13 Mb) intra-chromosomal rearrangements have occurred on seven of the 11 chromosomes. Most rearrangements were supported through comparisons of the three independent Corymbia maps to the E. grandis genome sequence, and to other independently constructed Eucalyptus linkage maps. These are the first large scale chromosomal rearrangements discovered between eucalypts. Nonetheless, in the general context of plants, the genomic structure of the two genera was remarkably conserved, adding to a growing body of evidence that conservation of genome structure is common amongst woody angiosperms.

  19. Comparing Vertebrate Whole-Genome Shotgun Reads to the Human Genome

    PubMed Central

    Chen, Rui; Bouck, John B.; Weinstock, George M.; Gibbs, Richard A.

    2001-01-01

    Multi-species sequence comparisons are a very efficient way to reveal conserved genes. Because sequence finishing is expensive and time consuming, many genome sequences are likely to stay incomplete. A challenge is to use these fragmented data for understanding the human genome. Methods for using cross-species whole-genome shotgun sequence (WGS) for genome annotation are described in this paper. About one-half million high-quality rat WGS reads (covering 7.5% of the rat genome) generated at the Baylor College of Medicine Human Genome Sequencing Center were compared with the human genome. Using computer-generated random reads as a negative control, a set of parameters was determined for reliable interpretation of BLAST search results. About 10% of the rat reads contain regions that are conserved in the human genomic sequence and about one-third of these include known gene-coding regions. Mapping the conserved regions to human chromosomes showed a 23-fold enrichment for coding regions compared with noncoding regions. This approach can also be applied to other mammalian genomes for gene finding. These data predicted ∼42,500 genes in the human, slightly more than reported previously. PMID:11691844
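
    The negative-control idea described in this abstract (using random reads to decide which BLAST hits to trust) can be sketched as follows. This is only an illustration under stated assumptions: the scores are invented toy values, the function names are hypothetical, and the rule used here (pick the lowest cutoff that at most a chosen fraction of control reads exceed) is one plausible reading, not the parameter set the authors derived.

```python
def cutoff_from_negative_control(control_scores, max_false_positive_rate=0.01):
    """Return the lowest control score c such that the fraction of
    negative-control scores strictly above c is <= max_false_positive_rate."""
    if not control_scores:
        raise ValueError("need at least one negative-control score")
    n = len(control_scores)
    for candidate in sorted(set(control_scores)):
        passing = sum(1 for s in control_scores if s > candidate)
        if passing / n <= max_false_positive_rate:
            return candidate
    # The loop always returns at the maximum score (nothing exceeds it);
    # this line only keeps the function total.
    return max(control_scores)


def conserved_fraction(read_scores, cutoff):
    """Fraction of reads whose best-hit score is strictly above the cutoff."""
    return sum(1 for s in read_scores if s > cutoff) / len(read_scores)


# Toy best-hit scores (hypothetical, not data from the paper).
random_read_scores = [28.0, 30.5, 31.0, 29.5, 33.0, 27.0, 30.0, 32.0, 28.5, 31.5]
rat_read_scores = [29.0, 120.0, 31.0, 45.5, 30.0, 210.0, 28.0, 33.0, 34.5, 27.5]

cutoff = cutoff_from_negative_control(random_read_scores, max_false_positive_rate=0.1)
fraction = conserved_fraction(rat_read_scores, cutoff)
print(cutoff, fraction)
```

    Calibrating the cutoff on reads that cannot be truly homologous keeps the false-positive rate explicit, instead of relying on a fixed score chosen in advance.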

  20. Comparative genomics of vesicomyid clam (Bivalvia: Mollusca) chemosynthetic symbionts

    PubMed Central

    Newton, Irene LG; Girguis, Peter R; Cavanaugh, Colleen M

    2008-01-01

    Background: The Vesicomyidae (Bivalvia: Mollusca) are a family of clams that form symbioses with chemosynthetic gamma-proteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a reduced gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. Recently, two vesicomyid symbiont genomes were sequenced, illuminating the possible nutritional contributions of the symbiont to the host and making genome-wide evolutionary analyses possible. Results: To examine the genomic evolution of the vesicomyid symbionts, a comparative genomics framework, including the existing genomic data combined with heterologous microarray hybridization results, was used to analyze conserved gene content in four vesicomyid symbiont genomes. These four symbionts were chosen to include a broad phylogenetic sampling of the vesicomyid symbionts and represent distinct chemosynthetic environments: cold seeps and hydrothermal vents. Conclusion: The results of this comparative genomics analysis emphasize the importance of the symbionts' chemoautotrophic metabolism within their hosts. The fact that these symbionts appear to be metabolically capable autotrophs underscores the extent to which the host depends on them for nutrition and reveals the key to invertebrate colonization of these challenging environments. PMID:19055818

  1. Azospirillum genomes reveal transition of bacteria from aquatic to terrestrial environments.

    PubMed

    Wisniewski-Dyé, Florence; Borziak, Kirill; Khalsa-Moyers, Gurusahai; Alexandre, Gladys; Sukharnikov, Leonid O; Wuichet, Kristin; Hurst, Gregory B; McDonald, W Hayes; Robertson, Jon S; Barbe, Valérie; Calteau, Alexandra; Rouy, Zoé; Mangenot, Sophie; Prigent-Combaret, Claire; Normand, Philippe; Boyer, Mickaël; Siguier, Patricia; Dessaux, Yves; Elmerich, Claudine; Condemine, Guy; Krishnen, Ganisan; Kennedy, Ivan; Paterson, Andrew H; González, Victor; Mavingui, Patrick; Zhulin, Igor B

    2011-12-01

    Fossil records indicate that life appeared in marine environments ∼3.5 billion years ago (Gyr) and transitioned to terrestrial ecosystems nearly 2.5 Gyr. Sequence analysis suggests that "hydrobacteria" and "terrabacteria" might have diverged as early as 3 Gyr. Bacteria of the genus Azospirillum are associated with roots of terrestrial plants; however, virtually all their close relatives are aquatic. We obtained genome sequences of two Azospirillum species and analyzed their gene origins. While most Azospirillum house-keeping genes have orthologs in its close aquatic relatives, this lineage has obtained nearly half of its genome from terrestrial organisms. The majority of genes encoding functions critical for association with plants are among horizontally transferred genes. Our results show that transition of some aquatic bacteria to terrestrial habitats occurred much later than the suggested initial divergence of hydro- and terrabacterial clades. The birth of the genus Azospirillum approximately coincided with the emergence of vascular plants on land.

  2. Whole-Genome Relationships among Francisella Bacteria of Diverse Origins Define New Species and Provide Specific Regions for Detection

    PubMed Central

    Challacombe, Jean F.; Petersen, Jeannine M.; Gallegos-Graves, La Verne; Hodge, David; Pillai, Segaran

    2016-01-01

    Abstract: Francisella tularensis is a highly virulent zoonotic pathogen that causes tularemia and, because of weaponization efforts in past world wars, is considered a tier 1 biothreat agent. Detection and surveillance of F. tularensis may be confounded by the presence of uncharacterized, closely related organisms. Through DNA-based diagnostics and environmental surveys, novel clinical and environmental Francisella isolates have been obtained in recent years. Here we present 7 new Francisella genomes and a comparison of their characteristics to each other and to 24 publicly available genomes as well as a comparative analysis of 16S rRNA and sdhA genes from over 90 Francisella strains. Delineation of new species in bacteria is challenging, especially when isolates having very close genomic characteristics exhibit different physiological features, for example when some are virulent pathogens in humans and animals while others are nonpathogenic or are opportunistic pathogens. Species resolution within Francisella varies with analyses of single genes, multiple gene or protein sets, or whole-genome comparisons of nucleic acid and amino acid sequences. Analyses focusing on single genes (16S rRNA, sdhA), multiple gene sets (virulence genes, lipopolysaccharide [LPS] biosynthesis genes, pathogenicity island), and whole-genome comparisons (nucleotide and protein) gave congruent results, but with different levels of discrimination confidence. We designate four new species within the genus: Francisella opportunistica sp. nov. (MA06-7296), Francisella salina sp. nov. (TX07-7308), Francisella uliginis sp. nov. (TX07-7310), and Francisella frigiditurris sp. nov. (CA97-1460). This study provides a robust comparative framework to discern species and virulence features of newly detected Francisella bacteria. Importance: DNA-based detection and sequencing methods have identified thousands of new bacteria in the human body and the environment. In most cases, there are no cultured

  3. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    PubMed

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  4. Comparative genomic analysis of eutherian interferon-γ-inducible GTPases.

    PubMed

    Premzl, Marko

    2012-11-01

    The interferon-γ-inducible GTPases, IFGGs, are intracellular proteins involved in immune response against pathogens. A comprehensive comparative genomic review and analysis of eutherian IFGGs was carried out using public genomic sequences. The 64 eutherian IFGG genes were examined in detail and annotated. The eutherian IFGG promoter types were first catalogued followed by a phylogenetic analysis of eutherian IFGGs, which described five major IFGG clusters. The patterns of differential gene expansions and protein regions that may regulate IFGG catalytic features suggested a new classification of eutherian IFGGs. This mini-review has also provided new tests of reliability of public genomic sequences as well as tests of protein molecular evolution.

  5. Comparative Genome Analysis of Basidiomycete Fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  6. Comparative genomics of first available bovine Anaplasma phagocytophilum genome obtained with targeted sequence capture.

    PubMed

    Dugat, Thibaud; Loux, Valentin; Marthey, Sylvain; Moroldo, Marco; Lagrée, Anne-Claire; Boulouis, Henri-Jean; Haddad, Nadia; Maillard, Renaud

    2014-11-17

    Anaplasma phagocytophilum is a zoonotic and obligate intracellular bacterium transmitted by ticks. In domestic ruminants, it is the causative agent of tick-borne fever, which causes significant economic losses in Europe. As A. phagocytophilum is difficult to isolate and cultivate, only nine genome sequences have been published to date, none of which originate from a bovine strain. Our goals were to: 1) develop a sequencing methodology which efficiently circumvents the difficulties associated with A. phagocytophilum isolation and culture; 2) describe the first genome of a bovine strain; and 3) compare it with available genomes, in order both to explore key genomic features at the species level and to identify candidate genes that could be specific to bovine strains. DNA was extracted from a bovine blood sample infected by A. phagocytophilum. Following a whole genome capture approach, A. phagocytophilum DNA was enriched 197-fold in the sample and then sequenced using Illumina technology. In total, 58.9% of obtained reads corresponded to the A. phagocytophilum genome, covering 85.3% of the HZ genome. Then, by performing comparisons with nine previously sequenced A. phagocytophilum genomes, we determined the core genome of these ten strains. Following analysis, 1281 coding DNA sequences, including 1001 complete sequences, were detected in the A. phagocytophilum bovine genome, of which four appeared to be unique to the bovine isolate. These four coding DNA sequences coded for "hypothetical proteins of unknown function" and require further analysis. We also identified nine proteins common to both European domestic ruminants tested. Using a whole genome capture approach, we have sequenced the first A. phagocytophilum genome isolated from a cow. To the best of our knowledge, this is the first time that this method has been used to selectively enrich pathogenic bacterial DNA from samples also containing host DNA. The four proteins unique to the A. phagocytophilum bovine

  7. Genomic and comparative genomic analyses of Rickettsia heilongjiangensis provide insight into its evolution and pathogenesis.

    PubMed

    Duan, Changsong; Xiong, Xiaolu; Qi, Yong; Gong, Wenping; Jiao, Jun; Wen, Bohai

    2014-08-01

    Rickettsia heilongjiangensis, the causative agent of far eastern spotted fever, is an obligate intracellular gram-negative bacterium that belongs to the spotted fever group rickettsiae. To understand the evolution and pathogenesis of R. heilongjiangensis, we analyzed its genome and compared it with other rickettsial genomes available in GenBank. The R. heilongjiangensis chromosome contains 1333 genes, including 1297 protein coding genes and 36 RNA coding genes. The genome also contains 121 pseudogenes, 54 insertion sequences, and 39 tandem repeats. Sixteen genes encoding the major components of the type IV secretion systems were identified in the R. heilongjiangensis genome. In total, 37 β-barrel outer membrane proteins were predicted in the genome, eight of which have been previously confirmed to be outer membrane proteins. In addition, 266 potential virulence factor genes, seven partially deleted antibiotic resistance genes, and a genomic island were identified in the genome. The codon usage in the genome is compatible with its low GC content, and the amino acid usage shows apparent bias. A comparative genomic analysis showed that R. heilongjiangensis and R. japonica share one unique fragment that may be a target sequence for a diagnostic assay. The orthologs of 37 genes of R. heilongjiangensis were found in pathogenic R. rickettsii str. Sheila Smith but not in non-pathogenic R. rickettsii str. Iowa, which may explain why R. heilongjiangensis is pathogenic. Pan-genome analysis showed that R. heilongjiangensis and 42 other rickettsiae strains share 693 core genes with a pan-genome size of 4837 genes. The pan-genome-based phylogeny showed that R. heilongjiangensis was closely related to R. japonica.
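
    The core-genome and pan-genome figures reported in this abstract rest on a simple set relationship: the pan-genome is the union of gene families over all strains, and the core genome is their intersection. A minimal sketch follows, assuming each strain has already been reduced to a set of ortholog-group identifiers; the strain names, identifiers, and function name are hypothetical toy values, not data from the study.

```python
def pan_and_core(gene_sets):
    """Return (pan_genome, core_genome) for a dict mapping
    strain name -> set of ortholog-group identifiers."""
    if not gene_sets:
        raise ValueError("need at least one strain")
    sets = list(gene_sets.values())
    pan = set().union(*sets)         # present in at least one strain
    core = set.intersection(*sets)   # present in every strain
    return pan, core


# Toy presence data (hypothetical identifiers).
toy_strains = {
    "strain_A": {"og1", "og2", "og3", "og4"},
    "strain_B": {"og1", "og2", "og5"},
    "strain_C": {"og1", "og2", "og3", "og6"},
}

pan, core = pan_and_core(toy_strains)
accessory = pan - core  # families missing from at least one strain
print(len(pan), sorted(core), sorted(accessory))
```

    The hard part in practice is the step assumed away here, namely assigning genes to ortholog groups; once that is done, core and pan sizes follow directly from the sets.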

  8. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    PubMed

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; van Hylckama Vlieg, Johan E T; Siezen, Roland J

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25 and 53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison were used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link

  9. Sputnik: a database platform for comparative plant genomics.

    PubMed

    Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F X

    2003-01-01

    Two million plant ESTs, from 20 different plant species and totalling more than 1000 Mbp of DNA sequence, represent a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silico comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state-of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics.

  10. The MicrobesOnline Web site for comparative genomics

    SciTech Connect

    Alm, Eric J.; Huang, Katherine H.; Price, Morgan N.; Koche, Richard P.; Keller, Keith; Dubchak, Inna L.; Arkin, Adam P.

    2004-11-05

    At present, hundreds of microbial genomes have been sequenced, and hundreds more are currently in the pipeline. The Virtual Institute for Microbial Stress and Survival has developed a publicly available suite of Web-based comparative genomic tools (http://www.microbesonline.org) designed to facilitate multispecies comparison among prokaryotes. Highlights of the MicrobesOnline Web site include operon and regulon predictions, a multispecies genome browser, a multispecies Gene Ontology browser, a comparative KEGG metabolic pathway viewer, a Bioinformatics Workbench for in-depth sequence analysis, and Gene Carts that allow users to save genes of interest for further study while they browse. In addition, we provide an interface for genome annotation, which, like all of the tools reported here, is freely available to the scientific community.

  11. Comparative genomics of transcriptional regulation of methionine metabolism in Proteobacteria.

    PubMed

    Leyn, Semen A; Suvorova, Inna A; Kholina, Tatiana D; Sherstneva, Sofia S; Novichkov, Pavel S; Gelfand, Mikhail S; Rodionov, Dmitry A

    2014-01-01

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.

  12. Sputnik: a database platform for comparative plant genomics

    PubMed Central

    Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F.X.

    2003-01-01

    Two million plant ESTs, from 20 different plant species and totalling more than 1000 Mbp of DNA sequence, represent a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silico comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state-of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics. PMID:12519965

  13. Genome of the Extremely Radiation-Resistant Bacterium Deinococcus radiodurans Viewed from the Perspective of Comparative Genomics

    PubMed Central

    Makarova, Kira S.; Aravind, L.; Wolf, Yuri I.; Tatusov, Roman L.; Minton, Kenneth W.; Koonin, Eugene V.; Daly, Michael J.

    2001-01-01

    The bacterium Deinococcus radiodurans shows remarkable resistance to a range of damage caused by ionizing radiation, desiccation, UV radiation, oxidizing agents, and electrophilic mutagens. D. radiodurans is best known for its extreme resistance to ionizing radiation; not only can it grow continuously in the presence of chronic radiation (6 kilorads/h), but also it can survive acute exposures to gamma radiation exceeding 1,500 kilorads without dying or undergoing induced mutation. These characteristics were the impetus for sequencing the genome of D. radiodurans and the ongoing development of its use for bioremediation of radioactive wastes. Although it is known that these multiple resistance phenotypes stem from efficient DNA repair processes, the mechanisms underlying these extraordinary repair capabilities remain poorly understood. In this work we present an extensive comparative sequence analysis of the Deinococcus genome. Deinococcus is the first representative with a completely sequenced genome from a distinct bacterial lineage of extremophiles, the Thermus-Deinococcus group. Phylogenetic tree analysis, combined with the identification of several synapomorphies between Thermus and Deinococcus, supports the hypothesis that it is an ancient group with no clear affinities to any of the other known bacterial lineages. Distinctive features of the Deinococcus genome as well as features shared with other free-living bacteria were revealed by comparison of its proteome to the collection of clusters of orthologous groups of proteins. Analysis of paralogs in Deinococcus has revealed several unique protein families. In addition, specific expansions of several other families including phosphatases, proteases, acyltransferases, and Nudix family pyrophosphohydrolases were detected. Genes that potentially affect DNA repair and recombination and stress responses were investigated in detail. Some proteins appear to have been horizontally transferred from eukaryotes and are

  14. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    PubMed Central

    Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J

    2008-01-01

    Background: The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results: PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion: PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client

  15. Comparative genome-scale metabolic modeling of actinomycetes: the topology of essential core metabolism.

    PubMed

    Alam, Mohammad Tauqeer; Medema, Marnix H; Takano, Eriko; Breitling, Rainer

    2011-07-21

    Actinomycetes are highly important bacteria. On the one hand, some of them cause severe human and plant diseases; on the other hand, many species are known for their ability to produce antibiotics. Here we report the results of a comparative analysis of genome-scale metabolic models of 37 species of actinomycetes. Based on in silico knockouts we generated topological and genomic maps for each organism. Combining the collection of genome-wide models, we constructed a global enzyme association network to identify both a conserved "core network" and an "essential core network" of the entire group. As has been reported for low-degree metabolites in several organisms, low-degree enzymes (in linear pathways) turn out to be generally more essential than high-degree enzymes (in metabolic hubs).
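
    The degree-versus-essentiality comparison in this abstract can be illustrated with a minimal sketch. The abstract does not say how the enzyme association network was built or how essentiality was scored, so the edge list, the essential set, and the degree threshold below are all hypothetical toy values, not data from the study.

```python
from collections import defaultdict


def degrees(edges):
    """Return {node: degree} for an undirected network given as (a, b) pairs."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {node: len(adj) for node, adj in neighbors.items()}


def fraction_essential(nodes, essential):
    """Fraction of `nodes` that appear in the `essential` set (NaN if empty)."""
    if not nodes:
        return float("nan")
    return sum(node in essential for node in nodes) / len(nodes)


def essential_fraction_by_degree(edges, essential, threshold):
    """Compare essentiality of low-degree (<= threshold) and high-degree nodes."""
    deg = degrees(edges)
    low = [n for n, d in deg.items() if d <= threshold]
    high = [n for n, d in deg.items() if d > threshold]
    return fraction_essential(low, essential), fraction_essential(high, essential)


# Hypothetical toy network: a linear chain E1-E2-E3 feeding a hub H.
toy_edges = [("E1", "E2"), ("E2", "E3"), ("E3", "H"),
             ("H", "E4"), ("H", "E5"), ("H", "E6")]
toy_essential = {"E1", "E2", "E3"}  # made-up knockout result

low_frac, high_frac = essential_fraction_by_degree(toy_edges, toy_essential, threshold=2)
print(low_frac, high_frac)
```

    In a real analysis the essential set would come from in silico knockouts of each genome-scale model, and the threshold would need to be justified rather than picked by hand.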

  16. Alfresco—A Workbench for Comparative Genomic Sequence Analysis

    PubMed Central

    Jareborg, Niclas; Durbin, Richard

    2000-01-01

    Comparative analysis of genomic sequences provides a powerful tool for identifying regions of potential biologic function; by comparing corresponding regions of genomes from suitable species, protein coding or regulatory regions can be identified by their homology. This requires the use of several specific types of computational analysis tools. Many programs exist for these types of analysis; not many exist for overall view/control of the results, which is necessary for large-scale genomic sequence analysis. Using Java, we have developed a new visualization tool that allows effective comparative genome sequence analysis. The program handles a pair of sequences from putatively homologous regions in different species. Results from various different existing external analysis programs, such as database searching, gene prediction, repeat masking, and alignment programs, are visualized and used to find corresponding functional sequence domains in the two sequences. The user interacts with the program through a graphic display of the genome regions, in which an independently scrollable and zoomable symbolic representation of the sequences is shown. As an example, the analysis of two unannotated orthologous genomic sequences from human and mouse containing parts of the UTY locus is presented. PMID:10958633

  17. Gramene 2016: comparative plant genomics and pathway resources.

    PubMed

    Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

    2016-01-04

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  18. Gramene 2016: comparative plant genomics and pathway resources

    PubMed Central

    Tello-Ruiz, Marcela K.; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M.; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A.; Huerta, Laura; Keays, Maria; Tang, Y. Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J.; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  19. Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes

    SciTech Connect

    Gupta, Nitin; Benhamida, Jamal; Bhargava, Vipul; Goodman, Daniel; Kain, Elisabeth; Kerman, Ian; Nguyen, Ngan; Ollikainen, Noah; Rodriguez, Jesse; Wang, J.; Lipton, Mary S.; Romine, Margaret F.; Bafna, Vineet; Smith, Richard D.; Pevzner, Pavel A.

    2008-07-30

    While bacterial genome annotations have significantly improved in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. At the same time, the number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes, let alone annotate bacterial proteomes. In this study, we use tandem mass spectrometry (MS/MS) to annotate the proteome of Shewanella oneidensis MR-1, an important microbe for bioremediation. In particular, we provide the first comprehensive map of post-translational modifications in a bacterial genome, including a large number of chemical modifications, signal peptide cleavages and cleavage of N-terminal methionine residues. We also detect multiple genes that were missed or assigned incorrect start positions by gene prediction programs and suggest corrections to improve the gene annotation. This study demonstrates that complementing every genome sequencing project by an MS/MS project would significantly improve both genome and proteome annotations for a reasonable cost.

  20. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    PubMed

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a genus of Gram-negative bacteria that includes serious pathogens such as Yersinia pestis, which causes plague, as well as Yersinia pseudotuberculosis and Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly growing Yersinia genomic data and to provide analysis tools, particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate ongoing and future research on Yersinia, especially on the species generally considered non-pathogenic, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include the JavaScript-based genome browser (JBrowse) and the Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic trees of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the

  1. Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia

    PubMed Central

    Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M.

    2016-01-01

    Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to explaining the dynamics of their composition, the evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in the other mammalian species considered here. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction, pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. The second is that chromatin conformation, an aspect that has often been overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints. PMID:28175287

  2. SUPERFAMILY--sophisticated comparative genomics, data mining, visualization and phylogeny.

    PubMed

    Wilson, Derek; Pethica, Ralph; Zhou, Yiduo; Talbot, Charles; Vogel, Christine; Madera, Martin; Chothia, Cyrus; Gough, Julian

    2009-01-01

    SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site.

  3. SUPERFAMILY—sophisticated comparative genomics, data mining, visualization and phylogeny

    PubMed Central

    Wilson, Derek; Pethica, Ralph; Zhou, Yiduo; Talbot, Charles; Vogel, Christine; Madera, Martin; Chothia, Cyrus; Gough, Julian

    2009-01-01

    SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site. PMID:19036790

  4. Comparative Genomic and Phylogenomic Analyses Reveal a Conserved Core Genome Shared by Estuarine and Oceanic Cyanopodoviruses.

    PubMed

    Huang, Sijun; Zhang, Si; Jiao, Nianzhi; Chen, Feng

    2015-01-01

    Podoviruses are among the major viral groups that infect the marine picocyanobacteria Prochlorococcus and Synechococcus. Here, we report the genome sequences of five Synechococcus podoviruses isolated from the estuarine environment, and perform comparative genomic and phylogenomic analyses based on a total of 20 cyanopodovirus genomes. The genomes of all the known marine cyanopodoviruses are highly syntenic. A pan-genome of 349 clustered orthologous groups was determined, among which 15 were core genes. These core genes make up nearly half of each genome in length, reflecting the high level of genome conservation among this cyanophage type. The whole genome phylogenies based on concatenated core genes and gene content were highly consistent and confirmed the separation of two discrete marine cyanopodovirus clusters MPP-A and MPP-B. The genomes within cluster MPP-B grouped into subclusters mainly corresponding to Prochlorococcus or Synechococcus host types. Auxiliary metabolic genes tend to occur in a specific phylogenetic group of these cyanopodoviruses. All the MPP-B phages analyzed here encode the photosynthesis gene psbA, which is absent from all the MPP-A genomes thus far. Interestingly, all the MPP-B and two MPP-A Synechococcus podoviruses encode the thymidylate synthase gene thyX, while at the same genome locus all the MPP-B Prochlorococcus podoviruses encode the transaldolase gene talC. Both genes are hypothesized to have the potential to facilitate the biosynthesis of deoxynucleotides for phage replication. Inheritance of specific functional genes could be important to the evolution and ecological fitness of certain cyanophage genotypes. Our analyses demonstrate that cyanopodoviruses of estuarine and oceanic origins share a conserved core genome and suggest that accessory genes may be related to environmental adaptation.
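
    The pan-genome and core-genome counts quoted above follow from simple set operations once each genome has been reduced to the set of orthologous groups it contains. The sketch below assumes that reduction has already been done. The clustering method is not given in the abstract, and the genome names and group identifiers are hypothetical toy values.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan, core) sets of orthologous-group IDs.

    pan  = groups found in at least one genome (union)
    core = groups found in every genome (intersection)
    """
    if not genomes:
        return set(), set()
    groups = list(genomes.values())
    pan = set().union(*groups)
    core = set.intersection(*groups)
    return pan, core


# Hypothetical toy input: three genomes, six orthologous groups in total.
toy = {
    "phage_A": {"og1", "og2", "og3"},
    "phage_B": {"og1", "og2", "og4"},
    "phage_C": {"og1", "og2", "og5", "og6"},
}

pan, core = pan_and_core(toy)
accessory = pan - core  # groups missing from at least one genome
print(len(pan), sorted(core), sorted(accessory))
```

    With real data the same function would take 20 genome sets as input; the hard part is the orthology clustering that produces those sets, which this sketch does not attempt.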

  5. The surprising diversity of clostridial hydrogenases: a comparative genomic perspective.

    PubMed

    Calusinska, Magdalena; Happe, Thomas; Joris, Bernard; Wilmotte, Annick

    2010-06-01

    Among the large variety of micro-organisms capable of fermentative hydrogen production, strict anaerobes such as members of the genus Clostridium are the most widely studied. They can produce hydrogen by a reversible reduction of protons accumulated during fermentation to dihydrogen, a reaction which is catalysed by hydrogenases. Sequenced genomes provide completely new insights into the diversity of clostridial hydrogenases. Building on previous reports, we found that [FeFe] hydrogenases are not a homogeneous group of enzymes, but exist in multiple forms with different modular structures and are especially abundant in members of the genus Clostridium. This unusual diversity seems to support the central role of hydrogenases in cell metabolism. In particular, the presence of multiple putative operons encoding multisubunit [FeFe] hydrogenases highlights the fact that hydrogen metabolism is very complex in this genus. In contrast with [FeFe] hydrogenases, their [NiFe] hydrogenase counterparts, widely represented in other bacteria and archaea, are found in only a few clostridial species. Surprisingly, a heteromultimeric Ech hydrogenase, known to be an energy-converting [NiFe] hydrogenase and previously described only in methanogenic archaea and some sulfur-reducing bacteria, was found to be encoded by the genomes of four cellulolytic strains: Clostridium cellulolyticum, Clostridium papyrosolvens, Clostridium thermocellum and Clostridium phytofermentans.

  6. Piggy-BACing the human genome I: constructing a porcine BAC physical map through comparative genomics.

    PubMed

    Rogatcheva, Margarita B; Chen, Kefei; Larkin, Denis M; Meyers, Stacey N; Marron, Brandy M; He, Weisong; Schook, Lawrence B; Beever, Jonathan E

    2008-01-01

    Availability of the human genome sequence and high similarity between humans and pigs at the molecular level provides an opportunity to use a comparative mapping approach to piggy-BAC the human genome. In order to advance the pig genome sequencing initiative, sequence similarity between large-scale porcine BAC-end sequences (BESs) and human genome sequence was used to construct a comparatively-anchored porcine physical map that is a first step towards sequencing the pig genome. A total of 50,300 porcine BAC clones were end-sequenced, yielding 76,906 BESs after trimming with an average read length of 538 bp. To anchor the porcine BACs on the human genome, these BESs were subjected to BLAST analysis using the human draft sequence, revealing 31.5% significant hits (E < e^-5). Both genic and non-genic regions of homology contributed to the alignments between the human and porcine genomes. Porcine BESs with unique homology matches within the human genome provided a source of markers spaced approximately 70 to 300 kb along each human chromosome. In order to evaluate the utility of piggy-BACing human genome sequences, and confirm predictions of orthology, 193 evenly spaced BESs with similarity to HSA3 and HSA21 were selected and then utilized for developing a high-resolution (1.22 Mb) comparative radiation hybrid map of SSC13 that represents a fusion of HSA3 and HSA21. Resulting RH mapping of SSC13 covers 99% and 97% of HSA3 and HSA21, respectively. Seven evolutionary conserved blocks were identified including six on HSA3 and a single syntenic block corresponding to HSA21. The strategy of piggy-BACing the human genome described in this study demonstrates that through a directed, targeted comparative genomics approach construction of a high-resolution anchored physical map of the pig genome can be achieved. This map supports the selection of BACs to construct a minimal tiling path for genome sequencing and targeted gap filling. Moreover, this approach is highly relevant

  7. Chromosomal targeting by CRISPR-Cas systems can contribute to genome plasticity in bacteria.

    PubMed

    Dy, Ron L; Pitman, Andrew R; Fineran, Peter C

    2013-09-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins form adaptive immune systems in bacteria to combat phage and other foreign genetic elements. Typically, short spacer sequences are acquired from the invader DNA and incorporated into CRISPR arrays in the bacterial genome. Small RNAs are generated that contain these spacer sequences and enable sequence-specific destruction of the foreign nucleic acids. Occasionally, spacers are acquired from the chromosome, which instead leads to targeting of the host genome. Chromosomal targeting is highly toxic to the bacterium, providing a strong selective pressure for a variety of evolutionary routes that enable host cell survival. Mutations that inactivate the CRISPR-Cas functionality, such as within the cas genes, CRISPR repeat, protospacer adjacent motifs (PAM), and target sequence, mediate escape from toxicity. This self-targeting might provide some explanation for the incomplete distribution of CRISPR-Cas systems in less than half of sequenced bacterial genomes. More importantly, self-genome targeting can cause large-scale genomic alterations, including remodeling or deletion of pathogenicity islands and other non-mobile chromosomal regions. While control of horizontal gene transfer is perceived as their main function, our recent work illuminates an alternative role of CRISPR-Cas systems in causing host genomic changes and influencing bacterial evolution.
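
    The idea of a self-targeting spacer can be made concrete with a small sketch: a spacer is flagged when it matches the chromosome somewhere outside its own CRISPR array. The function names, the exact-match criterion, and the toy sequences below are assumptions for illustration only. The abstract does not describe a detection procedure, and real analyses would typically tolerate mismatches and also check for an adjacent PAM.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_self_targets(genome: str, spacers, array_start: int, array_end: int):
    """Return (spacer, position, strand) for exact matches outside the array.

    The CRISPR array is assumed to occupy genome[array_start:array_end];
    matches lying entirely inside it are the spacers themselves and are skipped.
    """
    hits = []
    for spacer in spacers:
        for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
            pos = genome.find(query)
            while pos != -1:
                inside_array = array_start <= pos and pos + len(query) <= array_end
                if not inside_array:
                    hits.append((spacer, pos, strand))
                pos = genome.find(query, pos + 1)  # allow overlapping matches
    return hits


# Hypothetical toy chromosome: a protospacer at position 4, then a mock
# "array" (positions 15-29) that carries the same sequence as a spacer.
toy_genome = "TTTTGATTACATTTT" + "GGGG" + "GATTACA" + "GGGG"
toy_hits = find_self_targets(toy_genome, ["GATTACA"], array_start=15, array_end=30)
print(toy_hits)
```

    A palindromic spacer would be reported once per strand by this sketch, which is acceptable for illustration but would need deduplication in practice.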

  8. Chromosomal targeting by CRISPR-Cas systems can contribute to genome plasticity in bacteria

    PubMed Central

    Dy, Ron L; Pitman, Andrew R; Fineran, Peter C

    2013-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins form adaptive immune systems in bacteria to combat phage and other foreign genetic elements. Typically, short spacer sequences are acquired from the invader DNA and incorporated into CRISPR arrays in the bacterial genome. Small RNAs are generated that contain these spacer sequences and enable sequence-specific destruction of the foreign nucleic acids. Occasionally, spacers are acquired from the chromosome, which instead leads to targeting of the host genome. Chromosomal targeting is highly toxic to the bacterium, providing a strong selective pressure for a variety of evolutionary routes that enable host cell survival. Mutations that inactivate the CRISPR-Cas functionality, such as within the cas genes, CRISPR repeat, protospacer adjacent motifs (PAM), and target sequence, mediate escape from toxicity. This self-targeting might provide some explanation for the incomplete distribution of CRISPR-Cas systems in less than half of sequenced bacterial genomes. More importantly, self-genome targeting can cause large-scale genomic alterations, including remodeling or deletion of pathogenicity islands and other non-mobile chromosomal regions. While control of horizontal gene transfer is perceived as their main function, our recent work illuminates an alternative role of CRISPR-Cas systems in causing host genomic changes and influencing bacterial evolution. PMID:24251073

  9. IMG/M: integrated genome and metagenome comparative data analysis system.

    PubMed

    Chen, I-Min A; Markowitz, Victor M; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N; Kyrpides, Nikos C

    2017-01-04

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  10. IMG/M: integrated genome and metagenome comparative data analysis system

    PubMed Central

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2017-01-01

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system. PMID:27738135

  11. IMG/M: integrated genome and metagenome comparative data analysis system

    DOE PAGES

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; ...

    2016-10-13

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  12. IMG/M: integrated genome and metagenome comparative data analysis system

    SciTech Connect

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2016-10-13

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  13. Genome evolution in an ancient bacteria-ant symbiosis: parallel gene loss among Blochmannia spanning the origin of the ant tribe Camponotini

    PubMed Central

    Williams, Laura E.

    2015-01-01

    Stable associations between bacterial endosymbionts and insect hosts provide opportunities to explore genome evolution in the context of established mutualisms and assess the roles of selection and genetic drift across host lineages and habitats. Blochmannia, obligate endosymbionts of ants of the tribe Camponotini, have coevolved with their ant hosts for ∼40 MY. To investigate early events in Blochmannia genome evolution across this ant host tribe, we sequenced Blochmannia from two divergent host lineages, Colobopsis obliquus and Polyrhachis turneri, and compared them with four published genomes from Blochmannia of Camponotus sensu stricto. Reconstructed gene content of the last common ancestor (LCA) of these six Blochmannia genomes is reduced (690 protein coding genes), consistent with rapid gene loss soon after establishment of the symbiosis. Differential gene loss among Blochmannia lineages has affected cellular functions and metabolic pathways, including DNA replication and repair, vitamin biosynthesis and membrane proteins. Blochmannia of P. turneri (i.e., B. turneri) encodes an intact DnaA chromosomal replication initiation protein, demonstrating that loss of dnaA was not essential for establishment of the symbiosis. Based on gene content, B. obliquus and B. turneri are unable to provision hosts with riboflavin. Of the six sequenced Blochmannia, B. obliquus is the earliest diverging lineage (i.e., the sister group of other Blochmannia sampled) and encodes the fewest protein-coding genes and the most pseudogenes. We identified 55 genes involved in parallel gene loss, including glutamine synthetase, which may participate in nitrogen recycling. Pathways for biosynthesis of coenzyme A, terpenoids and riboflavin were lost in multiple lineages, suggesting relaxed selection on the pathway after inactivation of one component. Analysis of Illumina read datasets did not detect evidence of plasmids encoding missing functions, nor the presence of coresident symbionts

  14. Reprogramming Bacteria to Seek and Destroy Small Molecules (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    ScienceCinema

    Gallivan, Justin [Emory University]

    2016-07-12

    Justin Gallivan of Emory University presents a talk titled "Reprogramming Bacteria to Seek and Destroy Small Molecules" at the JGI 7th Annual User Meeting: Genomics of Energy & Environment, on March 21, 2012 in Walnut Creek, Calif.

  15. Reprogramming Bacteria to Seek and Destroy Small Molecules (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    SciTech Connect

    Gallivan, Justin

    2012-03-21

    Justin Gallivan of Emory University presents a talk titled "Reprogramming Bacteria to Seek and Destroy Small Molecules" at the JGI 7th Annual User Meeting: Genomics of Energy & Environment, on March 21, 2012 in Walnut Creek, Calif.

  16. OGRe: a relational database for comparative analysis of mitochondrial genomes

    PubMed Central

    Jameson, Daniel; Gibson, Andrew P.; Hudelot, Cendrine; Higgs, Paul G.

    2003-01-01

    Organellar Genome Retrieval (OGRe) is a relational database of complete mitochondrial genome sequences for over 250 Metazoan species. OGRe provides a resource for the comparative analysis of mitochondrial genomes at several levels. At the sequence level, OGRe allows the retrieval of any selected set of mitochondrial genes from any selected set of species. Species are classified using a taxonomic system that allows easy selection of related groups of species. Sequence alignments are also available for some species. At the level of individual nucleotides, the system contains information on base frequencies and codon usage frequencies that can be compared between organisms. At the level of whole genomes, OGRe provides several ways of visualizing information on gene order. Diagrams illustrating the genome arrangement can be generated for any selected set of species automatically from the information in the database. Searches can be done based on gene arrangement to find sets of species that have the same order as one another. Diagrams for pairwise comparison of species can be produced that show the positions of break-points in the gene order and use colour to highlight the sections of the genome that have moved. OGRe is available from http://www.bioinf.man.ac.uk/ogre. PMID:12519982
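
    The break-point comparison of gene order described above can be illustrated with a minimal sketch. It is not OGRe's implementation: the function names, the gene lists, and the choice to treat genomes as circular and to ignore strand are all assumptions made for illustration.

```python
def adjacencies(order, circular=True):
    """Return the set of unordered neighbouring gene pairs in a gene order."""
    pairs = list(zip(order, order[1:]))
    if circular and len(order) > 1:
        # Close the circle: the last gene is adjacent to the first.
        pairs.append((order[-1], order[0]))
    return {frozenset(p) for p in pairs}


def breakpoints(order_a, order_b, circular=True):
    """List adjacencies of order_a that are absent from order_b."""
    adj_b = adjacencies(order_b, circular)
    pairs = list(zip(order_a, order_a[1:]))
    if circular and len(order_a) > 1:
        pairs.append((order_a[-1], order_a[0]))
    return [p for p in pairs if frozenset(p) not in adj_b]


# Hypothetical gene orders for two species (strand is ignored here).
species_a = ["cox1", "cox2", "atp8", "atp6", "cox3"]
species_b = ["cox1", "atp8", "cox2", "atp6", "cox3"]

print(breakpoints(species_a, species_b))
```

    Two species with the same circular gene order yield an empty list, which is one simple way to express a search for species that share an arrangement.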

  17. DCODE.ORG Anthology of Comparative Genomic Tools

    SciTech Connect

    Loots, G G; Ovcharenko, I

    2005-01-11

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.

  18. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
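
    The phylogenetic-profile idea can be sketched in a few lines. This is an illustrative sketch only, not the patented method: the family names, genome names, presence/absence data, and the Hamming-distance cutoff are all hypothetical.

```python
from itertools import combinations

# Hypothetical data: protein family -> genomes in which a homolog is found.
presence = {
    "famA": {"g1", "g2", "g4"},
    "famB": {"g1", "g2", "g4"},
    "famC": {"g3"},
    "famD": {"g1", "g3", "g4"},
}
genomes = ["g1", "g2", "g3", "g4"]


def profile(family):
    """Presence/absence profile as a tuple of 0/1, one entry per genome."""
    return tuple(int(g in presence[family]) for g in genomes)


def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


def linked_pairs(max_distance=0):
    """Family pairs whose profiles differ in at most max_distance genomes."""
    return [
        (f1, f2)
        for f1, f2 in combinations(sorted(presence), 2)
        if hamming(profile(f1), profile(f2)) <= max_distance
    ]


print(profile("famA"))
print(linked_pairs())
```

    Families with identical or near-identical profiles are candidates for a functional link. The cutoff is a tunable assumption, and real data would call for a statistical test rather than a fixed distance.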

  19. Phytozome: a Tool for Green Plant Comparative Genomics

    DOE Data Explorer

    Phytozome is a joint project of the Department of Energy's Joint Genome Institute and the Center for Integrative Genomics to facilitate comparative genomic studies amongst green plants. Clusters of orthologous and paralogous genes that represent the modern descendants of ancestral gene sets are constructed at key phylogenetic nodes. These clusters allow easy access to clade specific orthology/paralogy relationships as well as clade specific genes and gene expansions. As of release v4.0, Phytozome provides access to nine sequenced and annotated green plant genomes, eight of which have been clustered into gene families at six evolutionarily significant nodes. Where possible, each gene has been annotated with PFAM, KOG, KEGG, and PANTHER assignments, and publicly available annotations from RefSeq, UniProt, TAIR, JGI are hyper-linked and searchable. [Copied from the Overview at http://www.phytozome.net/Phytozome_info.php]

  20. FLAGdb(++): A Bioinformatic Environment to Study and Compare Plant Genomes.

    PubMed

    Tamby, Jean Philippe; Brunaud, Véronique

    2017-01-01

    Today, the growing knowledge and data accumulation on plant genomes do not solve in a simple way the task of gene function inference. Because data of different types are coming from various sources, we need to integrate and analyze them to help biologists in this task. We created FLAGdb(++) ( http://tools.ips2.u-psud.fr/FLAGdb ) to take up this challenge for a selection of plant genomes. In order to enrich gene function predictions, structural and functional annotations of the genomes are explored to generate meta-data and to compare them. Since data are numerous and complex, we focused on accessibility and visualization with an original and user-friendly interface. In this chapter we present the main tools of FLAGdb(++) and a use-case to explore a gene family: structural and functional properties of this family and research of orthologous genes in the other plant genomes.

  1. Comparative Genomics of Mycobacteria: Some Answers, Yet More New Questions

    PubMed Central

    Behr, Marcel A.

    2015-01-01

    Comparative genomic studies permit a genus-level perspective on the distinction between environmental mycobacteria and Mycobacterium tuberculosis, as well as a species-level assessment of genetic variability within M. tuberculosis. Both of these strata of evolutionary analysis serve to generate hypotheses regarding the genomic basis of M. tuberculosis virulence. In contrasting lessons from macroevolutionary study and microevolutionary study, one can form predictions about which segments of the genome are likely to be essential for or dispensable for the pathogenesis of tuberculosis. Although some of these predictions have been experimentally verified, notable exceptions challenge the direct link between these virulence factors and the capacity of M. tuberculosis to successfully cause disease and propagate between human hosts. These unexpected findings serve as the stimulus for further studies, using genomic comparisons and other approaches, to better define the remarkable success of this recalcitrant pathogen. PMID:25395374

  2. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    SciTech Connect

    Brown, Steven D; Utturkar, Sagar M; Klingeman, Dawn Marie; Johnson, Courtney M; Martin, Stanton; Land, Miriam L; Lu, Tse-Yuan; Schadt, Christopher Warren; Doktycz, Mitchel John; Pelletier, Dale A

    2012-01-01

    To aid in the investigation of the Populus deltoides microbiome we generated draft genome sequences for twenty one Pseudomonas and twenty one other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Burkholderia, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium and Variovorax were generated.

  3. Genome sequencing of three bacteria associated to black band disease from a Colombian reef-building coral.

    PubMed

    Henao, Juan; Pérez, Hermes; Abril, Deisy; Ospina, Katterine; Piza, Adriana; Botero, Kelly; Rincón, Cristhian; Donato, Jhon; Hurtado, Andrea; García, Erika; Otero, Vanessa; Del Risco, Alexander; Guerra, Brenda; Cifuentes, Yina; Ordoñez, Alvaro; Rojas, Daniel; Suarez, Karen; Osorio, Daniel; Pinzón, Andrés

    2017-03-01

    We announce the draft genome sequence of three Gram-negative bacteria isolated from coral tissues affected with the black band disease (BBD), identified with the NCBI's Assembly Database accession numbers: MBQF, MAYB and MBQE. These genome drafts constitute a useful tool for the characterisation of these bacteria and for the understanding of the relationship between the microbial consortia associated with the disease and the onset and progression of the pathology.

  4. Malignant canine mammary tumours: Preliminary genomic insights using oligonucleotide array comparative genomic hybridisation analysis.

    PubMed

    Santos, Marta; Dias-Pereira, Patrícia; Williams, Christina; Lopes, Carlos; Breen, Matthew

    2017-03-28

    Neoplastic mammary disease in female dogs represents a major health concern for dog owners and veterinarians, but the genomic basis of the disease is poorly understood. In this study, we performed high resolution oligonucleotide array comparative genomic hybridisation (oaCGH) to assess genome wide DNA copy number changes in 10 malignant canine mammary tumours from seven female dogs, including multiple tumours collected at one time from each of three female dogs. In all but two tumours, genomic imbalances were detected, with losses being more common than gains. Canine chromosomes 9, 22, 26, 27, 34 and X were most frequently affected. Dissimilar oaCGH ratio profiles were observed in multiple tumours from the same dogs, providing preliminary evidence for probable independent pathogenesis. Analysis of adjacent samples of one tumour revealed regional differences in the number of genomic imbalances, suggesting heterogeneity within tumours.

  5. The tiger genome and comparative analysis with lion and snow leopard genomes.

    PubMed

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-Uk; Luo, Shu-Jin; Johnson, Warren E; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A; Marker, Laurie; Harper, Cindy; Miller, Susan M; Jacobs, Wilhelm; Bertola, Laura D; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O'Brien, Stephen J; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world's most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats' hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species.

  6. The tiger genome and comparative analysis with lion and snow leopard genomes

    PubMed Central

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-uk; Luo, Shu-Jin; Johnson, Warren E.; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A.; Marker, Laurie; Harper, Cindy; Miller, Susan M.; Jacobs, Wilhelm; Bertola, Laura D.; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O’Brien, Stephen J.; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world’s most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  7. USE OF COMPETITIVE GENOMIC HYBRIDIZATION TO ENRICH FOR GENOME-SPECIFIC DIFFERENCES BETWEEN TWO CLOSELY RELATED HUMAN FECAL INDICATOR BACTERIA

    EPA Science Inventory

    Enterococci are frequently used as indicators of fecal pollution in surface waters. To accelerate the identification of Enterococcus faecalis-specific DNA sequences, we employed a comparative genomic strategy utilizing a positive selection process to compare E. faec...

  8. USE OF COMPETITIVE GENOMIC HYBRIDIZATION TO ENRICH FOR GENOME-SPECIFIC DIFFERENCES BETWEEN TWO CLOSELY RELATED HUMAN FECAL INDICATOR BACTERIA

    EPA Science Inventory

    Enterococci are frequently used as indicators of fecal pollution in surface waters. To accelerate the identification of Enterococcus faecalis-specific DNA sequences, we employed a comparative genomic strategy utilizing a positive selection process to compare E. faec...

  9. Low-pass sequencing for microbial comparative genomics

    PubMed Central

    Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Victor Ng, Wailap; Hood, Leroy

    2004-01-01

    Background: We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results: As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion: Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich

  10. Low-pass sequencing for microbial comparative genomics.

    PubMed

    Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Ng, Wailap Victor; Hood, Leroy

    2004-01-12

    We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich genome of H. sp. NRC-1.

  11. CRISPR-based screening of genomic island excision events in bacteria.

    PubMed

    Selle, Kurt; Klaenhammer, Todd R; Barrangou, Rodolphe

    2015-06-30

    Genomic analysis of Streptococcus thermophilus revealed that mobile genetic elements (MGEs) likely contributed to gene acquisition and loss during evolutionary adaptation to milk. Clustered regularly interspaced short palindromic repeats-CRISPR-associated genes (CRISPR-Cas), the adaptive immune system in bacteria, limits genetic diversity by targeting MGEs including bacteriophages, transposons, and plasmids. CRISPR-Cas systems are widespread in streptococci, suggesting that the interplay between CRISPR-Cas systems and MGEs is one of the driving forces governing genome homeostasis in this genus. To investigate the genetic outcomes resulting from CRISPR-Cas targeting of integrated MGEs, in silico prediction revealed four genomic islands without essential genes in lengths from 8 to 102 kbp, totaling 7% of the genome. In this study, the endogenous CRISPR3 type II system was programmed to target the four islands independently through plasmid-based expression of engineered CRISPR arrays. Targeting lacZ within the largest 102-kbp genomic island was lethal to wild-type cells and resulted in a reduction of up to 2.5-log in the surviving population. Genotyping of Lac(-) survivors revealed variable deletion events between the flanking insertion-sequence elements, all resulting in elimination of the Lac-encoding island. Chimeric insertion sequence footprints were observed at the deletion junctions after targeting all of the four genomic islands, suggesting a common mechanism of deletion via recombination between flanking insertion sequences. These results established that self-targeting CRISPR-Cas systems may direct significant evolution of bacterial genomes on a population level, influencing genome homeostasis and remodeling.

  12. Comparative and functional genomic analyses of the pathogenicity of phytopathogen Xanthomonas campestris pv. campestris

    PubMed Central

    Qian, Wei; Jia, Yantao; Ren, Shuang-Xi; He, Yong-Qiang; Feng, Jia-Xun; Lu, Ling-Feng; Sun, Qihong; Ying, Ge; Tang, Dong-Jie; Tang, Hua; Wu, Wei; Hao, Pei; Wang, Lifeng; Jiang, Bo-Le; Zeng, Shenyan; Gu, Wen-Yi; Lu, Gang; Rong, Li; Tian, Yingchuan; Yao, Zhijian; Fu, Gang; Chen, Baoshan; Fang, Rongxiang; Qiang, Boqin; Chen, Zhu; Zhao, Guo-Ping; Tang, Ji-Liang; He, Chaozu

    2005-01-01

    Xanthomonas campestris pathovar campestris (Xcc) is the causative agent of crucifer black rot disease, which causes severe losses in agricultural yield world-wide. This bacterium is a model organism for studying plant-bacteria interactions. We sequenced the complete genome of Xcc 8004 (5,148,708 bp), which is highly conserved relative to that of Xcc ATCC 33913. Comparative genomics analysis indicated that, in addition to a significant genomic-scale rearrangement across the replication axis between two IS1478 elements, loss and acquisition of blocks of genes, rather than point mutations, constitute the main genetic variation between the two Xcc strains. Screening of a high-density transposon insertional mutant library (16,512 clones) of Xcc 8004 against a host plant (Brassica oleracea) identified 75 nonredundant, single-copy insertions in protein-coding sequences (CDSs) and intergenic regions. In addition to known virulence factors, full virulence was found to require several additional metabolic pathways and regulatory systems, such as fatty acid degradation, type IV secretion system, cell signaling, and amino acids and nucleotide metabolism. Among the identified pathogenicity-related genes, three of unknown function were found in Xcc 8004-specific chromosomal segments, revealing a direct correlation between genomic dynamics and Xcc virulence. The present combination of comparative and functional genomic analyses provides valuable information about the genetic basis of Xcc pathogenicity, which may offer novel insight toward the development of efficient methods for prevention of this important plant disease. PMID:15899963

  13. Comparative and functional genomic analyses of the pathogenicity of phytopathogen Xanthomonas campestris pv. campestris.

    PubMed

    Qian, Wei; Jia, Yantao; Ren, Shuang-Xi; He, Yong-Qiang; Feng, Jia-Xun; Lu, Ling-Feng; Sun, Qihong; Ying, Ge; Tang, Dong-Jie; Tang, Hua; Wu, Wei; Hao, Pei; Wang, Lifeng; Jiang, Bo-Le; Zeng, Shenyan; Gu, Wen-Yi; Lu, Gang; Rong, Li; Tian, Yingchuan; Yao, Zhijian; Fu, Gang; Chen, Baoshan; Fang, Rongxiang; Qiang, Boqin; Chen, Zhu; Zhao, Guo-Ping; Tang, Ji-Liang; He, Chaozu

    2005-06-01

    Xanthomonas campestris pathovar campestris (Xcc) is the causative agent of crucifer black rot disease, which causes severe losses in agricultural yield world-wide. This bacterium is a model organism for studying plant-bacteria interactions. We sequenced the complete genome of Xcc 8004 (5,148,708 bp), which is highly conserved relative to that of Xcc ATCC 33913. Comparative genomics analysis indicated that, in addition to a significant genomic-scale rearrangement across the replication axis between two IS1478 elements, loss and acquisition of blocks of genes, rather than point mutations, constitute the main genetic variation between the two Xcc strains. Screening of a high-density transposon insertional mutant library (16,512 clones) of Xcc 8004 against a host plant (Brassica oleracea) identified 75 nonredundant, single-copy insertions in protein-coding sequences (CDSs) and intergenic regions. In addition to known virulence factors, full virulence was found to require several additional metabolic pathways and regulatory systems, such as fatty acid degradation, type IV secretion system, cell signaling, and amino acids and nucleotide metabolism. Among the identified pathogenicity-related genes, three of unknown function were found in Xcc 8004-specific chromosomal segments, revealing a direct correlation between genomic dynamics and Xcc virulence. The present combination of comparative and functional genomic analyses provides valuable information about the genetic basis of Xcc pathogenicity, which may offer novel insight toward the development of efficient methods for prevention of this important plant disease.

  14. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

  15. Comparative analysis of the Oenococcus oeni pan genome reveals genetic diversity in industrially-relevant pathways

    PubMed Central

    2012-01-01

    Background: Oenococcus oeni, a member of the lactic acid bacteria, is one of a limited number of microorganisms that not only survive, but actively proliferate in wine. It is also unusual as, unlike the majority of bacteria present in wine, it is beneficial to wine quality rather than causing spoilage. These benefits are realised primarily through catalysing malolactic fermentation, but also through imparting other positive sensory properties. However, many of these industrially-important secondary attributes have been shown to be strain-dependent and their genetic basis is yet to be determined. Results: In order to investigate the scale and scope of genetic variation in O. oeni, we have performed whole-genome sequencing on eleven strains of this bacterium, bringing the total number of strains for which genome sequences are available to fourteen. While any single strain of O. oeni was shown to contain around 1800 protein-coding genes, in-depth comparative annotation based on genomic synteny and protein orthology identified over 2800 orthologous open reading frames that comprise the pan genome of this species, and less than 1200 genes that make up the conserved genomic core present in all of the strains. The expansion of the pan genome relative to the coding potential of individual strains was shown to be due to the varied presence and location of multiple distinct bacteriophage sequences and also in various metabolic functions with potential impacts on the industrial performance of this species, including cell wall exopolysaccharide biosynthesis, sugar transport and utilisation and amino acid biosynthesis. Conclusions: By providing a large cohort of sequenced strains, this study provides a broad insight into the genetic variation present within O. oeni. This data is vital to understanding and harnessing the phenotypic variation present in this economically-important species. PMID:22863143
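
    The pan genome and conserved core described above correspond to a set union and a set intersection over per-strain ortholog groups. The sketch below uses hypothetical strain names and ortholog-group identifiers. It is not the authors' annotation pipeline, which relied on synteny and protein orthology to define the groups in the first place.

```python
# Hypothetical strains mapped to the ortholog groups each one carries.
strains = {
    "strain1": {"og1", "og2", "og3"},
    "strain2": {"og1", "og2", "og4"},
    "strain3": {"og1", "og3", "og5"},
}


def pan_genome(strain_genes):
    """Ortholog groups found in at least one strain (set union)."""
    return set().union(*strain_genes.values())


def core_genome(strain_genes):
    """Ortholog groups found in every strain (set intersection).

    Assumes at least one strain is supplied.
    """
    return set.intersection(*strain_genes.values())


pan = pan_genome(strains)
core = core_genome(strains)
accessory = pan - core  # groups present in some strains but not all

print(len(pan), len(core), len(accessory))
```

    As more strains are added, the union can only grow and the intersection can only shrink, which matches the pattern in the abstract of a pan genome larger than any single strain and a core smaller than any single strain.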

  16. Bootstrap, Bayesian probability and maximum likelihood mapping: exploring new tools for comparative genome analyses

    PubMed Central

    Zhaxybayeva, Olga; Gogarten, J Peter

    2002-01-01

    Background Horizontal gene transfer (HGT) played an important role in shaping microbial genomes. In addition to genes under sporadic selection, HGT also affects housekeeping genes and those involved in information processing, even ribosomal RNA encoding genes. Here we describe tools that provide an assessment and graphic illustration of the mosaic nature of microbial genomes. Results We adapted the Maximum Likelihood (ML) mapping to the analyses of all detected quartets of orthologous genes found in four genomes. We have automated the assembly and analyses of these quartets of orthologs given the selection of four genomes. We compared the ML-mapping approach to more rigorous Bayesian probability and Bootstrap mapping techniques. The latter two approaches appear to be more conservative than the ML-mapping approach, but qualitatively all three approaches give equivalent results. All three tools were tested on mitochondrial genomes, which presumably were inherited as a single linkage group. Conclusions In some instances of interphylum relationships we find nearly equal numbers of quartets strongly supporting the three possible topologies. In contrast, our analyses of genome quartets containing the cyanobacterium Synechocystis sp. indicate that a large part of the cyanobacterial genome is related to that of low GC Gram positives. Other groups that had been suggested as sister groups to the cyanobacteria contain many fewer genes that group with the Synechocystis orthologs. Interdomain comparisons of genome quartets containing the archaeon Halobacterium sp. revealed that Halobacterium sp. shares more genes with Bacteria that live in the same environment than with Bacteria that are more closely related based on rRNA phylogeny. Many of these genes encode proteins involved in substrate transport and metabolism and in information storage and processing.
The performed analyses demonstrate that relationships among prokaryotes cannot be accurately depicted by or inferred from

  17. Comparative genomic analysis of two brucellaphages of distant origins.

    PubMed

    Flores, Victor; López-Merino, Ahidé; Mendoza-Hernandez, Guillermo; Guarneros, Gabriel

    2012-04-01

    Here, we present the first complete genome sequence of brucellaphage Tbilisi (Tb) and compare it with that of Pr, a broad host-range brucellaphage recently isolated in Mexico. The genomes consist of 41,148 bp (Tb) and 38,253 bp (Pr); they differ mainly in the region encoding structural proteins, in which the genome of Tb shows two major insertions. Both genomes share 99.87% nucleotide identity, a high percentage of identity for phages isolated at such globally distant locations and on temporally different occasions. Sequence analysis revealed 57 conserved ORFs, three transcriptional terminators and four putative transcriptional promoters. The co-occurrence of an ORF encoding a putative DnaA-like protein and a putative oriC-like origin of replication was found in both brucellaphage genomes, a feature not described in any other phage genome. These elements suggest that DNA replication in brucellaphages differs from that in other phages, and might resemble that of bacterial chromosomes. Copyright © 2012 Elsevier Inc. All rights reserved.
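
    The abstract does not state how the 99.87% nucleotide identity was computed. One common convention, sketched below as an assumption rather than the authors' method, counts matching bases only over alignment columns where neither sequence has a gap, so that large insertions change genome length without lowering the identity figure. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percent identity over columns of a pairwise alignment where neither row is a gap.

    Assumes aln_a and aln_b are two rows of the same alignment (equal length).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == gap or b == gap:
            continue  # insertion/deletion column: skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment: one gapped column (skipped) and one mismatch in 8 compared columns.
print(percent_identity("ACGT-ACGT", "ACGTTACCT"))  # 87.5
```

    Other conventions (for example, counting gap columns as mismatches) give lower values for the same alignment, so a reported identity is only comparable when the convention is known.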

  18. Sequencing and comparative analyses of the genomes of zoysiagrasses

    PubMed Central

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-01-01

    Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession ‘Nagirizaki’ (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella ‘Wakaba’ and Z. pacifica ‘Zanpa’ were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica ‘Kyoto’, Z. japonica ‘Miyagi’ and Z. matrella ‘Chiba Fair Green’, were accumulated, and aligned against the reference genome of ‘Nagirizaki’ along with those from ‘Wakaba’ and ‘Zanpa’. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the ‘Zoysia Genome Database’ at http://zoysia.kazusa.or.jp. PMID:26975196

  19. Metabolic Environments and Genomic Features Associated with Pathogenic and Mutualistic Interactions between Bacteria and Plants is accepted for publication in MPMI

    SciTech Connect

    Karpinets, Tatiana V; Park, Byung H; Syed, Mustafa H; Klotz, Martin G; Uberbacher, Edward C

    2014-01-01

    Most bacterial symbionts of plants are phenotypically characterized by their parasitic or mutualistic relationship with the host; however, the genomic characteristics that likely discriminate mutualistic symbionts from pathogens of plants are poorly understood. This study comparatively analyzed the genomes of 54 plant-symbiotic bacteria, 27 mutualists and 27 pathogens, to discover genomic determinants of their parasitic and mutualistic nature in terms of protein family domains, KEGG orthologous groups, metabolic pathways and families of carbohydrate-active enzymes (CAZymes). We further used all bacteria with sequenced genomes, published microarrays and transcriptomics experimental datasets, and literature to validate and to explore results of the comparison. The analysis revealed that genomes of mutualists are larger in size and higher in GC content and encode greater molecular, functional and metabolic diversity than the investigated genomes of pathogens. This enriched molecular and functional enzyme diversity included constructive biosynthetic signatures of CAZymes and metabolic pathways in genomes of mutualists compared with catabolic signatures dominant in the genomes of pathogens. Another discriminative characteristic of mutualists is the co-occurrence of gene clusters required for the expression and function of nitrogenase and RuBisCO. Analysis of previously published experimental data indicates that nitrogen-fixing mutualists may employ Rubisco to fix CO2 not in the canonical Calvin-Benson-Bassham cycle but in a novel metabolic pathway, here called Rubisco-based glycolysis, to increase efficiency of sugar utilization during the symbiosis with plants. An important discriminative characteristic of plant pathogenic bacteria is two groups of genes likely encoding effector proteins involved in host invasion and a genomic locus encoding a putative secretion system that includes a DUF1525 domain protein conserved in pathogens of plants and of other organisms.
The

  20. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  1. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  2. Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

    PubMed Central

    2011-01-01

    Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921

  3. Comparative and demographic analysis of orang-utan genomes.

    PubMed

    Locke, Devin P; Hillier, LaDeana W; Warren, Wesley C; Worley, Kim C; Nazareth, Lynne V; Muzny, Donna M; Yang, Shiaw-Pyng; Wang, Zhengyuan; Chinwalla, Asif T; Minx, Pat; Mitreva, Makedonka; Cook, Lisa; Delehaunty, Kim D; Fronick, Catrina; Schmidt, Heather; Fulton, Lucinda A; Fulton, Robert S; Nelson, Joanne O; Magrini, Vincent; Pohl, Craig; Graves, Tina A; Markovic, Chris; Cree, Andy; Dinh, Huyen H; Hume, Jennifer; Kovar, Christie L; Fowler, Gerald R; Lunter, Gerton; Meader, Stephen; Heger, Andreas; Ponting, Chris P; Marques-Bonet, Tomas; Alkan, Can; Chen, Lin; Cheng, Ze; Kidd, Jeffrey M; Eichler, Evan E; White, Simon; Searle, Stephen; Vilella, Albert J; Chen, Yuan; Flicek, Paul; Ma, Jian; Raney, Brian; Suh, Bernard; Burhans, Richard; Herrero, Javier; Haussler, David; Faria, Rui; Fernando, Olga; Darré, Fleur; Farré, Domènec; Gazave, Elodie; Oliva, Meritxell; Navarro, Arcadi; Roberto, Roberta; Capozzi, Oronzo; Archidiacono, Nicoletta; Della Valle, Giuliano; Purgato, Stefania; Rocchi, Mariano; Konkel, Miriam K; Walker, Jerilyn A; Ullmer, Brygg; Batzer, Mark A; Smit, Arian F A; Hubley, Robert; Casola, Claudio; Schrider, Daniel R; Hahn, Matthew W; Quesada, Victor; Puente, Xose S; Ordoñez, Gonzalo R; López-Otín, Carlos; Vinar, Tomas; Brejova, Brona; Ratan, Aakrosh; Harris, Robert S; Miller, Webb; Kosiol, Carolin; Lawson, Heather A; Taliwal, Vikas; Martins, André L; Siepel, Adam; Roychoudhury, Arindam; Ma, Xin; Degenhardt, Jeremiah; Bustamante, Carlos D; Gutenkunst, Ryan N; Mailund, Thomas; Dutheil, Julien Y; Hobolth, Asger; Schierup, Mikkel H; Ryder, Oliver A; Yoshinaga, Yuko; de Jong, Pieter J; Weinstock, George M; Rogers, Jeffrey; Mardis, Elaine R; Gibbs, Richard A; Wilson, Richard K

    2011-01-27

    'Orang-utan' is derived from a Malay term meaning 'man of the forest' and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. Overall, the resources and analyses presented here offer new

  4. Comparative Analysis of Six Lagerstroemia Complete Chloroplast Genomes

    PubMed Central

    Xu, Chao; Dong, Wenpan; Li, Wenqing; Lu, Yizeng; Xie, Xiaoman; Jin, Xiaobai; Shi, Jipu; He, Kaihong; Suo, Zhili

    2017-01-01

    Crape myrtles are economically important ornamental trees of the genus Lagerstroemia L. (Lythraceae), with a distribution from tropical to northern temperate zones. They are placed phylogenetically in a large subclade of rosids (in the eudicots), which contain more than 25% of all the angiosperms. They commonly bloom from summer till fall and are of significant value in city landscape and environmental protection. Morphological traits are shared inter-specifically among plants of Lagerstroemia to a certain extent and are also influenced by environmental conditions and different developmental stages. Thus, classification of plants in Lagerstroemia at species and cultivar levels is still a challenging task. Chloroplast (cp) genome sequences have been proven to be an informative and valuable source of cp DNA markers for genetic diversity evaluation. In this study, the complete cp genomes of three Lagerstroemia species were newly sequenced, and three other published cp genome sequences of Lagerstroemia were retrieved for comparative analyses in order to obtain an improved understanding of the application value of genetic information from the cp genomes. The six cp genomes ranged from 152,049 bp (L. subcostata) to 152,526 bp (L. speciosa) in length. We analyzed nucleotide substitutions, insertions/deletions, and simple sequence repeats in the cp genomes, and discovered 12 relatively highly variable regions that will potentially provide plastid markers for further taxonomic, phylogenetic, and population genetics studies in Lagerstroemia. The phylogenetic relationships of the Lagerstroemia taxa inferred from the cp genome datasets obtained high support, indicating that cp genome data may be useful in resolving relationships in this genus. PMID:28154574

  5. Floral gene resources from basal angiosperms for comparative genomics research.

    PubMed

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; DePamphilis, Claude W; Leebens-Mack, James H

    2005-03-30

    The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and

  6. Evolution of cancer suppression as revealed by mammalian comparative genomics.

    PubMed

    Tollis, Marc; Schiffman, Joshua D; Boddy, Amy M

    2017-02-02

    Cancer suppression is an important feature in the evolution of large and long-lived animals. While some tumor suppression pathways are conserved among all multicellular organisms, other mechanisms of cancer resistance are uniquely lineage specific. Comparative genomics has become a powerful tool to discover these unique and shared molecular adaptations with respect to cancer suppression. These findings may one day be translated to human patients through evolutionary medicine. Here, we will review theory and methods of comparative cancer genomics and highlight major findings of cancer suppression across mammals. Our current knowledge of cancer genomics suggests that more efficient DNA repair and higher sensitivity to DNA damage may be the key to tumor suppression in large or long-lived mammals.

  7. Genome Properties: a system for the investigation of prokaryotic genetic content for microbiology, genome annotation and comparative genomics.

    PubMed

    Haft, Daniel H; Selengut, Jeremy D; Brinkac, Lauren M; Zafar, Nikhat; White, Owen

    2005-02-01

    The presence or absence of metabolic pathways and structures provides a context that makes protein annotation far more reliable. Compiling such information across microbial genomes improves the functional classification of proteins and provides a valuable resource for comparative genomics. We have created a Genome Properties system to present key aspects of prokaryotic biology using standardized computational methods and controlled vocabularies. Properties reflect gene content, phenotype, phylogeny and computational analyses. The results of searches using hidden Markov models allow many properties to be deduced automatically, especially for families of proteins (equivalogs) conserved in function since their last common ancestor. Additional properties are derived from curation, published reports and other forms of evidence. The Genome Properties system was applied to 156 complete prokaryotic genomes, and is easily mined to find differences between species, correlations between metabolic features and families of uncharacterized proteins, or relationships among properties. Genome Properties can be found at http://www.tigr.org/Genome_Properties and http://www.tigr.org/tigr-scripts/CMR2/genome_properties_references.spl.
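
    The abstract says that hidden Markov model search results allow many properties to be deduced automatically, but it does not give the rules. The sketch below illustrates one plausible form such a rule could take: a property is defined by required steps, each satisfied by a hit to any of a set of models. The function name `evaluate_property`, the three-state result, the step names and the model IDs are all hypothetical and are not taken from the Genome Properties system itself.

```python
def evaluate_property(required_steps, genome_hits):
    """Classify a property as 'yes', 'partial' or 'no' for one genome.

    required_steps: dict mapping step name -> set of model IDs, any one of which
                    is taken as evidence for that step (hypothetical rule format).
    genome_hits:    set of model IDs with a significant hit in the genome.
    """
    if not required_steps:
        raise ValueError("a property needs at least one required step")
    # Count the steps for which at least one evidence model was hit.
    found = sum(1 for models in required_steps.values() if models & genome_hits)
    if found == len(required_steps):
        return "yes"
    if found > 0:
        return "partial"
    return "no"


# Hypothetical three-step pathway and made-up model identifiers.
pathway = {
    "step_1": {"HMM_0001"},
    "step_2": {"HMM_0002", "HMM_0003"},  # either model supports this step
    "step_3": {"HMM_0004"},
}

print(evaluate_property(pathway, {"HMM_0001", "HMM_0003", "HMM_0004"}))  # yes
print(evaluate_property(pathway, {"HMM_0001"}))                          # partial
print(evaluate_property(pathway, {"HMM_9999"}))                          # no
```

    Applying a rule like this to every genome yields a property-by-genome table, which is the kind of structure that can then be mined for differences between species.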

  8. Comparative Studies of Class IIa Bacteriocins of Lactic Acid Bacteria

    PubMed Central

    Eijsink, Vincent G. H.; Skeie, Marianne; Middelhoven, P. Hans; Brurberg, May Bente; Nes, Ingolf F.

    1998-01-01

    Four class IIa bacteriocins (pediocin PA-1, enterocin A, sakacin P, and curvacin A) were purified to homogeneity and tested for activity toward a variety of indicator strains. Pediocin PA-1 and enterocin A inhibited more strains and had generally lower MICs than sakacin P and curvacin A. The antagonistic activity of pediocin PA-1 and enterocin A was much more sensitive to reduction of disulfide bonds than the antagonistic activity of sakacin P and curvacin A, suggesting that an extra disulfide bond that is present in the former two may contribute to their high levels of activity. The food pathogen Listeria monocytogenes was among the most sensitive indicator strains for all four bacteriocins. Enterocin A was most effective in inhibiting Listeria, having MICs in the range of 0.1 to 1 ng/ml. Sakacin P had the interesting property of being very active toward Listeria but not having concomitant high levels of activity toward lactic acid bacteria. Strains producing class IIa bacteriocins displayed various degrees of resistance toward noncognate class IIa bacteriocins; for the sakacin P producer, it was shown that this resistance is correlated with the expression of immunity genes. It is hypothesized that variation in the presence and/or expression of such immunity genes accounts in part for the remarkably large variation in bacteriocin sensitivity displayed by lactic acid bacteria. PMID:9726871

  9. A Genomic Encyclopedia of the Root Nodule Bacteria: assessing genetic diversity through a systematic biogeographic survey.

    PubMed

    Reeve, Wayne; Ardley, Julie; Tian, Rui; Eshragi, Leila; Yoon, Je Won; Ngamwisetkun, Pinyaruk; Seshadri, Rekha; Ivanova, Natalia N; Kyrpides, Nikos C

    2015-01-01

    Root nodule bacteria are free-living soil bacteria, belonging to diverse genera within the Alphaproteobacteria and Betaproteobacteria, that have the capacity to form nitrogen-fixing symbioses with legumes. The symbiosis is specific and is governed by signaling molecules produced from both host and bacteria. Sequencing of several model RNB genomes has provided valuable insights into the genetic basis of symbiosis. However, the small number of sequenced RNB genomes available does not currently reflect the phylogenetic diversity of RNB, or the variety of mechanisms that lead to symbiosis in different legume hosts. This prevents a broad understanding of symbiotic interactions and the factors that govern the biogeography of host-microbe symbioses. Here, we outline a proposal to expand the number of sequenced RNB strains, which aims to capture this phylogenetic and biogeographic diversity. Through the Vavilov centers of diversity (Proposal ID: 231) and GEBA-RNB (Proposal ID: 882) projects we will sequence 107 RNB strains, isolated from diverse legume hosts in various geographic locations around the world. The nominated strains belong to nine of the 16 currently validly described RNB genera. They include 13 type strains, as well as elite inoculant strains of high commercial importance. These projects will strongly support systematic sequence-based studies of RNB and contribute to our understanding of the effects of biogeography on the evolution of different species of RNB, as well as the mechanisms that determine the specificity and effectiveness of nodulation and symbiotic nitrogen fixation by RNB with diverse legume hosts.

  10. A Genomic Encyclopedia of the Root Nodule Bacteria: assessing genetic diversity through a systematic biogeographic survey

    PubMed Central

    2015-01-01

    Root nodule bacteria are free-living soil bacteria, belonging to diverse genera within the Alphaproteobacteria and Betaproteobacteria, that have the capacity to form nitrogen-fixing symbioses with legumes. The symbiosis is specific and is governed by signaling molecules produced from both host and bacteria. Sequencing of several model RNB genomes has provided valuable insights into the genetic basis of symbiosis. However, the small number of sequenced RNB genomes available does not currently reflect the phylogenetic diversity of RNB, or the variety of mechanisms that lead to symbiosis in different legume hosts. This prevents a broad understanding of symbiotic interactions and the factors that govern the biogeography of host-microbe symbioses. Here, we outline a proposal to expand the number of sequenced RNB strains, which aims to capture this phylogenetic and biogeographic diversity. Through the Vavilov centers of diversity (Proposal ID: 231) and GEBA-RNB (Proposal ID: 882) projects we will sequence 107 RNB strains, isolated from diverse legume hosts in various geographic locations around the world. The nominated strains belong to nine of the 16 currently validly described RNB genera. They include 13 type strains, as well as elite inoculant strains of high commercial importance. These projects will strongly support systematic sequence-based studies of RNB and contribute to our understanding of the effects of biogeography on the evolution of different species of RNB, as well as the mechanisms that determine the specificity and effectiveness of nodulation and symbiotic nitrogen fixation by RNB with diverse legume hosts. PMID:25685260

  11. Single-cell genomics reveals co-metabolic interactions within uncultivated Marine Group A bacteria

    NASA Astrophysics Data System (ADS)

    Hawley, A. K.; Hallam, S. J.

    2016-02-01

    Marine Group A (MGA) bacteria represent a ubiquitous and abundant candidate phylum enriched in oxygen minimum zones (OMZs) and the deep ocean. Despite MGA prevalence, little is known about their ecology and biogeochemistry. Here we chart the metabolic potential of 26 MGA single-cell amplified genomes (SAGs) sourced from different environments spanning ecothermodynamic gradients including open ocean waters, OMZs and methanogenic environments including a terephthalate-degrading bioreactor. Metagenomic contig recruitment to SAGs combined with tetra-nucleotide frequency distribution patterns resolved nine MGA population genome bins. All population genomes exhibited genomic streamlining with open ocean MGA being the most reduced. Different strategies for carbohydrate utilization, carbon fixation, energy metabolism and respiratory pathways were identified between population genome bins, including various roles in the nitrogen and sulfur cycles. MGA inhabiting OMZ oxyclines encoded genes for partial denitrification with potential to feed into anammox and nitrification as well as a polysulfide reductase with a potential role in the cryptic sulfur cycle. MGA inhabiting anoxic waters encoded NiFe hydrogenase and nitrous oxide reductase with the potential to complete partial denitrification pathways previously linked to sulfur oxidation in SUP05 bacteria. MGA from methanogenic environments encoded genes mediating cascading syntrophic interactions with fatty acid degraders and methanogens including reverse electron transport potential. The MGA phylum appears to have evolved alternative metabolic innovations adapting specific subgroups to occupy specific niches along ecothermodynamic gradients.
Additionally, expression of MGA genes from different OMZ environments supports that these subgroups manifest an increasing propensity for co-metabolic interactions under energy limiting conditions that mandates a cooperative mode of existence with important implications for C, N and S cycling in

  12. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

    PubMed Central

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-01-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which, such as restriction-modification system-coding genes, have previously been shown to be laterally transferable among different species. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  13. Comparative genomics of wild type yeast strains unveils important genome diversity

    PubMed Central

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

    2008-01-01

    Background Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. 
The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome

  14. Whole genome annotation and comparative genomic analyses of bio-control fungus Purpureocillium lilacinum.

    PubMed

    Prasad, Pushplata; Varshney, Deepti; Adholeya, Alok

    2015-11-25

    The fungus Purpureocillium lilacinum is widely known as a biological control agent against plant parasitic nematodes. This research article consists of genomic annotation of the first draft of the whole genome sequence of P. lilacinum. The study aims to decipher the putative genetic components of the fungus involved in nematode pathogenesis by performing comparative genomic analysis with nine closely related fungal species in Hypocreales. De novo genomic assembly was done and a total of 301 scaffolds were constructed for P. lilacinum genomic DNA. By employing structural genome prediction models, 13,266 genes coding for proteins were predicted in the genome. Approximately 73% of the predicted genes were functionally annotated using Blastp, InterProScan and Gene Ontology. A 14.7% fraction of the predicted genes shared significant homology with genes in the Pathogen Host Interactions (PHI) database. The phylogenomic analysis carried out using the maximum likelihood RAxML algorithm provided insight into the evolutionary relationship of P. lilacinum. In congruence with other closely related species in the Hypocreales, namely Metarhizium spp., Pochonia chlamydosporia, Cordyceps militaris, Trichoderma reesei and Fusarium spp., P. lilacinum has large gene sets coding for G-protein coupled receptors (GPCRs), proteases, glycoside hydrolases and carbohydrate esterases that are required for degradation of nematode-egg shell components. Screening of the genome by the Antibiotics & Secondary Metabolite Analysis Shell (AntiSMASH) pipeline indicated that the genome potentially codes for a variety of secondary metabolites, possibly required for adaptation to the heterogeneous lifestyles reported for P. lilacinum. Significant up-regulation of subtilisin-like serine protease genes in the presence of nematode eggs in quantitative real-time analyses suggested a potential role of serine proteases in nematode pathogenesis. The data offer a better understanding of Purpureocillium lilacinum genome and will

  15. Using comparative genomics to drive new discoveries in microbiology.

    PubMed

    Haft, Daniel H

    2015-02-01

    Bioinformatics looks to many microbiologists like a service industry. In this view, annotation starts with what is known from experiments in the lab, makes reasonable inferences of which genes match other genes in function, builds databases to make all that we know accessible, but creates nothing truly new. Experiments lead, then biocuration and computational biology follow. But the astounding success of genome sequencing is changing the annotation paradigm. Every genome sequenced is an intercepted coded message from the microbial world, and as all cryptographers know, it is easier to decode a thousand messages than a single message. Some biology is best discovered not by phenomenology, but by decoding genome content, forming hypotheses, and doing the first few rounds of validation computationally. Through such reasoning, a role and function may be assigned to a protein with no sequence similarity to any protein yet studied. Experimentation can follow after the discovery to cement and to extend the findings. Unfortunately, this approach remains so unfamiliar to most bench scientists that lab work and comparative genomics typically segregate to different teams working on unconnected projects. This review will discuss several themes in comparative genomics as a discovery method, including highly derived data, use of patterns of design to reason by analogy, and in silico testing of computationally generated hypotheses.

  16. Comparative omics-driven genome annotation refinement: application across Yersiniae.

    PubMed

    Schrimpe-Rutledge, Alexandra C; Jones, Marcus B; Chauhan, Sadhana; Purvine, Samuel O; Sanford, James A; Monroe, Matthew E; Brewer, Heather M; Payne, Samuel H; Ansong, Charles; Frank, Bryan C; Smith, Richard D; Peterson, Scott N; Motin, Vladimir L; Adkins, Joshua N

    2012-01-01

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. The annotation process is now performed almost exclusively in an automated fashion to balance the large number of sequences generated. One possible way of reducing errors inherent to automated computational annotations is to apply data from omics measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. Here, the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species. Transcriptomic and proteomic data derived from highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis Pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 incorrect (i.e., observed frameshifts, extended start sites, and translated pseudogenes) protein-coding sequences within the three current genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus the discovery of many translated pseudogenes, including the insertion-ablated argD, underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, a transcriptional regulator, and many hypothetical proteins that were missed during annotation.

  17. CFGP: a web-based, comparative fungal genomics platform.

    PubMed

    Park, Jongsun; Park, Bongsoo; Jung, Kyongyong; Jang, Suwang; Yu, Kwangyul; Choi, Jaeyoung; Kong, Sunghyung; Park, Jaejin; Kim, Seryun; Kim, Hyojeong; Kim, Soonok; Kim, Jihyun F; Blair, Jaime E; Lee, Kwangwon; Kang, Seogchan; Lee, Yong-Hwan

    2008-01-01

    Since the completion of the Saccharomyces cerevisiae genome sequencing project in 1996, the genomes of over 80 fungal species have been sequenced or are currently being sequenced. Resulting data provide opportunities for studying and comparing fungal biology and evolution at the genome level. To support such studies, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr), a web-based multifunctional informatics workbench, was developed. The CFGP comprises three layers, including the basal layer, middleware and the user interface. The data warehouse in the basal layer contains standardized genome sequences of 65 fungal species. The middleware processes queries via six analysis tools, including BLAST, ClustalW, InterProScan, SignalP 3.0, PSORT II and a newly developed tool named BLASTMatrix. The BLASTMatrix permits the identification and visualization of genes homologous to a query across multiple species. The Data-driven User Interface (DUI) of the CFGP was built on a new concept of pre-collecting data and post-executing analysis instead of the 'fill-in-the-form-and-press-SUBMIT' user interfaces utilized by most bioinformatics sites. A tool termed Favorite, which supports the management of encapsulated sequence data and provides a personalized data repository to users, is another novel feature in the DUI.

  18. Comparative Whole-Genome Mapping To Determine Staphylococcus aureus Genome Size, Virulence Motifs, and Clonality

    PubMed Central

    Shukla, Sanjay K.; Pantrangi, Madhulatha; Stahl, Buffy; Briska, Adam M.; Stemper, Mary E.; Wagner, Trevor K.; Zentz, Emily B.; Callister, Steven M.; Lovrich, Steven D.; Henkhaus, John K.; Dykes, Colin W.

    2012-01-01

    Despite being a clonal pathogen, Staphylococcus aureus continues to acquire virulence and antibiotic-resistant genes located on mobile genetic elements such as genomic islands, prophages, pathogenicity islands, and the staphylococcal chromosomal cassette mec (SCCmec) by horizontal gene transfer from other staphylococci. The potential virulence of a S. aureus strain is often determined by comparing its pulsed-field gel electrophoresis (PFGE) or multilocus sequence typing profiles to that of known epidemic or virulent clones and by PCR of the toxin genes. Whole-genome mapping (formerly optical mapping), which is a high-resolution ordered restriction mapping of a bacterial genome, is a relatively new genomic tool that allows comparative analysis across entire bacterial genomes to identify regions of genomic similarities and dissimilarities, including small and large insertions and deletions. We explored whether whole-genome maps (WGMs) of methicillin-resistant S. aureus (MRSA) could be used to predict the presence of methicillin resistance, SCCmec type, and Panton-Valentine leukocidin (PVL)-producing genes on an S. aureus genome. We determined the WGMs of 47 diverse clinical isolates of S. aureus, including well-characterized reference MRSA strains, and annotated the signature restriction pattern in SCCmec types, arginine catabolic mobile element (ACME), and PVL-carrying prophage, PhiSa2 or PhiSa2-like regions on the genome. WGMs of these isolates accurately characterized them as MRSA or methicillin-sensitive S. aureus based on the presence or absence of the SCCmec motif, ACME and the unique signature pattern for the prophage insertion that harbored the PVL genes. Susceptibility to methicillin resistance and the presence of mecA, SCCmec types, and PVL genes were confirmed by PCR. A WGM clustering approach was further able to discriminate isolates within the same PFGE clonal group. These results showed that WGMs could be used not only to genotype S. 
aureus but also to

  19. Comparative whole-genome mapping to determine Staphylococcus aureus genome size, virulence motifs, and clonality.

    PubMed

    Shukla, Sanjay K; Pantrangi, Madhulatha; Stahl, Buffy; Briska, Adam M; Stemper, Mary E; Wagner, Trevor K; Zentz, Emily B; Callister, Steven M; Lovrich, Steven D; Henkhaus, John K; Dykes, Colin W

    2012-11-01

    Despite being a clonal pathogen, Staphylococcus aureus continues to acquire virulence and antibiotic-resistant genes located on mobile genetic elements such as genomic islands, prophages, pathogenicity islands, and the staphylococcal chromosomal cassette mec (SCCmec) by horizontal gene transfer from other staphylococci. The potential virulence of a S. aureus strain is often determined by comparing its pulsed-field gel electrophoresis (PFGE) or multilocus sequence typing profiles to that of known epidemic or virulent clones and by PCR of the toxin genes. Whole-genome mapping (formerly optical mapping), which is a high-resolution ordered restriction mapping of a bacterial genome, is a relatively new genomic tool that allows comparative analysis across entire bacterial genomes to identify regions of genomic similarities and dissimilarities, including small and large insertions and deletions. We explored whether whole-genome maps (WGMs) of methicillin-resistant S. aureus (MRSA) could be used to predict the presence of methicillin resistance, SCCmec type, and Panton-Valentine leukocidin (PVL)-producing genes on an S. aureus genome. We determined the WGMs of 47 diverse clinical isolates of S. aureus, including well-characterized reference MRSA strains, and annotated the signature restriction pattern in SCCmec types, arginine catabolic mobile element (ACME), and PVL-carrying prophage, PhiSa2 or PhiSa2-like regions on the genome. WGMs of these isolates accurately characterized them as MRSA or methicillin-sensitive S. aureus based on the presence or absence of the SCCmec motif, ACME and the unique signature pattern for the prophage insertion that harbored the PVL genes. Susceptibility to methicillin resistance and the presence of mecA, SCCmec types, and PVL genes were confirmed by PCR. A WGM clustering approach was further able to discriminate isolates within the same PFGE clonal group. These results showed that WGMs could be used not only to genotype S. 
aureus but also to

  20. Comparing thousands of circular genomes using the CGView Comparison Tool

    PubMed Central

    2012-01-01

    Background Continued sequencing efforts coupled with advances in sequencing technology will lead to the completion of a vast number of small genomes. Whole-genome comparisons represent an important part of the analysis of any new genome sequence, as they can provide a better understanding of the biology and evolution of the source organism. Visualization of the results is important, as it allows information from a variety of sources to be integrated and interpreted. However, existing graphical comparison tools lack features needed for efficiently comparing a new genome to hundreds or thousands of existing sequences. Moreover, existing tools are limited in terms of the types of comparisons that can be performed, the extent to which the output can be customized, and the ease with which the entire process can be automated. Results The CGView Comparison Tool (CCT) is a package for visually comparing bacterial, plasmid, chloroplast, or mitochondrial sequences of interest to existing genomes or sequence collections. The comparisons are conducted using BLAST, and the BLAST results are presented in the form of graphical maps that can also show sequence features, gene and protein names, COG (Clusters of Orthologous Groups of proteins) category assignments, and sequence composition characteristics. CCT can generate maps in a variety of sizes, including 400 Megapixel maps suitable for posters. Comparisons can be conducted within a particular species or genus, or all available genomes can be used. The entire map creation process, from downloading sequences to redrawing zoomed maps, can be completed easily using scripts included with the CCT. User-defined features or analysis results can be included on maps, and maps can be extensively customized. To simplify program setup, a CCT virtual machine that includes all dependencies preinstalled is available. Detailed tutorials illustrating the use of CCT are included with the CCT documentation. 
Conclusion CCT can be used to visually

  1. Inferring divergence of context-dependent substitution rates in Drosophila genomes with applications to comparative genomics.

    PubMed

    Chachick, Ran; Tanay, Amos

    2012-07-01

    Nucleotide substitution is a major evolutionary driving force that can incrementally and stochastically give rise to broad divergence patterns among species. The substitution process at each genomic position is frequently modeled independently of the other positions, although complex interactions between nearby bases are known to significantly affect mutation rates. Here, we study the evolution of 12 fly genomes using new algorithms for accurate inference of parameter-rich substitution models. By comparing models between lineages, we reveal the evolutionary histories of substitution rates at different flanking nucleotide contexts. We demonstrate these driving forces of molecular evolution to be constantly changing, suggesting that neutral drift of mutation rates is an important factor in the evolution of genomes and their sequence composition. This observation is used to develop a scalable approach for parameter-rich comparative genomics. By screening short DNA sequences, we demonstrate how homeoboxes and other transcription factor binding motifs are highly conserved based on our parameter-rich models but not according to standard conservation assays. With the increasing availability of genome sequences, rich substitution models become an attractive and practical approach for evolutionary analysis in general and comparative genomics in particular.
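
    As a rough illustration of what "substitution rates at different flanking nucleotide contexts" means, the sketch below tallies substitutions between two aligned sequences, keyed by the bases on either side of the changed position. It is a simplified counting exercise under stated assumptions, not the inference algorithm of the study: the function name, the use of a single pairwise alignment, and the choice to read flanks from the ancestral sequence are all hypothetical.

```python
from collections import Counter


def count_context_substitutions(ancestral, derived):
    """Count substitutions by flanking context in two aligned sequences.

    Keys are (left_flank, original_base, new_base, right_flank), with flanks
    read from the ancestral sequence (an assumption made for this sketch).
    Positions touching an alignment gap ('-') are skipped.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to equal length")
    counts = Counter()
    # The first and last positions lack a flanking base on one side.
    for i in range(1, len(ancestral) - 1):
        left, base, right = ancestral[i - 1], ancestral[i], ancestral[i + 1]
        new = derived[i]
        if "-" in (left, base, right, new):
            continue
        if base != new:
            counts[(left, base, new, right)] += 1
    return counts


# Toy alignment (made-up sequences): a C->T change flanked by A and G,
# and a G->A change flanked by C and A.
example = count_context_substitutions("ACGTACGA", "ATGTACAA")
```

    Dividing each count by the number of times its context occurs would give a crude per-context rate; a parameter-rich model as described in the abstract would estimate such rates along a phylogeny rather than from one pair.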

  2. Genome mining reveals unlocked bioactive potential of marine Gram-negative bacteria.

    PubMed

    Machado, Henrique; Sonnenschein, Eva C; Melchiorsen, Jette; Gram, Lone

    2015-03-07

    Antibiotic resistance in bacteria spreads quickly, overtaking the pace at which new compounds are discovered, and this emphasizes the immediate need to discover new compounds for control of infectious diseases. Terrestrial bacteria have for decades been investigated as a source of bioactive compounds leading to successful applications in pharmaceutical and biotech industries. Marine bacteria have so far not been exploited to the same extent; however, they are believed to harbor a multitude of novel bioactive chemistry. To explore this potential, genomes of 21 marine Alpha- and Gammaproteobacteria collected during the Galathea 3 expedition were sequenced and mined for natural product encoding gene clusters. Independently of genome size, bacteria of all tested genera carried a large number of clusters encoding different potential bioactivities, especially within the Vibrionaceae and Pseudoalteromonadaceae families. A very high potential was identified in pigmented pseudoalteromonads with up to 20 clusters in a single strain, mostly NRPSs and NRPS-PKS hybrids. Furthermore, regulatory elements in bioactivity-related pathways including chitin metabolism, quorum sensing and iron scavenging systems were investigated both in silico and in vitro. Genes with siderophore function were identified in 50% of the strains; however, all but one harboured the ferric-uptake-regulator gene. Genes encoding the synthetase of acylated homoserine lactones were found in Roseobacter-clade bacteria, but not in the Vibrionaceae strains and in only one Pseudoalteromonas strain. The understanding and manipulation of these elements can help in the discovery and production of new compounds never identified under regular laboratory cultivation conditions. High chitinolytic potential was demonstrated and verified for Vibrio and Pseudoalteromonas species that commonly live in close association with eukaryotic organisms in the environment. 
Chitin regulation by the ChiS histidine-kinase seems to be a

  3. Statistical methods for detecting genomic alterations through array-based comparative genomic hybridization (CGH).

    PubMed

    Wang, Yuedong; Guo, Sun-Wei

    2004-01-01

    Array-based comparative genomic hybridization (ABCGH) is an emerging high-resolution and high-throughput molecular genetic technique that allows genome-wide screening for chromosome alterations associated with tumorigenesis. Like the cDNA microarrays, ABCGH uses two differentially labeled test and reference DNAs which are cohybridized to cloned genomic fragments immobilized on glass slides. The hybridized DNAs are then detected in two different fluorochromes, and the significant deviation from unity in the ratios of the digitized intensity values is indicative of copy-number differences between the test and reference genomes. Proper statistical analyses need to account for many sources of variation besides genuine differences between the two genomes. In particular, spatial correlations, the variable nature of the ratio variance and non-Normal distribution call for careful statistical modeling. We propose two new statistics, the standard t-statistic and its modification with variances smoothed along the genome, and two tests for each statistic, the standard t-test and a test based on the hybrid adaptive spline (HAS). Simulations indicate that the smoothed t-statistic always improves the performance over the standard t-statistic. The t-tests are more powerful in detecting isolated alterations while those based on HAS are more powerful in detecting a cluster of alterations. We apply the proposed methods to the identification of genomic alterations in endometrium in women with endometriosis.
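
    The two statistics named in the abstract can be sketched as follows. This is a minimal illustration, assuming replicate log2(test/reference) ratios per clone and a simple moving average as the variance smoother; the abstract does not state how the variances are smoothed, so the window scheme and both function names are assumptions, and the HAS-based test is not shown.

```python
import math
from statistics import mean, variance


def t_statistics(log_ratios):
    """Per-clone one-sample t-statistic against a log2 ratio of 0 (ratio of 1).

    log_ratios: list of per-clone lists of replicate log2(test/reference)
    values, ordered by position along the genome.
    """
    return [mean(reps) / math.sqrt(variance(reps) / len(reps))
            for reps in log_ratios]


def smoothed_t_statistics(log_ratios, window=2):
    """Same statistic, with each clone's variance replaced by the average
    variance of its neighbours within `window` clones on either side
    (a moving average; an assumed stand-in for the paper's smoother)."""
    variances = [variance(reps) for reps in log_ratios]
    stats = []
    for i, reps in enumerate(log_ratios):
        lo = max(0, i - window)
        hi = min(len(log_ratios), i + window + 1)
        smoothed_var = mean(variances[lo:hi])
        stats.append(mean(reps) / math.sqrt(smoothed_var / len(reps)))
    return stats


# Made-up replicate log2 ratios for three adjacent clones.
toy = [[0.1, -0.1, 0.0], [1.0, 1.1, 0.9], [0.05, -0.05, 0.0]]
plain = t_statistics(toy)
smooth = smoothed_t_statistics(toy, window=1)
```

    Borrowing variance information from neighbouring clones stabilises the denominator when each clone has only a few replicates, which is one plausible reason the smoothed statistic performed better in the reported simulations.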

  4. Comparative Genomics between Two Xenorhabdus bovienii Strains Highlights Differential Evolutionary Scenarios within an Entomopathogenic Bacterial Species

    PubMed Central

    Bisch, Gaëlle; Ogier, Jean-Claude; Médigue, Claudine; Rouy, Zoé; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie

    2016-01-01

    Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Within Xenorhabdus bovienii species, the X. bovienii CS03 strain (Xb CS03) is nonvirulent when directly injected into lepidopteran insects, and displays a low virulence when associated with its Steinernema symbiont. The genome of Xb CS03 was sequenced and compared with the genome of a virulent strain, X. bovienii SS-2004 (Xb SS-2004). The genome size and content widely differed between the two strains. Indeed, Xb CS03 had a large genome containing several specific loci involved in the inhibition of competitors, including a few NRPS-PKS loci (nonribosomal peptide synthetases and polyketide synthases) producing antimicrobial molecules. Consistently, Xb CS03 had a greater antimicrobial activity than Xb SS-2004. The Xb CS03 strain contained more pseudogenes than Xb SS-2004. Decay of genes involved in the host invasion and exploitation (toxins, invasins, or extracellular enzymes) was particularly important in Xb CS03. This may provide an explanation for the nonvirulence of the strain when injected into an insect host. We suggest that Xb CS03 and Xb SS-2004 followed divergent evolutionary scenarios to cope with their peculiar life cycle. The fitness strategy of Xb CS03 would involve competitor inhibition, whereas Xb SS-2004 would quickly and efficiently kill the insect host. Hence, Xenorhabdus strains would have widely divergent host exploitation strategies, which impact their genome structure. PMID:26769959

  5. fPoxDB: fungal peroxidase database for comparative genomics.

    PubMed

    Choi, Jaeyoung; Détry, Nicolas; Kim, Ki-Tae; Asiegbu, Fred O; Valkonen, Jari P T; Lee, Yong-Hwan

    2014-05-08

    analysis toolkits with easy-to-follow web interface offer a useful workbench to study comparative and evolutionary genomics of peroxidases in fungi.

  6. Target recognition, resistance, immunity and genome mining of class II bacteriocins from Gram-positive bacteria.

    PubMed

    Kjos, Morten; Borrero, Juan; Opsata, Mona; Birri, Dagim J; Holo, Helge; Cintas, Luis M; Snipen, Lars; Hernández, Pablo E; Nes, Ingolf F; Diep, Dzung B

    2011-12-01

    Due to their very potent antimicrobial activity against diverse food-spoiling bacteria and pathogens and their favourable biochemical properties, peptide bacteriocins from Gram-positive bacteria have long been considered promising for applications in food preservation or medical treatment. To take advantage of bacteriocins in different applications, it is crucial to have detailed knowledge on the molecular mechanisms by which these peptides recognize and kill target cells, how producer cells protect themselves from their own bacteriocin (self-immunity) and how target cells may develop resistance. In this review we discuss some important recent progress in these areas for the non-lantibiotic (class II) bacteriocins. We also discuss some examples of how the current wealth of genome sequences provides an invaluable source in the search for novel class II bacteriocins.

  7. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    SciTech Connect

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota make up some 37 percent of the described fungi and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a 'core proteome' based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  8. A web server for mining Comparative Genomic Hybridization (CGH) data

    NASA Astrophysics Data System (ADS)

    Liu, Jun; Ranka, Sanjay; Kahveci, Tamer

    2007-11-01

    Advances in cytogenetics and molecular biology have established that chromosomal alterations are critical in the pathogenesis of human cancer. Recurrent chromosomal alterations provide cytological and molecular markers for the diagnosis and prognosis of disease. They also facilitate the identification of genes that are important in carcinogenesis, which in the future may help in the development of targeted therapy. A large and growing amount of cancer genetic data is now publicly available. There is a need for public domain tools that allow users to analyze their data and visualize the results. This chapter describes a web-based software tool that will allow researchers to analyze and visualize Comparative Genomic Hybridization (CGH) datasets. It employs novel data mining methodologies for clustering and classification of CGH datasets as well as algorithms for identifying important markers (small sets of genomic intervals with aberrations) that are potentially cancer signatures. The developed software will help in understanding the relationships between genomic aberrations and cancer types.

  9. CyanoClust: comparative genome resources of cyanobacteria and plastids.

    PubMed

    Sasaki, Naobumi V; Sato, Naoki

    2010-01-01

    Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Proteins (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.

  10. Comparative genomics of Neisseria meningitidis: core genome, islands of horizontal transfer and pathogen-specific genes.

    PubMed

    Dunning Hotopp, Julie C; Grifantini, Renata; Kumar, Nikhil; Tzeng, Yih Ling; Fouts, Derrick; Frigimelica, Elisabetta; Draghi, Monia; Giuliani, Marzia Monica; Rappuoli, Rino; Stephens, David S; Grandi, Guido; Tettelin, Hervé

    2006-12-01

    To better understand Neisseria meningitidis genomes and virulence, microarray comparative genome hybridization (mCGH) data were collected from one Neisseria cinerea, two Neisseria lactamica, two Neisseria gonorrhoeae and 48 Neisseria meningitidis isolates. For N. meningitidis, these isolates are from diverse clonal complexes, invasive and carriage strains, and all major serogroups. The microarray platform represented N. meningitidis strains MC58, Z2491 and FAM18, and N. gonorrhoeae FA1090. By comparing hybridization data to genome sequences, the core N. meningitidis genome and insertions/deletions (e.g. capsule locus, type I secretion system) related to pathogenicity were identified, including further characterization of the capsule locus, bioinformatics analysis of a type I secretion system, and identification of some metabolic pathways associated with intracellular survival in pathogens. Hybridization data clustered meningococcal isolates from similar clonal complexes that were distinguished by the differential presence of six distinct islands of horizontal transfer. Several of these islands contained prophage or other mobile elements, including a novel prophage and a transposon carrying portions of a type I secretion system. Acquisition of some genetic islands appears to have occurred in multiple lineages, including transfer between N. lactamica and N. meningitidis. However, island acquisition occurs infrequently, such that the genomic-level relationship is not obscured within clonal complexes. The N. meningitidis genome is characterized by the horizontal acquisition of multiple genetic islands; the study of these islands reveals important sets of genes varying between isolates and likely to be related to pathogenicity.

  11. Allelic genome structural variations in maize detected by array comparative genome hybridization.

    PubMed

    Beló, André; Beatty, Mary K; Hondred, David; Fengler, Kevin A; Li, Bailin; Rafalski, Antoni

    2010-01-01

    DNA polymorphisms such as insertion/deletions and duplications affecting genome segments larger than 1 kb are known as copy-number variations (CNVs) or structural variations (SVs). They have been recently studied in animals and humans by using array-comparative genome hybridization (aCGH), and have been associated with several human diseases. Their presence and phenotypic effects in plants have not been investigated on a genomic scale, although individual structural variations affecting traits have been described. We used aCGH to investigate the presence of CNVs in maize by comparing the genome of 13 maize inbred lines to B73. Analysis of hybridization signal ratios of 60,472 60-mer oligonucleotide probes between inbreds in relation to their location in the reference genome (B73) allowed us to identify clusters of probes that deviated from the ratio expected for equal copy-numbers. We found CNVs distributed along the maize genome in all chromosome arms. They occur with appreciable frequency in different germplasm subgroups, suggesting ancient origin. Validation of several CNV regions showed both insertion/deletions and copy-number differences. The nature of CNVs detected suggests CNVs might have a considerable impact on plant phenotypes, including disease response and heterosis.
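
    The idea of finding "clusters of probes that deviated from the ratio expected for equal copy-numbers" can be sketched as a scan for runs of adjacent probes whose log2 signal ratio departs from 0 in the same direction. The threshold, the minimum run length, and the function name below are assumptions chosen for illustration; the abstract does not give the calling criteria actually used.

```python
import math


def find_cnv_segments(ratios, threshold=0.5, min_probes=3):
    """Report runs of consecutive probes with deviating signal ratios.

    ratios: positive hybridization signal ratios (inbred / reference),
    ordered by probe position in the reference genome.
    Returns (start_index, end_index, "gain" or "loss") tuples with an
    inclusive end index. threshold is in log2 units (assumed value).
    """
    def call(ratio):
        log_ratio = math.log2(ratio)
        if log_ratio >= threshold:
            return "gain"
        if log_ratio <= -threshold:
            return "loss"
        return None

    segments = []
    start, state = None, None
    for i, ratio in enumerate(ratios):
        current = call(ratio)
        if current != state:
            # A run just ended; keep it only if it is long enough.
            if state is not None and i - start >= min_probes:
                segments.append((start, i - 1, state))
            start, state = i, current
    # Close a run that extends to the last probe.
    if state is not None and len(ratios) - start >= min_probes:
        segments.append((start, len(ratios) - 1, state))
    return segments


# Made-up ratios: three elevated probes, then two depleted probes.
toy_ratios = [1.0, 1.1, 2.0, 2.2, 1.9, 1.0, 0.5, 0.4, 1.0]
strict = find_cnv_segments(toy_ratios)
relaxed = find_cnv_segments(toy_ratios, min_probes=2)
```

    Requiring several adjacent probes to agree trades sensitivity to small variants for robustness against single noisy probes.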

  12. Comparative analysis of methods for genome-wide nucleosome cartography.

    PubMed

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods, from the initial biochemical steps, through the alignment of the millions of short-sequence reads to a reference genome, to the final computational analysis that generates genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  13. Sequencing and comparative genome analysis of two pathogenic Streptococcus gallolyticus subspecies: genome plasticity, adaptation and virulence.

    PubMed

    Lin, I-Hsuan; Liu, Tze-Tze; Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations are different depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34, respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes that suit it to a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity between the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops.

  14. Sequencing and Comparative Genome Analysis of Two Pathogenic Streptococcus gallolyticus Subspecies: Genome Plasticity, Adaptation and Virulence

    PubMed Central

    Lin, I-Hsuan; Liu, Tze-Tze; Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations are different depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34, respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes that suit it to a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity between the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops. PMID:21633709

  15. Comparative Analysis of Genome Sequences Covering the Seven Cronobacter Species

    PubMed Central

    Cummings, Craig A.; Shih, Rita; Degoricija, Lovorka; Rico, Alain; Brzoska, Pius; Hamby, Stephen E.; Masood, Naqash; Hariri, Sumyya; Sonbol, Hana; Chuzhanova, Nadia; McClelland, Michael; Furtado, Manohar R.; Forsythe, Stephen J.

    2012-01-01

    Background: Species of Cronobacter are widespread in the environment and are occasional food-borne pathogens associated with serious neonatal diseases, including bacteraemia, meningitis, and necrotising enterocolitis. The genus is composed of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. dublinensis, C. muytjensii, C. universalis, and C. condimenti. Clinical cases are associated with three species, C. malonaticus, C. turicensis and, in particular, with C. sakazakii multilocus sequence type 4. Thus, it is plausible that virulence determinants have evolved in certain lineages. Methodology/Principal Findings: We generated high quality sequence drafts for eleven Cronobacter genomes representing the seven Cronobacter species, including an ST4 strain of C. sakazakii. Comparative analysis of these genomes together with the two publicly available genomes revealed that Cronobacter has over 6,000 genes in one or more strains and over 2,000 genes shared by all Cronobacter. Considerable variation in the presence of traits such as type six secretion systems, metal resistance (tellurite, copper and silver), and adhesins was found. C. sakazakii is unique in the Cronobacter genus in encoding genes enabling the utilization of exogenous sialic acid, which may have clinical significance. The C. sakazakii ST4 strain 701 contained additional genes as compared to other C. sakazakii, but none of them were known specific virulence-related genes. Conclusions/Significance: Genome comparison revealed that pair-wise DNA sequence identity varies between 89 and 97% in the seven Cronobacter species, and also suggested various degrees of divergence. Sets of universal core genes and accessory genes unique to each strain were identified. These gene sequences can be used for designing genus/species specific detection assays. Genes encoding adhesins, T6SS, and metal resistance genes as well as prophages are found in only subsets of genomes and have contributed considerably to the variation of

  16. Comparative genomics of Mortierella elongata and its bacterial endosymbiont Mycoavidus cysteinexigens

    DOE PAGES

    Uehling, J.; Gryganskyi, A.; Hameed, K.; ...

    2017-01-11

    Endosymbiosis of bacteria by eukaryotes is a defining feature of cellular evolution. In addition to well-known bacterial origins for mitochondria and chloroplasts, multiple origins of bacterial endosymbiosis are known within the cells of diverse animals, plants and fungi. Early-diverging lineages of terrestrial fungi harbor endosymbiotic bacteria belonging to the Burkholderiaceae. Furthermore, we sequenced the metagenome of the soil-inhabiting fungus Mortierella elongata and assembled the complete circular chromosome of its endosymbiont, Mycoavidus cysteinexigens, which we place within a lineage of endofungal symbionts that are sister clade to Burkholderia. The genome of M. elongata strain AG77 features a core set of primary metabolic pathways for degradation of simple carbohydrates and lipid biosynthesis, while the M. cysteinexigens (AG77) genome is reduced in size and function. Experiments using antibiotics to cure the endobacterium from the host demonstrate that the fungal host metabolism is highly modulated by presence/absence of M. cysteinexigens. In independent comparative phylogenomic analyses of fungal and bacterial genomes we find that they are consistent with an ancient origin for M. elongata-M. cysteinexigens symbiosis, most likely over 350 million years ago and concomitant with the terrestrialization of Earth and diversification of land fungi and plants.

  17. Bivariate genomic analysis identifies a hidden locus associated with bacteria hypersensitive response in Arabidopsis thaliana

    PubMed Central

    Wang, Biao; Li, Zhuocheng; Xu, Weilin; Feng, Xiao; Wan, Qianhui; Zan, Yanjun; Sheng, Sitong; Shen, Xia

    2017-01-01

    Multi-phenotype analysis has drawn increasing attention in high-throughput genomic studies, whereas only a few applications have justified the use of multivariate techniques. We applied a recently developed multi-trait analysis method on a small set of bacteria hypersensitive response phenotypes and identified a single novel locus missed by conventional single-trait genome-wide association studies. The detected locus harbors a minor allele that elevates the risk of leaf collapse response to the injection of avrRpm1-modified Pseudomonas syringae (P = 1.66e-08). Candidate gene AT3G32930 within the detected region and its co-expressed genes showed significantly reduced expression after P. syringae interference. Our results again emphasize that multi-trait analysis should not be neglected in association studies, as the power of specific multi-trait genotype-phenotype maps might only be tractable when jointly considering multiple phenotypes. PMID:28338080

  18. Genome Survey and Characterization of Endophytic Bacteria Exhibiting a Beneficial Effect on Growth and Development of Poplar Trees

    PubMed Central

    Taghavi, Safiyh; Garafola, Craig; Monchy, Sébastien; Newman, Lee; Hoffman, Adam; Weyens, Nele; Barac, Tanja; Vangronsveld, Jaco; van der Lelie, Daniel

    2009-01-01

    The association of endophytic bacteria with their plant hosts has a beneficial effect for many different plant species. Our goal is to identify endophytic bacteria that improve the biomass production and the carbon sequestration potential of poplar trees (Populus spp.) when grown in marginal soil and to gain insight into the mechanisms underlying plant growth promotion. Members of the Gammaproteobacteria dominated a collection of 78 bacterial endophytes isolated from poplar and willow trees. As representatives for the dominant genera of endophytic gammaproteobacteria, we selected Enterobacter sp. strain 638, Stenotrophomonas maltophilia R551-3, Pseudomonas putida W619, and Serratia proteamaculans 568 for genome sequencing and analysis of their plant growth-promoting effects, including root development. Derivatives of these endophytes, labeled with gfp, were also used to study the colonization of their poplar hosts. In greenhouse studies, poplar cuttings (Populus deltoides × Populus nigra DN-34) inoculated with Enterobacter sp. strain 638 repeatedly showed the highest increase in biomass production compared to cuttings of noninoculated control plants. Sequence data combined with the analysis of their metabolic properties resulted in the identification of many putative mechanisms, including carbon source utilization, that help these endophytes to thrive within a plant environment and to potentially affect the growth and development of their plant hosts. Understanding the interactions between endophytic bacteria and their host plants should ultimately result in the design of strategies for improved poplar biomass production on marginal soils as a feedstock for biofuels. PMID:19060168

  19. Genome survey and characterization of endophytic bacteria exhibiting a beneficial effect on growth and development of poplar trees.

    PubMed

    Taghavi, Safiyh; Garafola, Craig; Monchy, Sébastien; Newman, Lee; Hoffman, Adam; Weyens, Nele; Barac, Tanja; Vangronsveld, Jaco; van der Lelie, Daniel

    2009-02-01

    The association of endophytic bacteria with their plant hosts has a beneficial effect for many different plant species. Our goal is to identify endophytic bacteria that improve the biomass production and the carbon sequestration potential of poplar trees (Populus spp.) when grown in marginal soil and to gain insight into the mechanisms underlying plant growth promotion. Members of the Gammaproteobacteria dominated a collection of 78 bacterial endophytes isolated from poplar and willow trees. As representatives for the dominant genera of endophytic gammaproteobacteria, we selected Enterobacter sp. strain 638, Stenotrophomonas maltophilia R551-3, Pseudomonas putida W619, and Serratia proteamaculans 568 for genome sequencing and analysis of their plant growth-promoting effects, including root development. Derivatives of these endophytes, labeled with gfp, were also used to study the colonization of their poplar hosts. In greenhouse studies, poplar cuttings (Populus deltoides × Populus nigra DN-34) inoculated with Enterobacter sp. strain 638 repeatedly showed the highest increase in biomass production compared to cuttings of noninoculated control plants. Sequence data combined with the analysis of their metabolic properties resulted in the identification of many putative mechanisms, including carbon source utilization, that help these endophytes to thrive within a plant environment and to potentially affect the growth and development of their plant hosts. Understanding the interactions between endophytic bacteria and their host plants should ultimately result in the design of strategies for improved poplar biomass production on marginal soils as a feedstock for biofuels.

  20. Genome Sequence of Azospirillum brasilense CBG497 and Comparative Analyses of Azospirillum Core and Accessory Genomes provide Insight into Niche Adaptation

    PubMed Central

    Wisniewski-Dyé, Florence; Lozano, Luis; Acosta-Cruz, Erika; Borland, Stéphanie; Drogue, Benoît; Prigent-Combaret, Claire; Rouy, Zoé; Barbe, Valérie; Mendoza Herrera, Alberto; González, Victor; Mavingui, Patrick

    2012-01-01

    Bacteria of the genus Azospirillum colonize roots of important cereals and grasses, and promote plant growth by se